COVID-19 Data Dives: The Takeaways From Seroprevalence Surveys

Natalie E. Dean, PhD


May 04, 2020


Medscape has asked top experts to weigh in on the most pressing scientific questions about COVID-19, starting with serology studies. We'll have more COVID Data Dives from Dr. Dean and another expert later this week.


It's easier to poke holes in a study than to run one yourself. We should expect many more SARS-CoV-2 serosurveys in our future. So, in the spirit of promoting good science, here are my thoughts on best practices for the design of serosurveys.

First, it is critical to remember that serosurveys are population-level surveys. They are intended to inform our broader understanding of the disease, not to tell individuals whether they have or have not been infected. The tests are still too unreliable for the latter.
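One standard reason an individual result is hard to interpret is that, when few people have been infected, even a fairly specific test produces many false positives relative to true positives. The sketch below computes the positive predictive value; the sensitivity, specificity, and prevalence figures are hypothetical values chosen for illustration, not the characteristics of any real assay.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a person who tests antibody-positive was truly infected."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical test characteristics, for illustration only.
SENSITIVITY = 0.90
SPECIFICITY = 0.95

ppv_low = positive_predictive_value(0.02, SENSITIVITY, SPECIFICITY)
ppv_high = positive_predictive_value(0.20, SENSITIVITY, SPECIFICITY)

print(f"PPV at 2% prevalence:  {ppv_low:.2f}")
print(f"PPV at 20% prevalence: {ppv_high:.2f}")
```

Under these assumed numbers, a positive result in a low-prevalence population is more likely to be false than true, while the same test performs far better where prevalence is higher.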

Serosurveys are particularly useful in that they allow us to reconstruct the past. We expect that many mild infections are being missed because of insufficient testing capacity, and we are confident that the vast majority of asymptomatic infections are missed. By using antibody tests to look back at who has previously been infected, we can link this with known case counts to estimate the proportion of infections detected. Some experts say that 1 out of 10 or 1 out of 20 infections are detected, though some surveys have reported higher counts.
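The linkage described above is simple arithmetic, sketched here. The function name and all inputs are hypothetical; the example numbers were picked only so that the result lands on the "1 out of 10" figure mentioned above, and they do not come from any actual survey.

```python
def proportion_detected(confirmed_cases, seroprevalence, population):
    """Share of all infections that appear in the confirmed case count.

    Total infections are estimated as seroprevalence * population.
    """
    estimated_infections = seroprevalence * population
    if estimated_infections <= 0:
        raise ValueError("estimated infections must be positive")
    return confirmed_cases / estimated_infections


# Hypothetical region: 1,000,000 residents, 5% seroprevalence, 5,000 confirmed cases.
detected = proportion_detected(
    confirmed_cases=5_000, seroprevalence=0.05, population=1_000_000
)
print(f"Estimated proportion of infections detected: {detected:.0%}")
```

This ignores test error and the lag between infection and antibody development, both of which a real analysis would need to address.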

Given that we know that many infections are being missed, we might wonder what proportion of the total population has been infected. This can be used to address questions about "herd immunity." Of note, though, we will need more evidence that having antibodies means you are immune, and that reinfection does not occur. So far, levels of antibody in populations studied have been low. We are seeing numbers above single digits only in hard-hit areas, indicating that we are far from the 60%-70% threshold for herd immunity.

We can use serosurveys to estimate the infection fatality ratio (IFR). Serosurveys let us estimate the full denominator of people infected rather than just PCR-confirmed cases. The crude case fatality ratio (CFR) is known to be an overestimate because testing is often reserved for the sickest patients. When we include mild or even asymptomatic infections, this number drops. Best estimates of IFR seem to be hovering around 0.5%-1.0%, although more data are emerging.
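To make the difference in denominators concrete, here is a minimal sketch comparing the two ratios. All counts are hypothetical and chosen only for illustration; they are not estimates from any real population.

```python
def case_fatality_ratio(deaths, confirmed_cases):
    """Crude CFR: deaths divided by PCR-confirmed cases."""
    return deaths / confirmed_cases


def infection_fatality_ratio(deaths, seroprevalence, population):
    """IFR: deaths divided by all infections, estimated from a serosurvey."""
    estimated_infections = seroprevalence * population
    return deaths / estimated_infections


# Hypothetical region: same deaths, two different denominators.
DEATHS = 300
CONFIRMED_CASES = 5_000
SEROPREVALENCE = 0.05
POPULATION = 1_000_000

crude_cfr = case_fatality_ratio(DEATHS, CONFIRMED_CASES)
ifr = infection_fatality_ratio(DEATHS, SEROPREVALENCE, POPULATION)

print(f"Crude CFR: {crude_cfr:.1%}")
print(f"IFR:       {ifr:.1%}")
```

The deaths are the same in both ratios; only the denominator changes, which is why the IFR comes out so much lower than the crude CFR in this made-up example.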

IFR is just an average, and we know that risk varies a lot across age groups. Another major scientific goal is to determine age-specific infection probabilities and, thus, age-specific IFR. This can be achieved if there is good survey coverage across age groups.
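The same calculation can be repeated within age bands when the survey covers each band well. In this sketch the age bands, populations, seroprevalences, and death counts are all made-up numbers with no relation to any real survey; they only show how an overall average can hide large differences.

```python
def age_specific_ifr(deaths_by_age, seroprev_by_age, population_by_age):
    """Return a dict mapping each age band to deaths / estimated infections."""
    ifr_by_age = {}
    for age_band, population in population_by_age.items():
        infections = seroprev_by_age[age_band] * population
        ifr_by_age[age_band] = deaths_by_age[age_band] / infections
    return ifr_by_age


# Entirely hypothetical inputs, for illustration only.
population_by_age = {"0-39": 500_000, "40-64": 300_000, "65+": 200_000}
seroprev_by_age = {"0-39": 0.06, "40-64": 0.05, "65+": 0.04}
deaths_by_age = {"0-39": 3, "40-64": 45, "65+": 240}

ifr_by_age = age_specific_ifr(deaths_by_age, seroprev_by_age, population_by_age)

total_infections = sum(
    seroprev_by_age[band] * population_by_age[band] for band in population_by_age
)
overall_ifr = sum(deaths_by_age.values()) / total_infections

for band, value in ifr_by_age.items():
    print(f"IFR ages {band}: {value:.2%}")
print(f"Overall IFR: {overall_ifr:.2%}")
```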

When thinking about designing a serosurvey to target a geographic region, it may make logistical sense to select a few smaller sub-areas to study. For example, we might select four to five areas ranging from most hard-hit (assessed by cases and deaths) to least hard-hit, to establish a range. Often we are most interested in capturing hard-hit areas because these will have the biggest numbers infected to support estimating overall and age-specific IFR.

The best survey designs are household-based with random selection of households. All persons in a selected household are invited to participate to get a broad range of ages. Including entire households also allows us to assess transmissibility within households. Household designs, while less prone to bias, are not immune as not everyone will consent to participate or be home. Essential workers, for example, may be harder to recruit than those who are sheltering in place. These studies also take more time to set up and run, and it is vital that survey teams are provided with adequate personal protective equipment when going to homes.
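As a rough sketch of the selection step, assuming a complete list of households (a sampling frame) is available, a simple random sample of households can be drawn and every member of each selected household invited. The household IDs, roster, and function names below are hypothetical, and real surveys often use more elaborate multistage designs.

```python
import random


def select_households(household_ids, n_households, seed=None):
    """Draw a simple random sample of households without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(household_ids), n_households)


def invite_members(selected_households, roster):
    """Invite every person living in each selected household."""
    return [person for hh in selected_households for person in roster[hh]]


# Hypothetical sampling frame: household ID -> list of residents.
roster = {
    "HH-001": ["adult A", "child A"],
    "HH-002": ["adult B"],
    "HH-003": ["adult C", "adult D", "older adult E"],
    "HH-004": ["adult F", "teen F"],
}

selected = select_households(roster.keys(), n_households=2, seed=2020)
invited = invite_members(selected, roster)
print(selected)
print(invited)
```

Random selection only controls who is invited. The consent and availability problems described above arise afterward and are not addressed by this step.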

In a survey conducted in Miami, random digit dialing was used to recruit a representative sample of participants. Participants then agreed to visit a drive-thru testing site. Bias can still occur if people do not consent or do not have a car, but it's a creative approach.

Volunteer surveys are a type of convenience sample. Santa Clara investigators used targeted Facebook ads to recruit participants to visit drive-thru test sites. Quotas were established per zip code to limit overrepresentation. The National Institutes of Health's (NIH's) survey in Bethesda, Maryland, is also volunteer-based. Questionnaire data are collected over the phone. Participants who are NIH employees are tested on site, and others are provided with a kit for home-based blood draw.

An obvious concern with volunteer surveys is that people will preferentially enroll because they think they had COVID-19 and want confirmation. Notably, the NIH lists prior COVID-19 or current symptoms as exclusion criteria. But World Health Organization (WHO) guidance advises against excluding known cases.

For volunteer surveys, consent bias is less of a concern if you can achieve high coverage. San Miguel County, Colorado, has processed 2500 tests for roughly 8000 residents.

Another convenience sample is blood donors, though you will still need to collect questionnaire data and people may be less likely to donate when they are ill. But WHO notes that blood donors are a very eager, easy-to-follow population.

So, quick-and-dirty or slow-and-rigorous? I think both have value. The rapid though potentially biased volunteer surveys establish an order of magnitude for seroprevalence (1%, 10%, 50%?) while we await the emergence of more reliable household data.

Studies also collect valuable questionnaire data in addition to basic demographic data. Occupation, travel history, known exposure to a case, and history of clinical symptoms in the months since transmission started are all important. We can use these data to identify risk factors for infection by comparing exposures of infected and noninfected individuals. For example, we may want to identify occupations at highest risk for infection or how frequently transmission within households is occurring.
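One common way to make this comparison is an odds ratio from a two-by-two table of exposure against serostatus. The sketch below uses hypothetical counts for a hypothetical exposure, and it leaves out confidence intervals and adjustment for confounders, which a real analysis would need.

```python
def odds_ratio(exposed_pos, exposed_neg, unexposed_pos, unexposed_neg):
    """Odds of seropositivity among the exposed divided by odds among the unexposed."""
    if min(exposed_neg, unexposed_pos) <= 0:
        raise ValueError("cells in the denominator must be positive")
    return (exposed_pos * unexposed_neg) / (exposed_neg * unexposed_pos)


# Hypothetical questionnaire results: exposure = known contact with a case.
result = odds_ratio(
    exposed_pos=30,     # contact reported, antibody-positive
    exposed_neg=170,    # contact reported, antibody-negative
    unexposed_pos=20,   # no contact reported, antibody-positive
    unexposed_neg=380,  # no contact reported, antibody-negative
)
print(f"Odds ratio: {result:.2f}")
```

An odds ratio above 1 would suggest that the exposure is associated with infection in the surveyed group, subject to the sampling biases discussed earlier.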

Finally, surveys are often one-time, cross-sectional studies. But in Miami, 750 new participants are recruited each week (repeated cross-sectional studies). Longitudinal studies, where the same people are sampled every 3-plus weeks, are especially valuable, allowing us to examine antibody dynamics over time to look for waning.

These are some of the key design features for serosurveys. My opinions are informed by WHO guidance on seroepidemiology, discussions with colleagues, observation of what is out there so far, and personal experience with dengue serosurveys.

Natalie Dean, PhD, is an assistant professor of biostatistics at the University of Florida in Gainesville. She specializes in emerging infectious diseases and vaccine study design.


