Impact of the COVID-19 Pandemic on Public Health Surveillance and Survey Data Collections in the United States

Denys T. Lau, PhD; Paulina Sosa, MPH; Nabarun Dasgupta, MPH, PhD; Hua He, PhD


Am J Public Health. 2021;111(12):2118-2121. 

The COVID-19 pandemic highlighted the need for strengthened surveillance data to accurately track the distribution of infectious diseases for informing public health responses to improve infection prevention and control. Comprehensive surveillance for COVID-19 would rapidly identify infected cases, trace contacts, and monitor disease trends over time. Ongoing surveillance is also important for monitoring longer-term epidemiological trends—including infection incidence and mortality rates—across subpopulations that may be at significantly higher risk for severe disease and death, thereby improving population-specific interventions.[1] To track the progression of COVID-19, we inevitably ask the question: is a unified national surveillance system needed to respond effectively to this pandemic and future public health emergencies?

The answer may be unexpectedly complex when considering the different aspects of the pandemic that need to be tracked. In the United States, to monitor COVID-19–associated cases and deaths, complete census data at the aggregate and individual levels are gathered from separate systems. For example, COVID-19 cases (which may include death data) can come from notifiable infectious disease systems[2] and all deaths from vital statistics systems.[3] The most accurate death counts come from death certificates. To track hospitalizations with suspected and confirmed COVID-19 cases, for instance, data on all cases from more than 6000 hospitals are reported weekly and compiled in the Unified Hospital Time-Series Dataset by federal public health agencies.[4] Also used are commercial databases of health insurance claims and electronic health records from stand-alone hospital systems.

However, to better understand the epidemiology of COVID-19 among in-care populations, more information about the care episode, patient, and provider is often needed than what is available in highly structured and coded data gathered via standard surveillance reporting. Instead of relying on surveillance systems that are designed to provide near real-time data, collecting sampled data is a necessary alternative for gathering more in-depth information, even if these data are slower to process and are collected in selected geographic areas. Ideally, sampled data are collected from a representative subset of a population that would allow statistical estimates to be produced and inferences to be made from the sampled data to the population as a whole. As examples, in-depth data from claims and electronic health records are electronically extracted from a representative sample of hospitals through the National Hospital Care Survey[5] and a sample of patient records is abstracted in hospitals in selected states through the COVID-19–Associated Hospitalization Surveillance Network (COVID-NET).[6]
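The step from a sample to a population estimate can be illustrated with a minimal sketch. It assumes a simple stratified design in which each sampled hospital carries a weight equal to the number of population hospitals it represents. The strata, counts, outcomes, and the function name `weighted_proportion` are hypothetical and are not taken from the National Hospital Care Survey or COVID-NET.

```python
def weighted_proportion(outcomes, weights):
    """Estimate a population proportion from 0/1 sample outcomes and sampling weights."""
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    return sum(w * y for y, w in zip(outcomes, weights)) / total_weight


# Hypothetical design (illustrative numbers only):
#   Stratum A: 1000 hospitals in the population, 10 sampled -> weight 100 each
#   Stratum B: 5000 hospitals in the population, 10 sampled -> weight 500 each
# An outcome of 1 means the sampled hospital reported the characteristic of interest.
outcomes = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0] + [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
weights = [100] * 10 + [500] * 10

weighted_estimate = weighted_proportion(outcomes, weights)
unweighted_estimate = sum(outcomes) / len(outcomes)

print(f"weighted estimate:   {weighted_estimate:.3f}")
print(f"unweighted estimate: {unweighted_estimate:.3f}")
```

In this hypothetical case the unweighted sample proportion overstates the population value because stratum A is sampled at a higher rate than stratum B. Applying the weights corrects for that imbalance, which is what allows inference from the sample to the population as a whole.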

Because many individuals with COVID-19 can be asymptomatic or exhibit mild symptoms like those of a common cold, millions of Americans may have undiagnosed infections.[7] To more fully understand the epidemiology and burden of the pandemic, information about undiagnosed COVID-19 cases is needed. A national sample survey that has the capacity to conduct antibody tests, such as the National Health and Nutrition Examination Survey,[8] can offer additional information. Similar population-based COVID-19 seroepidemiological surveys are conducted in other countries.[9] Alternatively, state and local-level surveys may produce more precise estimates on information about the pandemic experience specific to geographic areas, such as the California Health Interview Survey[10] and the New York City Department of Health surveys, among others. Indeed, many insightful findings have recently originated from single locations.[11,12]

Given the existing fragmented data collection systems, Hennessee et al. (p. 2127) describe how this complex infrastructure has led to drastic variations in surveillance practices, for example, in the analysis and public reporting of newly confirmed COVID-19 cases. On the other hand, building a single surveillance and survey infrastructure may be a tall order, and several challenges come to mind. Public health data, including case definitions of "suspected" and "confirmed" COVID-19 disease, are not collected or recorded in a standardized manner across all systems. Many systems do not allow for metadata on the type of test used to detect infection (e.g., antigen, polymerase chain reaction, and antibody). Data systems are not interoperable and lack a common data vocabulary to allow seamless data exchange. Data privacy, protection, and security safeguards need to be strengthened to promote public confidence. In addition, there is currently no unique national patient identifier to facilitate data linkage across systems to track the progression of COVID-19 at the patient level over time. One step toward a single repository to compile multiple data sources is the Centers for Disease Control and Prevention (CDC) COVID Tracker, a "one-stop shop" for visualizing data from core surveillance and survey systems to share critical COVID-19–related information.[13] The CDC also recently created a new National Center for Epidemic Forecasting and Outbreak Analysis, which will forecast and track hotspots for COVID-19 and other emerging public health threats.[14]

In the meantime, a key objective is to ensure that accurate, reliable, and timely data continue to be produced from existing surveillance and survey systems, ranging from vital statistics and health care encounter data to population-based surveys that include interviews or physical examinations. In this issue, AJPH asks those who conduct some of the nation's long-standing surveillance and survey programs how COVID-19 has affected their operations and what design modifications have been made to continue collecting data and perhaps even to expand their data collection in response to the pandemic.

  • Mortality Data. To track the impact of COVID-19 on US mortality, Ahmad et al. (p. 2133) describe how, within weeks of the first reported US cases, the National Center for Health Statistics (NCHS) made unprecedented strides to successfully develop death record certification guidance, adjust internal data processing systems, modernize vital statistics systems to increase interoperability, and quickly stand up a system to release daily updates of COVID-19 death counts.

  • National Health Care Surveys. Ward et al. (p. 2141) describe how, during COVID-19, survey operations had to be quickly modified to continue collecting the nation's data in ambulatory, hospital, and long-term care settings. For example, all in-person onsite interviews and health record abstraction were halted and replaced by telephone interviews. New COVID-19–related items were added regarding providers' experiences in delivering care during the pandemic, including telemedicine visits, shortages of personal protective equipment, inability to care for patients who tested positive for COVID-19, and knowledge of fellow providers or staff in their practice testing positive for COVID-19.

  • National Health and Nutrition Examination Survey (NHANES). Paulose-Ram et al. (p. 2149) describe how NHANES was suspended for a period of time because of COVID-19 and was able to resume operations in mid-2021. The newly designed NHANES 2021–2022 survey has changed its field operations to safely collect data at participants' homes and in mobile examination centers while adding new items on COVID-19, most notably, antibody testing that will provide data to produce national estimates on both natural infection and vaccine-induced immunity to the COVID-19 virus.

  • Medical Expenditure Panel Survey (MEPS). Zuvekas and Kashihara (p. 2157) describe how the MEPS successfully responded to challenges posed by COVID-19 by reengineering its field operations to complete data collection without in-person interviews and maintain data release schedules. Several enhancements were made to MEPS—such as adding survey items on telehealth visits, delays in accessing care because of COVID-19, and social determinants of health—to allow research on COVID-19's impact on health care consumers, employers, and the US health care system.

  • National Health Interview Survey (NHIS). Blumberg et al. (p. 2167) describe how the NHIS responded to COVID-19 challenges with operational changes to continue production in 2020. Because of expected delay in releasing the 2020 NHIS data files, the NCHS turned to two new online data collection platforms: the NCHS Research and Development Survey and the Census Bureau's Household Pulse Survey. The latter shows how a new rapid response survey can be launched expediently by an intergovernmental cooperative effort to assess the impact of the pandemic on individuals and households.

  • California Health Interview Survey (CHIS). Ponce et al. (p. 2122) describe how the CHIS navigated the challenges that COVID-19 posed to data collection from a representative sample of California's adults, adolescents, and children; integrated new COVID-19–related modules, particularly items specific to anti-Asian rhetoric and hate incidents targeting Asian, Native Hawaiian, and Pacific Islander communities; and introduced new monthly releases of preliminary COVID-19 data through a dashboard. They also discuss the future implications of findings from this period of data collection.

  • New York City Health Surveys. Levanon Seligson et al. (p. 2176) describe how New York City's Department of Health and Mental Hygiene has rapidly changed its existing surveys, such as the long-standing Community Health Survey, and added new ones like the Healthy NYC and the SARS-CoV-2 serosurvey to better understand the impact of the pandemic on physical health, mental health, and social determinants of health among New York City residents. Furthermore, seven New York City Health Opinion Polls were conducted in one year between March 2020 and March 2021 to collect information on COVID-19–related knowledge, attitudes, and opinions, including vaccine intentions.

Collectively, these national, state, and city surveillance and survey programs have demonstrated agility, resilience, innovation, and commitment in their efforts to meet their mission while incorporating new COVID-19–related items to monitor the pandemic and implementing new data collection, processing, and dissemination plans to release data in an even more timely manner. As more data become available, we will be able to further examine the impact on data quality from changes made to the nation's surveillance and survey systems, as well as the fuller extent and impact of COVID-19 on the health of the nation. This collection of experiences is expected to assist future surveillance and survey managers in pandemic contingency planning.