COVID-19 Infections and Deaths Among Connecticut Nursing Home Residents: Facility Correlates

Yue Li, PhD; Helena Temkin-Greener, PhD; Gao Shan, MS; Xueya Cai, PhD


J Am Geriatr Soc. 2020;68(9):1899-1906. 


Data Sources and Variables for Connecticut Nursing Homes

We first obtained data on COVID-19 laboratory-confirmed cases and associated deaths in each of Connecticut's nursing homes as of April 16, 2020. These data were collected and regularly updated by the Connecticut Department of Health and Human Services and were also published in a recent news report.[24] These data were then linked to the Nursing Home Compare (NHC) data files (updated on March 31, 2020),[25] a data system published by the Centers for Medicare & Medicaid Services (CMS) for tracking and reporting nurse staffing, quality of care, findings of state government inspections of nursing home care practices, and other facility characteristics.[16]

We obtained the following variables in the NHC files that were potentially associated with resident safety and health outcomes:[12,13,15–22] total number of beds, average daily resident census, ownership status (for profit vs nonprofit or government owned), affiliation with a chain (yes/no), percentage of Medicare residents, percentage of Medicaid residents, average staffing level (hours per resident day) for RNs in 2019, average total nurse staffing level (including RNs, licensed practical nurses, and certified nursing assistants) in 2019, and five-star ratings for overall quality of care.

The RN and other nurse staffing levels were calculated based on daily resident census and the CMS Payroll-Based Journal system that allows nursing homes to submit electronically the number of hours that care workers (including agency and contract staff) are paid to work each day.[26] As required by Section 6106 of the Affordable Care Act, the payroll-based data are submitted quarterly on different types of nursing staff and subject to audit to ensure accuracy. The five-star ratings aggregate alternative nursing home quality measures into a rating system of one to five stars, with more stars indicating better quality.[27] Specifically, ratings were first developed to summarize three separate domains of "quality": deficiency citations assigned during annual and complaint inspections; a set of clinical outcomes of residents based on Minimum Data Set (MDS) assessments; and nurse staffing to resident ratios. The overall five-star ratings were then derived from these domain-specific ratings using a CMS-developed algorithm.[27]
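As a rough illustration of the staffing measure, the sketch below computes hours per resident day as total paid nursing hours divided by total resident-days over a reporting period. The function name and input lists are hypothetical, and the exact CMS aggregation rules are not described here; the code assumes only the simple ratio implied by the description above.

```python
def hours_per_resident_day(daily_paid_hours, daily_census):
    """Return paid nursing hours per resident day over a period.

    Assumption: the measure is total paid hours divided by total
    resident-days; the actual CMS calculation may differ in detail.
    """
    if len(daily_paid_hours) != len(daily_census):
        raise ValueError("hours and census must cover the same days")
    resident_days = sum(daily_census)
    if resident_days == 0:
        raise ValueError("census sums to zero; the ratio is undefined")
    return sum(daily_paid_hours) / resident_days


# Hypothetical two-day example: RN hours paid and residents in the facility.
rn_hprd = hours_per_resident_day([40.0, 50.0], [100, 100])
```

In this made-up example, 90 paid RN hours over 200 resident-days gives 0.45 hours per resident day.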

We used data from the website to obtain two additional variables. These data were created by the Center for Gerontology and Healthcare Research at Brown University by combining multiple sources of resident and nursing home records, such as the MDS for all nursing home residents, the Medicare enrollment file, and CMS-maintained facility files. The first variable we obtained was the percentage of racial and ethnic minority residents (African Americans, Hispanics, Asians or Pacific Islanders, and American Indians or Alaskan Natives) in the nursing home, originally defined using the race and ethnicity information in the MDS and Medicare enrollment databases. We also obtained a variable for facility-level case mix, derived from the resource utilization group classification of all residents in the nursing home; the case-mix index was calculated by averaging the acuity scores of all residents in the facility (each score approximated by the relative staff time demanded by the resident), with higher values indicating higher average acuity.

County COVID-19 Infections and Demographics

We downloaded data on the numbers of laboratory-confirmed COVID-19 cases and deaths in all counties in the United States, as published by the New York Times. These numbers have been compiled and updated in real time by the Times based on reports from state and local health agencies since the first reported coronavirus case in Snohomish County, Washington, on January 20, 2020.[28] Using these data, we calculated, for each of the eight counties in Connecticut, the total number of confirmed COVID-19 cases as of April 16, 2020 (with and without nursing home confirmed cases). Similarly, we calculated the total number of COVID-19 deaths in the county, with and without nursing home deaths. Lastly, we used the 2018 Area Healthcare Resource File to obtain information about county population size.

Statistical Analyses

We examined the distributions of nursing home COVID-19 confirmed cases and deaths, as well as other nursing home characteristics overall, by number of confirmed cases (0, 1–10, and 11–69) and by number of COVID-19–related deaths (0, 1–5, and 6–15). We also summarized county characteristics and plotted the percentages of COVID-19 cases and deaths that were attributed to nursing home residents in each county.
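The county-level quantities can be sketched in a few lines: the count excluding nursing home residents and the percentage of the county total attributed to nursing home residents. The function name and the numbers are hypothetical and are not study data.

```python
def nursing_home_share(county_total, nursing_home_total):
    """Split a county total into nursing home and other parts.

    Returns a tuple: (count outside nursing homes,
                      percent of county total attributed to nursing homes).
    """
    if nursing_home_total > county_total:
        raise ValueError("nursing home count cannot exceed county total")
    outside = county_total - nursing_home_total
    # Guard against division by zero for a county with no reported cases.
    pct = 100.0 * nursing_home_total / county_total if county_total else 0.0
    return outside, pct


# Made-up numbers for illustration only.
outside_cases, pct_nh = nursing_home_share(county_total=2000,
                                           nursing_home_total=250)
```

The same function applies unchanged to deaths, since both summaries use the same arithmetic.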

Multivariable analyses determined the associations of four independent variables with numbers of confirmed cases and deaths in nursing homes (dependent variables in separate models). The four independent variables were nursing home RN staffing, overall quality of care (measured by overall five-star ratings and categorized as four- or five-star facilities vs one- to three-star facilities), concentration of Medicaid residents (categorized as facilities in the top quartile group vs other facilities), and concentration of racial and ethnic minority residents (facilities in the top quartile group vs other facilities).
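The categorical coding of these variables might look like the following minimal sketch. The quartile method is an assumption: the text does not say how the top-quartile cutoff was computed, so the code uses the inclusive method of Python's statistics.quantiles. All function and variable names are hypothetical.

```python
from statistics import quantiles


def top_quartile_flags(values):
    """Flag facilities above the 75th percentile (1 = top quartile group).

    Assumption: the cutoff is the third quartile from the 'inclusive'
    method; the original analysis may have used a different definition.
    """
    q3 = quantiles(values, n=4, method="inclusive")[2]
    return [1 if v > q3 else 0 for v in values]


def high_star_flag(overall_stars):
    """Return 1 for four- or five-star facilities, 0 for one- to three-star."""
    if overall_stars not in (1, 2, 3, 4, 5):
        raise ValueError("star rating must be an integer from 1 to 5")
    return 1 if overall_stars >= 4 else 0


# Hypothetical percent-Medicaid values for eight facilities.
pct_medicaid = [10, 20, 30, 40, 50, 60, 70, 80]
medicaid_flags = top_quartile_flags(pct_medicaid)
star_flags = [high_star_flag(s) for s in [1, 3, 4, 5]]
```

The same top-quartile helper would apply to the percentage of racial and ethnic minority residents.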

We fit separate two-part models at the nursing home level to account for the fact that a relatively large number of nursing homes had zero confirmed cases and deaths.[29] The first part of the models was a generalized linear model with a logit link function and an assumed binomial distribution that estimated the likelihood of a nursing home having at least one confirmed case (or death). The second part was a count model assuming a Poisson distribution that estimated the number of cases (or deaths), conditional on at least one confirmed case (or death) having been found in the nursing home as of April 16, 2020. Both parts of the model controlled for the same nursing home covariates (Table 1) as well as two county covariates: the total number of confirmed cases (or deaths) in the county other than nursing home cases (deaths), and county population size. After model estimation, we obtained the predicted counts of confirmed cases and deaths for all nursing homes and plotted each predicted count against the independent variables.
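One common way to turn a fitted two-part model into a predicted count is to multiply the first-part probability of any case by the second-part conditional mean. The sketch below shows that arithmetic with hypothetical coefficients. The text does not state how the two parts were combined, so this combination rule is an assumption, as is treating the Poisson mean as the conditional mean without a zero-truncation adjustment.

```python
import math


def linear_predictor(intercept, coefs, covariates):
    """Return intercept + sum(coef * covariate)."""
    return intercept + sum(b * x for b, x in zip(coefs, covariates))


def predicted_count(covariates, logit_params, poisson_params):
    """Expected count under a two-part model.

    logit_params and poisson_params are (intercept, coefs) pairs.
    Assumption: E[count] = P(count > 0) * E[count | count > 0].
    """
    # Part 1: probability of at least one case (logit link).
    eta_any = linear_predictor(*logit_params, covariates)
    p_any = 1.0 / (1.0 + math.exp(-eta_any))
    # Part 2: mean count among facilities with at least one case (log link).
    eta_count = linear_predictor(*poisson_params, covariates)
    mean_if_any = math.exp(eta_count)
    return p_any * mean_if_any


# Hypothetical coefficients, not estimates from the study.
example = predicted_count(
    covariates=[1.0],
    logit_params=(0.0, [0.0]),               # P(any case) = 0.5
    poisson_params=(math.log(4.0), [0.0]),   # mean of 4 cases if any
)
```

With these made-up parameters the facility has a 50% chance of any case and an expected 4 cases if it has any, so the predicted count is 2.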