Basu, A. (2020) - Estimating The Infection Fatality Rate Among Symptomatic COVID-19 Cases

10/7/2020 Estimating The Infection Fatality Rate Among Symptomatic COVID-19 Cases In The United States | Health Affairs
   
RESEARCH ARTICLE COVID-19

HEALTH AFFAIRS > VOL. 39, NO. 7: FOOD, INCOME, WORK & MORE
Estimating The Infection Fatality Rate Among Symptomatic COVID-19 Cases In The United States
Anirban Basu
AFFILIATIONS

Anirban Basu (basua@uw.edu) is the Stergachis Family Endowed Director and Professor of Health Economics
at the Comparative Health Outcomes, Policy, and Economics Institute in the School of Pharmacy, University of
Washington, in Seattle.
PUBLISHED: MAY 07, 2020 Free Access https://doi.org/10.1377/hlthaff.2020.00455

ABSTRACT
Knowing the infection fatality rate (IFR) of novel coronavirus (SARS-CoV-2) infections is
essential for the fight against the coronavirus disease (COVID-19) pandemic. Using data
through April 20, 2020, I fit a statistical model to COVID-19 case fatality rates over time at
the US county level to estimate the COVID-19 IFR among symptomatic cases (IFR-S) as
time goes to infinity. The IFR-S in the US was estimated to be 1.3 percent. County-specific
rates varied from 0.5 percent to 3.6 percent. The overall IFR for COVID-19 should be
lower when I account for cases where patients are asymptomatic and recover without
symptoms. When used with other estimating approaches, my model and estimates can
help disease and policy modelers obtain more accurate predictions for the epidemiology
of the disease and the impact of various policy levers to contain the pandemic. The
model could also be used with future pandemics to get an early sense of the magnitude
of symptomatic infection at the population level before other direct estimates are
available. Substantial variation across patient demographics likely exists and should be
the focus of future studies.
TOPICS
COVID-19 | CORONAVIRUS | DISEASES | PANDEMICS | EPIDEMIOLOGY | POPULATIONS | PATIENT TESTING | MORTALITY | COMORBIDITY | CODE OF FEDERAL REGULATIONS
Knowing the infection fatality rate (IFR) of novel coronavirus (SARS-CoV-2) infections is
essential for the fight against the coronavirus disease (COVID-19) pandemic.1,2 A
substantial amount of uncertainty in projecting the effects of the pandemic at the
population level and the impact of public policies and directives, such as physical
distancing measures, as well as the impact of potential future shortages of health care
supply pivots around the uncertainty of this parameter. The IFR is the ratio of two
numbers: the number of deaths caused by COVID-19 (numerator) and the total number of
people in the population who were genuinely infected by the virus (denominator).
However, for many reasons, both the numerator and the denominator of the IFR are
measured with error. For example, errors in the denominator arise because patients
remain asymptomatic during the first few days of the infection, testing is not universal
and is selective at best, and longitudinal data on patients with COVID-19 are unavailable
at the national level.3 Measurement errors may also exist in the numerator because of
the undercounting of deaths due to social isolation and other factors and because some
COVID-19-related deaths are attributed to other factors.4 As a consequence, the reported
case fatality rate (CFR) for COVID-19, which is an estimate based on the reported number
of COVID-19-related deaths and the reported number of cases that were laboratory
confirmed as COVID-19 infections, provides a biased estimate of the IFR. It could be
biased upward because the actual number of individuals who are infected is not known.
It also could be biased downward because some of those who are currently infected
could die in the future or because deaths are undercounted. The upward bias is likely to
be much larger during the early phase of testing. Most estimates of the COVID-19 fatality
rate currently available around the world suffer from these biases.5
In this article I try to overcome these biases using national US data on counts of reported
deaths and detected COVID-19 cases and the temporality of the reported case fatality
rate (that is, its variability over time) to make inferences about the infection fatality rate
for COVID-19. My method does not account for a fraction of cases with COVID-19
infection where patients recover without any major symptoms. These asymptomatic
patients do not contribute to any of the reported statistics on COVID-19 deaths and
cases. A true IFR should include these patients in the denominator. However, in this
article, because I try to eliminate measurement errors in reported CFRs based on trends
in reported COVID-19 deaths and cases, I am unable to account for this fraction of the
population that remains asymptomatic with infections. As a consequence, what I
estimate is the IFR among symptomatic COVID-19 cases (IFR-S) or the “true” case
fatality rate, where no reporting errors are present.
STUDY DATA AND METHODS

ASSUMPTIONS
I make three assumptions for this analysis. First, errors in the numerator and the
denominator lead to underreporting of true COVID-19 deaths and cases, respectively, and
the error is smaller for deaths than for cases. Second, both the errors are declining over
time. Finally, the errors in the denominator are declining at a faster rate than the error in
the numerator.
The first assumption is self-evident: both deaths and actual cases are undercounted
during the initial phase of the epidemic.3,4 Because deaths are much more visible events
than infections, which, in the case of COVID-19, can be asymptomatic during the first few
days of infection, I posit that at any point in time, the errors in the denominator are larger
than the errors in the numerator. Hence, this assumption leads to CFR estimates being
larger than the IFR-S, which is typically believed to be true, according to observed data.
The second assumption is my central assumption, stating that under some stationary
processes of care delivery, health care supply, and reporting, which are all believed to be
improving over time, the errors in both the numerator and the denominator are declining.
It implies that the measurement of both the numerator and the denominator are
improving over time, albeit at different rates in different jurisdictions.
The third assumption posits that the error in the denominator is declining faster than the
error in the numerator. This assumption implies that case fatality rates, based on the
number of cumulative COVID-19 deaths and the number of cumulative reported COVID-19
cases, decline over time, a pattern that is confirmed by my observed data (described in
detail below).
If these simple assumptions hold, these methods allow me to project the IFR at the limit
when time goes to infinity and errors reduce to zero. That is not to say that, in practicality,
I expect that in the future the US would ever reach a point of universal testing or
comprehensive reporting of all COVID-19 deaths. However, as long as improvements
occur in identifying all symptomatic cases and reporting all COVID-19 deaths, the errors
will be reduced, enabling inferences about the IFR-S by fitting models to the temporality
of the CFR and thereby projecting the expected rate at infinite time.
However, this stationary process of declining errors may be disrupted in certain regions
and over certain periods by shortages of testing supplies, leading to artificial increases in
the CFR after days of decline. I applied specific criteria to identify these regions and
periods so that I could exclude these data from my analysis.
The online appendix contains a detailed mathematical formulation of these assumptions
and their implications.6 I used an exponential decay formulation within a logit framework
to model this decay to estimate the asymptote parameter. In other words, given the fact
that reported case fatality rates are observed to decline over time early in an epidemic, I
was able to use statistical techniques to estimate the potential value of those reported
rates as they approach their limit when the decline continues over a long time horizon.
This rate at the limit presumably reflects the value of the IFR-S. More important, I allowed
for heterogeneity in decay across US counties to estimate my target parameter. My
estimate of IFR-S is less sensitive to the missing deaths that will occur in the future from
the cases detected in the last two weeks of the limit (discussed in more detail in the
Discussion section). I validated my predictions on the basis of declining observed rates
during the future dates that were not used for estimation purposes.
DATA
I used publicly reported data, located in GitHub, from both the Johns Hopkins
Repository7 and the New York Times8 on the total number of cumulative deaths and
detected cases by day for each US county. I updated missing values from one repository
using the nonmissing values from the other repository by date and county. Moreover, for
any date and county, the maximal value reported for deaths or detected cases in either
repository was used. A rate variable was constructed by dividing the cumulative total
number of deaths by the cumulative total number of detected cases for each date and
county. The first diagnosed case of COVID-19 in the US occurred January 21, 2020, and
the first death occurred February 28, 2020, both in Washington State, although new data
are showing that earlier cases may have existed in California.9 Because testing was
nonexistent during the initial few days, the data showed that the ratio of deaths to cases
increased for the first few days for many counties. Therefore, for each county, my
analysis started from the day when the first zenith in this rate was reached. It is assumed
that declining error rates within each county began from that day forward, driven by
better reporting of COVID-19 deaths and cases. Only counties that had reported at least
five COVID-19 deaths and thirty cases before April 20, 2020, were retained.
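As a rough illustration of the data preparation described above, the Python sketch below merges two cumulative series by taking the larger value for each date, computes the ratio of cumulative deaths to cumulative cases, and locates the first local maximum of that ratio. The function names, the toy counts, the treatment of a missing date as zero, and the reading of "first zenith" as the first local maximum are all assumptions made for illustration; none of them is taken from the article or its appendix.

```python
def merge_cumulative(source_a, source_b):
    """Combine two {date: (deaths, cases)} series, keeping the larger value per date.

    Dates missing from one source are treated as (0, 0) there (an assumption),
    so the other source's value is used.
    """
    merged = {}
    for date in sorted(set(source_a) | set(source_b)):
        deaths_a, cases_a = source_a.get(date, (0, 0))
        deaths_b, cases_b = source_b.get(date, (0, 0))
        merged[date] = (max(deaths_a, deaths_b), max(cases_a, cases_b))
    return merged


def rate_series(merged):
    """Return [(date, cumulative deaths / cumulative cases)] for dates with cases."""
    return [(date, d / c) for date, (d, c) in merged.items() if c > 0]


def first_zenith_index(rates):
    """Index of the first local maximum of the rate, or None if it never declines."""
    for i in range(len(rates) - 1):
        if rates[i][1] > 0 and rates[i + 1][1] < rates[i][1]:
            return i
    return None


# Hypothetical counts for one county from two hypothetical repositories.
repo_a = {"2020-03-20": (1, 10), "2020-03-21": (2, 12), "2020-03-22": (2, 30)}
repo_b = {"2020-03-21": (2, 15), "2020-03-22": (3, 40), "2020-03-23": (3, 60)}

merged = merge_cumulative(repo_a, repo_b)
rates = rate_series(merged)
zenith = first_zenith_index(rates)
print(rates[zenith])  # the date and rate from which the decay analysis would start
```

ISO-formatted date strings are used so that sorting them alphabetically also sorts them chronologically.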
Moreover, I was aware that sudden areawide shortages in testing kits could artificially
raise the CFR after days of decline, and therefore bias my decay analysis. That is why I
also removed counties that reported at least a one-standard-deviation increase in the
CFR for seven or more days after reaching the CFR nadir. In addition, among the
remaining counties, I removed the last seven days of follow-up if CFRs were found to
increase consecutively for three or more days during that week. Last, I retained counties
that had at least six follow-up days of reported data after reaching the zenith.
STATISTICAL MODEL
I modeled these rates over time for each county, using a binomial model for the counts of
deaths over the counts of detected cases; that is, $\text{Deaths}_{jt}$ are distributed as
binomial($p_{jt}$, $\text{Detected}_{jt}$), where $j$ denotes the counties and $t$ denotes the number of
days from the zenith value of rate within a county (Days). The mean of this binomial model, $p_{jt}$,
represents the probability of death and is expressed as a Bayesian random coefficients
exponential decay model within a logit link framework, so that the predicted rates remain
within 0 and 1. Specifically, my mean model estimated the probability of death in county $j$
at time $t$:
$$p_{jt} = \mathrm{logit}^{-1}\left(A_{1j} + (A_{2j} - A_{1j}) \times \exp\left(-\exp(A_{3j})\,(\text{Days}_{jt} - 1)\right)\right)$$
The $A_{ij}$ represent the actual death rates for specific counties.
The main feature of the decay model is that as time (Days) goes to infinity, under the
assumption that errors in both the numerator and the denominator go to zero, the
cumulative reported CFR would approach an estimate of the true IFR-S in the population.
Specifically, in this model: $\mathrm{logit}^{-1}(A_{1j})$ = county-specific IFR-S, as Days goes to infinity;
$\mathrm{logit}^{-1}(A_{2j})$ = county-specific expected zenith rate, when Days = 1; and $-\exp(A_{3j})$ =
county-specific exponential decline rate in the CFR, parameterized such that it takes on
negative values only.
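The behavior of this mean function at its two ends can be checked numerically. The sketch below is only an illustration: the function names and the three parameter values (an asymptote of 1.3 percent, a zenith of 10 percent, and a decline rate of 0.15 per day) are hypothetical choices, not estimates reported in the article.

```python
import math


def inv_logit(x):
    """Inverse logit, mapping the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Logit, mapping (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def expected_cfr(day, a1, a2, a3):
    """Mean CFR on a given day under the exponential decay model on the logit scale."""
    decay = math.exp(-math.exp(a3) * (day - 1))
    return inv_logit(a1 + (a2 - a1) * decay)


# Hypothetical county-level parameters (assumed for illustration only).
a1 = logit(0.013)    # asymptote: the rate approached as Days grows large
a2 = logit(0.10)     # expected zenith rate at Days = 1
a3 = math.log(0.15)  # log of the daily decline rate

curve = [expected_cfr(day, a1, a2, a3) for day in (1, 7, 14, 30, 365)]
for day, value in zip((1, 7, 14, 30, 365), curve):
    print(day, round(value, 4))
```

At Days = 1 the decay term equals 1, so the curve starts at the zenith rate; as Days grows the decay term vanishes and the curve settles at the asymptote, which is the quantity the article interprets as the IFR-S.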
The overall US-specific IFR-S can be expressed as $\mathrm{logit}^{-1}(b_1)$, where $A_{1j}$ is distributed as
normal($b_1$, 1). Hyperpriors for coefficients were based on Cauchy distributions, as
recommended in the Bayesian literature for logistic models.10 Prior sensitivity analyses
were carried out based on using normal or uniform distribution for the hyperpriors.
Further details about the model are in the appendix.6 I used the Metropolis-Hastings
algorithm to estimate this model, using three simultaneous Monte Carlo chains and
10,000 deviates for each chain, 10,000 burn-in runs, and a thinning of 100.
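For readers unfamiliar with the sampler, the following is a minimal random-walk Metropolis sketch for a single county. It is not the estimation code used in the article: it drops the hierarchical structure across counties, replaces the Cauchy hyperpriors with independent normal priors (an assumption), uses one short chain instead of three long ones, and runs on made-up data.

```python
import math
import random


def inv_logit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def log_posterior(theta, data, prior_sd=10.0):
    """Binomial log-likelihood plus assumed independent normal(0, prior_sd) priors."""
    a1, a2, a3 = theta
    total = -sum(v * v for v in theta) / (2.0 * prior_sd ** 2)
    for day, deaths, cases in data:
        decay = math.exp(-math.exp(a3) * (day - 1))
        p = inv_logit(a1 + (a2 - a1) * decay)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += deaths * math.log(p) + (cases - deaths) * math.log(1.0 - p)
    return total


def metropolis(data, n_iter=5000, step=0.1, seed=0):
    """Random-walk Metropolis with Gaussian proposals; returns every draw."""
    rng = random.Random(seed)
    theta = [-4.0, -2.0, -2.0]  # arbitrary starting values
    current = log_posterior(theta, data)
    draws = []
    for _ in range(n_iter):
        proposal = [v + rng.gauss(0.0, step) for v in theta]
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(tuple(theta))
    return draws


# Hypothetical (day, cumulative deaths, cumulative cases) for one county.
toy_county = [(1, 10, 100), (5, 14, 200), (10, 20, 400), (15, 28, 700), (20, 36, 1000)]

draws = metropolis(toy_county)
kept = draws[1000::10]  # discard burn-in draws, then thin
asymptote_draws = [inv_logit(a1) for a1, _, _ in kept]
print(sum(asymptote_draws) / len(asymptote_draws))  # posterior mean of the asymptote
```

The burn-in and thinning steps mirror the ideas named in the text, with far smaller numbers so that the sketch runs quickly.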
I used data up to April 20, 2020, for my training sample to estimate my model. Model fit
was assessed using posterior predictions from the model against four consecutive
follow-up days for each county. For most counties, these days were April 21–24, 2020.
LIMITATIONS
There were several limitations to my analysis. First, I acknowledge that my estimate of
the IFR-S would be higher than the true overall IFR. This is because my model relied on
identified cases that are presumably all symptomatic patients with COVID-19. Therefore,
even at the limit, my estimated rate would not include the fraction of patients who may
have the infection but who remain asymptomatic and recover. My estimate would,
however, include patients who start with an asymptomatic infection but become
symptomatic later. The magnitude of the truly asymptomatic fraction in
COVID-19 remains unclear: populationwide antibody testing would be needed to
establish this statistic. Results from serotesting from the Diamond Princess cruise ship
outbreak suggest that about 17.9 percent of infected people never developed
symptoms.11 As a consequence, a reasonable estimate of the overall IFR would be about
20 percent lower than my estimated IFR-S.
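The adjustment from IFR-S to overall IFR can be written out as simple arithmetic. The sketch below assumes that people who never develop symptoms do not die of the infection and that the Diamond Princess fraction applies to the US population; both are simplifying assumptions for illustration, and the variable names are hypothetical.

```python
ifr_s = 0.013                   # estimated IFR among symptomatic cases (1.3 percent)
asymptomatic_fraction = 0.179   # share of infections that never develop symptoms

# If asymptomatic infections contribute no deaths, deaths are unchanged while the
# denominator grows from symptomatic infections to all infections.
overall_ifr = ifr_s * (1.0 - asymptomatic_fraction)
relative_reduction = 1.0 - overall_ifr / ifr_s

print(round(overall_ifr * 100, 2))         # overall IFR in percent
print(round(relative_reduction * 100, 1))  # reduction relative to IFR-S in percent
```

Under these assumptions the overall IFR comes out near 1.07 percent, a reduction of 17.9 percent, which the text rounds to about 20 percent.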
Second, my estimated COVID-19 IFR-S may be slightly conservative. My approach
controlled for the upward bias that would arise if raw rates were used. I did not control for
the downward bias that may arise because some of the detected cases may become
deaths in the future. Recently, Nick Wilson and colleagues attempted to address this
downward bias by estimating lagged death rates based on international data; the
estimated effect ranged from 0.8 percent in China (excluding Hubei Province) to
4.2 percent in eighty-two other countries and territories.12 That analysis used a time lag
of thirteen days,12 based on reported data from China on the time from radiologic
confirmation of COVID-19 to death.13 However, as the distribution of time to death varies
over this thirteen-day follow-up period, such a correction could give nonsensical results
when applied to US data that include less than sixty days of COVID-19 history. The death
rate was estimated to be higher than 1 on specific days for many US counties when such
a correction was applied. In general, I believe that the downward bias generated as a
result of the missing deaths at the limit should be small. This is because my estimate
represents the death rate at an asymptote, which is what would happen with many days
of accumulated data on the number of detected cases and deaths. At that point, the
additional number of deaths from the last two weeks of detected cases would
contribute very little to my overall estimate of the IFR-S.
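To see how a fixed-lag correction can produce rates above 1 early in an outbreak, consider the toy calculation below. It divides cumulative deaths on a given day by cumulative cases thirteen days earlier, which is one simple way to implement a lagged rate; the function name and the counts are hypothetical and are not county data from the article.

```python
def lagged_cfr(cum_deaths, cum_cases, lag):
    """Cumulative deaths on day t divided by cumulative cases on day t - lag."""
    rates = []
    for t in range(lag, len(cum_deaths)):
        if cum_cases[t - lag] > 0:
            rates.append(cum_deaths[t] / cum_cases[t - lag])
    return rates


# Hypothetical early-outbreak counts with rapidly growing case detection.
cum_cases = [1, 2, 3, 5, 8, 12, 20, 35, 60, 100, 160, 250, 400, 600, 900, 1300]
cum_deaths = [0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 3, 4, 6, 9, 13, 18]

lagged = lagged_cfr(cum_deaths, cum_cases, lag=13)
crude = cum_deaths[-1] / cum_cases[-1]

print(lagged)           # [9.0, 6.5, 6.0], all far above 1
print(round(crude, 4))  # crude cumulative rate on the last day
```

Because very few cases had been detected thirteen days earlier, the lagged denominator is tiny relative to current deaths, which is the kind of nonsensical result the text describes.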
Third, what I present here are crude IFR-Ss, and not even age-adjusted ones. I did not
have any data to assess the distribution of IFR-S across age and comorbidity profiles of
patients. One would need, ideally, individual-level data and, at the least, group-specific
data to estimate such dispersion; these data are not publicly available.14,15 The Centers
for Disease Control and Prevention reports significant variation in fatality rates by age
groups.16 Further work is required on this front.
STUDY RESULTS
Of 3,020 US counties, 1,364 counties reported any confirmed COVID-19 case by April 20,
2020. Of these counties with confirmed cases, 134 reported no COVID-19 deaths until
that time; 1,034 counties had any reported COVID-19 deaths by April 20; and 397
counties reported exactly one COVID-19 death by this date. By April 20 there were
753,113 confirmed COVID-19 cases and 41,287 reported COVID-19 deaths. My analysis
included 116 counties. Interestingly, I did not include New York County, New York
(Federal Information Processing Standards code 36061, which does not represent all of
New York City) in my analysis, despite its having the highest number of cases and deaths
in the country. The number of deaths in this county was rising at a faster rate than the
number of detected cases until April 20, 2020; hence, the case fatality rate had not
reached a zenith. Overall, a total of 40,835 confirmed cases and 1,620 confirmed deaths
until April 20 were used for my analysis (see the appendix).6
The 116 counties selected spanned 33 states, with Georgia contributing the maximum
with 13 counties, followed by Louisiana with 9 and then South Carolina with 8. After
reaching their initial zenith, CFRs were found to be declining within each of these retained
counties, supporting my assumptions about the differential declining error rate between
the numerator and denominator of the CFRs for these counties. The appendix contains a
description of growth in COVID-19 reported cases and deaths and the decline in the
CFRs.6 At the zenith of the computed rate in each county, the rate variable varied from
1.7 percent to 33.3 percent. By the end of follow-up, the rate varied from 0.9 percent to
19.3 percent across counties. The number of follow-up days ranged from seven to thirty-
one (see the appendix).6
The Bayesian model showed good convergence and mixing properties between the
model and the observations. Gelman-Rubin statistics were below 1 for each of the
parameters of the model, indicating that the three independent Monte Carlo chains
overlapped and converged to similar posterior distributions for the parameters. The
appendix presents these results,6 including residual analysis based on fitted posterior
means (means predicted by the model for the period before the validation phase) from
my prediction model, which appears to fit the county-level data well over time. The
posterior mean of the US-specific IFR-S was estimated to be 1.3 percent (median,
1.3 percent; standard deviation: 0.4), with a 95% central credible interval of 0.6–2.1
(exhibit 1).
Exhibit 1 Estimated COVID-19 infection fatality rates among symptomatic patients (IFR-S) for the twenty counties examined with the lowest rates plus the US overall
SOURCE Authors’ analysis of publicly available data on COVID-19 counts of cases and deaths.
NOTE Point estimates are posterior means; bars are 95% central credible intervals.
The posterior means and the 95% central credible intervals of county-specific IFR-Ss for
the twenty counties I examined with the lowest rates (0.5–1.4 percent) plus the overall
values for the US and the twenty-one counties I examined with the highest rates (2.3–
3.6 percent) are shown in exhibits 1 and 2, respectively. The IFR-S for other counties in
the middle that are not shown ranged from 1.5 percent to 2.2 percent and are described
in the appendix.6 The lowest rate was estimated to be in Putnam County, New York
(0.5 percent; 95% central credible interval, 0.1–1.0), whereas the highest was estimated
to be in King County, Washington (3.6 percent; 95% central credible interval, 0.5–6.1).
Data at the county level are still evolving, and hence considerable uncertainty exists for
some counties, especially toward the higher range of IFR-S estimates. Because these
estimates represent the crude IFR-S, many factors contribute to their variation across
counties, including demographics (especially age distribution), levels of population
health, and supply of health care services. In that sense, the IFR-S is a dynamic quantity
even within a county, depending on how the case-mix of the infected population shifts
over time.
Exhibit 2 Estimated COVID-19 infection fatality rates among symptomatic patients (IFR-S) for the twenty-one counties examined with the highest rates
NOTE Point estimates are posterior means; bars are 95% central credible intervals.
To assess the validity of my prediction model, I forecasted county-specific reported case
fatality rates based on the posterior predictive mean over the course of four days after
the estimation time window for each county and compared those rates with observed
rates during these days.17 Exhibit 3 presents the comparison between predicted and
observed case fatality rates, with each dot representing a rate for a county on a given
day. The exhibit shows that the 95% central credible intervals from the posterior
predictive distribution from the model were able to capture the true CFRs (represented by
the 45-degree diagonal line) for all counties over these four days. The Bayesian posterior
predictive two-sided p values18,19 were not below 0.05 for any of the 116 counties on
any of the four days.
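The two checks described here, interval coverage and a two-sided posterior predictive p value, can be sketched from a set of replicated draws. The code below is an illustration under assumptions: the helper names are hypothetical, the interval uses a simple order-statistic rule, the p value uses the common convention of doubling the smaller tail probability, and the "replicated" values are a deterministic toy list rather than model output.

```python
def central_interval(draws, level=0.95):
    """Central interval from sorted draws, cutting equal mass from each tail."""
    ordered = sorted(draws)
    k = int(round((1.0 - level) / 2.0 * (len(ordered) - 1)))
    return ordered[k], ordered[len(ordered) - 1 - k]


def two_sided_p(draws, observed):
    """Two-sided posterior predictive p value: twice the smaller tail probability."""
    n = len(draws)
    upper = sum(1 for d in draws if d >= observed) / n
    lower = sum(1 for d in draws if d <= observed) / n
    return min(1.0, 2.0 * min(upper, lower))


# Toy stand-in for replicated death counts drawn from a posterior predictive distribution.
replicated = list(range(1, 101))

low, high = central_interval(replicated)
print(low, high)                      # 3 98
print(two_sided_p(replicated, 50))    # observation near the middle of the draws
print(two_sided_p(replicated, 100))   # observation at the extreme upper end
```

An observed value is "captured" when it falls between the two interval endpoints, and a small p value flags an observation that sits in the far tail of the replicated draws.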
Exhibit 3 Predicted COVID-19 case fatality rates by county versus observed rates for
the first four consecutive dates that were not used for estimation
NOTES Results from Bayesian mixed-effects nonlinear model. Each symbol represents the
point estimate of a case fatality rate (CFR) for a county on a given day; bars are 95% central
credible intervals.
DISCUSSION
After I modeled the available national data on cumulative deaths and detected COVID-19
cases in the United States, the symptomatic infection fatality rate from COVID-19 was
estimated to be 1.3 percent. This estimated rate is substantially higher than the
approximate IFR-S of seasonal influenza, which is about 0.1 percent20 (34,200 deaths
among 35.5 million patients who got sick with influenza). Influenza is also believed to be
completely asymptomatic in 16 percent of the infected population,21 and this fraction is
not included in the calculation of its IFR-S.22 My COVID-19 IFR-S estimate is not outside
the ballpark of estimates becoming available from other countries, but it is certainly
lower, as is expected from addressing the upward bias in those estimates. For example,
the COVID-19 fatality rate for China (without correction for the upward bias inherent in
looking at observed rates) was initially reported to be 5.6 percent (95% CI, 5.4–5.8).23 By
February 20, 2020, however, the crude fatality rate for China was estimated to be
3.8 percent.24 The fatality rate outside China was estimated to be 15.2 percent (95% CI,
12.5–17.9),23 which may be due to the more considerable upward bias during the
beginning part of the pandemic within a country. The same patterns occur in the United
States, with observed rates being much higher during the initial part of the pandemic. A
recent estimate of the CFR using individual-level data from Wuhan residents and from
international Wuhan residents who repatriated on six flights found it to range from
0.66 percent to 1.4 percent.25
Consider a thought experiment in which 35.5 million people contract COVID-19 this year in the
US (that is, the same number as were infected with influenza last year).20 In the
absence of any mitigation strategies or distancing behaviors and with the supply of
health care services under typical conditions, my IFR-S estimate predicts that there
would be nearly 500,000 COVID-19 deaths in the US in 2020. To the extent that COVID-19
is more infectious than influenza and that we have no protection in the form of a vaccine
or treatment, the number of infections, and hence the number of deaths, would be
higher than for influenza. Certainly, with the implementation of mitigation strategies,
the death toll will be lower. For example, the March 31 White House Coronavirus Task
Force projections of 100,000–200,000 deaths from COVID-19 in 2020 were made using
assumptions about the effectiveness of distancing directives and measures currently in
place.26
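The arithmetic behind the thought experiment and the influenza comparison is short enough to write out. The sketch below uses only figures quoted in the text; the variable names are hypothetical.

```python
symptomatic_cases = 35_500_000  # people who got sick with influenza in 2018-2019
flu_deaths = 34_200             # influenza deaths in the same season
covid_ifr_s = 0.013             # estimated COVID-19 IFR-S (1.3 percent)

flu_ifr_s = flu_deaths / symptomatic_cases
projected_covid_deaths = symptomatic_cases * covid_ifr_s

print(round(flu_ifr_s * 100, 2))        # about 0.1 percent
print(round(projected_covid_deaths))    # 461500, which the text rounds to nearly 500,000
print(round(covid_ifr_s / flu_ifr_s, 1))  # ratio of the two rates
```

The projection holds the number of symptomatic infections fixed at the influenza figure, so it says nothing about how many infections COVID-19 would actually cause.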
My estimated IFR-S applies under the assumption that the current supply (up until
April 20) of health care services, including hospital beds, ventilators, and access to
providers, would continue in the future. Constraints in the supply of health care services
could surely increase the symptomatic infection fatality rate and the overall fatality rate. I
hope that simulations to understand and forecast the effect of such shortages can be
improved, using my estimates of IFR-S as the baseline.27
Similarly, my estimates of the COVID-19 IFR-S in the US can help disease and policy
modelers obtain more accurate predictions for the epidemiology of the disease and the
impact of alternative policy levers to contain this pandemic.
ACKNOWLEDGMENTS
Anirban Basu received compensation from Salutis Consulting LLC. No funding was
received for this analysis. The author thanks Varun Gandhay for excellent research
assistance for this work. He also thanks six anonymous reviewers and Donald Metz of
Health Affairs for their excellent comments. The views expressed do not represent those
of the University of Washington or the National Bureau of Economic Research. An
unedited version of this article was published online May 7, 2020, as a Fast Track Ahead
Of Print article. That version is available in the online appendix.
NOTES
1
Ferguson NM, Laydon D, Nedjati-Gilani G, Imai N, Ainslie K, Baguelin M, et al. Report 9: Impact of
non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand [Internet]. London:
Imperial College London; 2020 Mar 16 [cited 2020 May 4]. Available from:
https://www.imperial.ac.uk/media/imperial-college/medicine/sph/ide/gida-fellowships/Imperial-College-COVID19-NPI-modelling-16-03-2020.pdf
2
IHME COVID-19 Health Services Utilization Forecasting Team, Murray CJL. Forecasting COVID-19 impact on
hospital bed-days, ICU-days, ventilator-days, and deaths by US state in the next 4 months. MedRxiv [serial on
the Internet]. 2020 Mar 30 [cited 2020 May 4]. Available from:
https://www.medrxiv.org/content/10.1101/2020.03.27.20043752v1
3
Li R, Pei S, Chen B, Song Y, Zhang T, Yang W, et al. Substantial undocumented infection facilitates the rapid
dissemination of novel coronavirus (SARS-CoV-2). Science [serial on the Internet]. 2020 May 1 [cited 2020
May 28]. Available from: https://science.sciencemag.org/content/368/6490/489
4
Brown E, Tran AB, Reinhard B, Ulmanu M. U.S. deaths soared in early weeks of pandemic, far exceeding
number attributed to covid-19. Washington Post [serial on the Internet]. 2020 Apr 27 [cited 2020 May 4].
Available from: https://www.washingtonpost.com/investigations/2020/04/27/covid-19-death-toll-undercounted/
5
There are three specific fatality rates to consider. First is the “true” case fatality rate, or the infection fatality
rate among symptomatic patients (IFR-S): the proportion of patients who die after falling sick from the
infection. Second is the reported case fatality rate (CFR): the proportion of patients who die after being
detected to fall sick from the infection. Third is the overall infection fatality rate (IFR): the proportion of
patients who die among all those who are infected even though some of them may never show symptoms.
6
To access the appendix, click on the Details tab of the article online.
7
GitHub. Johns Hopkins University Covid-19 Data Repository: CSSEGISandData/COVID-19. GitHub [serial on
the Internet]. c 2020 [cited 2020 May 4]. Available from:
https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports
8
GitHub. New York Times Covid-19 Data Repository. GitHub [serial on the Internet]. c 2020 [cited 2020 May 4].
Available from: https://github.com/nytimes/covid-19-data
9
County of Santa Clara, Emergency Operations Center [Internet]. Santa Clara County (CA): Santa Clara County
Public Health; 2020. Press release, County of Santa Clara identifies three additional early COVID-19 deaths;
2020 Apr 21 [cited 2020 May 4]. Available from:
https://www.sccgov.org/sites/covid19/Pages/press-release-04-21-20-early.aspx
10
Gelman A, Jakulin A, Pittau MG, Su Y-S. A weakly informative default prior distribution for logistic and other
regression models. Ann Appl Stat. 2008;2(4):1360–83.
11
Moriarty LF, Plucinski MM, Marston BJ, Kurbatova EV, Knust B, Murray EL, et al. Public health responses to
COVID-19 outbreaks on cruise ships—worldwide, February–March 2020. MMWR Morb Mortal Wkly Rep.
2020;69(12):347–52.
12
Wilson N, Kvalsvig A, Barnard LT, Baker MG. Case-fatality risk estimates for COVID-19 calculated by using a
lag time for fatality. Emerg Infect Dis. 2020;26(6):1339–441.
13
Yang X, Yu Y, Xu J, Shu H, Xia J, Liu H, et al. Clinical course and outcomes of critically ill patients with
SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study. Lancet Respir Med.
2020;8(5):475–81.
14
Richardson S, Hirsch JS, Narasimhan M, Crawford JM, McGinn T, Davidson KW, et al. Presenting
characteristics, comorbidities, and outcomes among 5700 patients hospitalized with COVID-19 in the New
York City area. JAMA [serial on the Internet]. 2020 Apr 22 [cited 2020 May 4]. Available from:
https://jamanetwork.com/journals/jama/fullarticle/2765184
15
Myers LC, Parodi SM, Escobar GJ, Liu VX. Characteristics of hospitalized adults with COVID-19 in an
integrated health care system in California. JAMA [serial on the Internet]. 2020 Apr 24 [cited 2020 May 4].
Available from: https://jamanetwork.com/journals/jama/fullarticle/2765303
16
CDC COVID-19 Response Team. Outcomes among patients with Coronavirus disease 2019 (COVID-19)—
United States, February 12–March 16, 2020. MMWR Morb Mortal Wkly Rep. 2020;69(12):343–6.
17
Gelman A, Meng X-L, Stern H. Posterior predictive assessment of model fitness via realized discrepancies.
Statistica Sinica. 1996;6:733–807.
18
Rubin DB. Bayesianly justifiable and relevant frequency calculations for the applied statistician. Ann Stat.
1984;12:1151–72.
19
Gelman A. Two simple examples for understanding posterior p-values whose distributions are far from
uniform. Electron J Stat. 2013;7:2595–602.
20
Centers for Disease Control and Prevention, National Center for Immunization and Respiratory Diseases.
Estimated influenza illnesses, medical visits, hospitalizations, and deaths in the United States—2018–2019
influenza season [Internet]. Atlanta (GA): CDC; 2020 Jan 8 [cited 2020 May 4]. Available from:
https://www.cdc.gov/flu/about/burden/2018-2019.html
21
Leung NHL, Xu C, Ip DKM, Cowling BJ. Review article: The fraction of influenza virus infections that are
asymptomatic: a systematic review and meta-analysis. Epidemiology. 2015;26(6):862–72.
22
Reed C, Angulo FJ, Swerdlow DL, Lipsitch M, Meltzer MI, Jernigan DB, et al. Estimates of the prevalence of
pandemic (H1N1) 2009, United States, April–July 2009. Emerg Infect Dis. 2009;15(12):2004–7.
23
Baud D, Qi X, Nielsen-Saines K, Musso D, Pomar L, Favre G. Real estimates of mortality following COVID-19
infection. Lancet Infectious Diseases [serial on the Internet]. 2020 Mar 12 [cited 2020 May 28]. Available
from: https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30195-X/fulltext
24
World Health Organization. Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19)
[Internet]. Geneva: WHO; 2020 Feb 16–24 [cited 2020 May 4]. Available from:
https://www.who.int/docs/default-source/coronaviruse/who-china-joint-mission-on-covid-19-final-report.pdf
25
Verity R, Okell LC, Dorigatti I, Winskill P, Whittaker C, Imai N, et al. Estimates of the severity of coronavirus
disease 2019: a model-based analysis. Lancet Infectious Diseases [serial on the Internet]. 2020 Mar 30 [cited
2020 May 4]. Available from:
https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30243-7/fulltext
26
Resnick B. The White House projects 100,000 to 200,000 Covid-19 deaths. Vox [serial on the Internet]. 2020
Mar 31 [cited 2020 May 4]. Available from:
https://www.vox.com/science-and-health/2020/3/31/21202188/us-deaths-coronavirus-trump-white-house-presser-modeling-100000
27
University of Washington, School of Pharmacy, Comparative Health Outcomes, Policy, and Economics
Institute. The CHOICE Institute COVID-19 platform [Internet]. Seattle (WA): CHOICE Institute; [cited 2020 Jun
15]. Available from: https://uwchoice.shinyapps.io/covid/
25 Comments
Ken Bomant • 5 months ago

Is there no international standard nomenclature for epidemiology? It's crazy! This is what
cultivates the conspiracy theories which have gone off the chart!! Most sites call this the "Case
Fatality Rate", which should at least be "Symptomatic Case Fatality Rate" [SCFR]. The "Infection
Fatality Rate" includes asymptomatic infections. There needs to be clear, UNAMBIGUOUS,
DESCRIPTIVE terms generated & agreed upon for the statistics! ...and there must be no
requirement for the public to read small little subtitles explaining the stat. If the name needs to be
long, then so be it; an acronym could be generated.
ANB2017 > Ken Bomant • 4 months ago

That's what struck me, too. IFR is based on random serology testing which looks for
Covid-19 antibodies. It doesn't include people currently infected and asymptomatic who
haven't had time to build up detectable antibodies.
Lucky > Ken Bomant • 4 months ago

As if nomenclature is the biggest issue with this useless study.
Michael O'Hare • 5 months ago

The problem I see with this is using Sea Princess cases to estimate asymptomatic cases.
Although this figure may be true for the genuinely asymptomatic, i.e. absolutely no symptoms, most
may well have mild symptoms (slight headache, slightly increased temperature) which are too mild
to trouble them and soon pass. As they develop no further, the infected person simply dismisses
them. On the Sea Princess everyone was asked whether they had had any sort of symptom, however mild.
The result was that asymptomatic cases on board ship were almost certainly reduced compared
to a real-world situation. A number of papers using antibody tests have come up with much
increased percentages.
Ken Bomant > Michael O'Hare • 5 months ago

Well said!
gerry fitzgerald • 5 months ago

Hello, great article, but I found a minor error, or actually a substantial one. You mentioned "40,835
confirmed cases and 1,620 confirmed deaths through April 20, 2020" and an IFR of 1.3
percent. The actual rate is closer to 4%, which would be in line with the original estimates of the
CDC and NIH, who said "we estimate that 75 % of Americans will get the virus and out of that
approximately 4 percent will die", which means that just about 5,000,000 Americans will die from
the disease. If we use the Government's numbers this is a very alarming event; let's hope the
Government actuaries were wrong.
Slate > gerry fitzgerald • 5 months ago

This "research" is really nothing more than clickbait. The 1.3% rate mentioned isn't IFR but
IFR-S(ymptomatic). Much different number as statistics are showing much higher infection
rates of Covid-19 in places like NY state and NYC.
Here is a link to a German study which shows the IFR rate in a German Covid "hot spot" to
be 0.37%. https://medicalxpress.com/n...
Tom W > gerry fitzgerald • 5 months ago

I don't believe that your first comment about 4% CFR points to any kind of error in the
analysis. Those first counts of deaths and cases form the baseline figure from which to
apply the other estimates, which is the decay over time in the error between actual counts
and confirmed counts. They start out accepting that there is a large error, and model the
decay in that error (in both cases and deaths) to arrive at the presumed counts (at
theoretical time of infinity) when the error reaches zero. In April, the measured IFR-S was
closer to 4%, projecting forward with the errors removed, it would be expected to approach
1.3%. This becomes a useful statistic for helping to make decisions about moving forward.
Ken Bomant > Tom W • 5 months ago

I wonder how they decided that the number of cases should be
40,835÷1.3%×3.97%= 124 703 ? How do the researchers know? :)
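
The arithmetic in the question above can be checked directly. A minimal sketch, using only the figures quoted in this thread (40,835 confirmed cases, 1,620 confirmed deaths, an IFR-S of 1.3 percent); the variable names are illustrative, and the calculation assumes, purely for the sake of the arithmetic, that the confirmed death count is complete:

```python
# Figures quoted in the comment thread (data through April 20, 2020).
confirmed_cases = 40_835
confirmed_deaths = 1_620
ifr_s = 0.013  # estimated fatality rate among symptomatic cases

# Reported case fatality rate: confirmed deaths over confirmed cases.
reported_cfr = confirmed_deaths / confirmed_cases

# Assumption for illustration: if deaths were fully counted, this is the
# number of symptomatic cases that would be consistent with the IFR-S.
implied_cases = confirmed_deaths / ifr_s

# How many symptomatic cases each confirmed case would stand for.
undercount_factor = implied_cases / confirmed_cases

print(f"reported CFR:        {reported_cfr:.2%}")
print(f"implied case count:  {implied_cases:,.0f}")
print(f"undercount factor:   {undercount_factor:.2f}")
```

The commenter's 124,703 comes from the same calculation with the reported rate rounded to 3.97 percent, so the two figures differ only by rounding.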
Ken Bomant > Tom W • 5 months ago

Ok, that explains it!! Thanks. My simple way was to offset recent cases (i.e. about
7 days of new cases - people who just got it and thus would not die in that period )
with symptomatics that have fallen through the 'cracks' in the testing system.
Steven G. Cosby • 5 months ago

To the editor: Is this to be a peer reviewed article? Opening with the three assumptions with one
being an article from the Washington Post? Stating something as bold as the death rate of the first
few weeks of Covid19 outbreak is actually understated. But not mentioning comorbidity and the
complexity associated with making predictions and assumptions about Covid19 and its economic
effect on morbidity. Yes I said "economic". (Use implications of QALYS if you want.) You cannot
make the same IFR of Covid19 about the death of a 90 year old that you would about a 10 year
old. Author just seems to be sensationalizing.
Lucky • 4 months ago

This appears to be a politically driven study. Maybe for the fear-mongering corporate media and
powerful interests to help justify an over-the-top authoritarian state. While simultaneously
facilitating the pretext for a global reset to many aspects of society WITHOUT any democratic
discourse. Not to mention a blatant social engineering psyop to permit, even invite, both.
Without even touching on policies and technical aspects of the testing (which fundamentally
distorts the raw data) here are a few highlights from the study which validate that, at this stage in
the game, this study is useless beyond being a political tool:
"The overall IFR for COVID-19 should be lower when we account for cases that remain and
recover without symptoms"
As if data isn't available from literally all over the world that offers a reasonable idea. No room in
the model for that, nope.
"Substantial variation across patient demographics likely exists and should be the focus of future
studies."
Like if you're not 80 (or generally past the age of average life expectancy) and suffering from some
see more
Tommy Shelby > Lucky • 2 months ago

I beg to differ. Reviewing the article convinces me that the author is a serious student of
the issues involved making a carefully thought out development to gain some sense of the
incident mortality. While I am likely to suspect that his results understate the eventual
incidence (derived from data I have accessed that is readily available) I welcome his
contribution to searching for answers. And this is not a final answer. The author is a
searcher for a statistic; statistics are inherently variable. The better challenge is to produce
a more defensible determination of the statistic.
Lucky > Tommy Shelby • 2 months ago

Beg all you want, that won't improve the study. Which, at this point, is demonstrably
pathetic on all fronts referenced above and more.
Jouni Valkonen • 4 months ago • edited

The author does not define what they mean by symptoms. In order to get sick, one has to get
the virus into the lung alveoli, where there are type 2 pneumocytes. If it stays in the upper respiratory
tract, it does not cause symptoms or may cause only slight symptoms such as mild fever, throat
pain or loss of smell.
The gold standard of diagnosis is X-ray imaging of the lung. X-ray images reveal whether one has
infection in the lung, where it must be in order to create the dangerous form of the disease. If there are no
characteristic marks in the lung images, it probably means that one had too small an exposure to
the virus.
As SARS-CoV-2 does not treat all age groups equally, it would make more sense to define IFR-S for
each age group.
Gary Underwood • 5 months ago • edited

Looking at the current data on confirmed cases and antibody positives in Iowa there are 26
infections per positive as of yesterday. Besides the asymptomatics and people who developed
antibodies that may have had symptoms that are not recorded the uncertainty is so high that
applying so much weight to reported positives drives mortality high. Iowa believes 13% have been
infected as of May 20. If you use the numerator from this method a second rate of all infections
irrespective of symptoms can be derived. Of course this applies just to Iowa. They do publish all
county data.
Wes Pegden • 5 months ago

A comment on assumption (2), which you state as "both the errors are declining over time", and
later as "the errors in both the numerator and the denominator are declining."
It seems, in fact, that this analysis requires not just that the errors are declining over time, but that
they are asymptotically approaching 0.
Since it is easy to imagine a world where the errors (particularly in the denominator) are declining
over time but approaching a horizontal asymptote other than 0, and this is a likely source of
significant error for this kind of analysis, it would be helpful to state this assumption more clearly
at the outset.
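
The distinction raised in this comment can be illustrated numerically. The following is a minimal sketch, not the article's model: it assumes a hypothetical exponential decay in the fraction of cases and deaths that go unreported, and every parameter other than the 1.3 percent rate (the starting undercounts, the decay rate, the 30 percent floor, and the function names) is invented for illustration.

```python
import math

def undercount(t, start, floor, rate):
    """Fraction missed at time t, decaying exponentially from start toward floor."""
    return floor + (start - floor) * math.exp(-rate * t)

def reported_cfr(t, true_rate, case_floor, death_floor=0.0):
    """Reported deaths / reported cases when both counts are incomplete.

    Hypothetical setup: 70% of cases and 10% of deaths are missed at t = 0.
    """
    missed_cases = undercount(t, start=0.70, floor=case_floor, rate=0.05)
    missed_deaths = undercount(t, start=0.10, floor=death_floor, rate=0.05)
    return true_rate * (1 - missed_deaths) / (1 - missed_cases)

TRUE_RATE = 0.013  # the article's IFR-S estimate, taken here as the "truth"

# Compare a case undercount that vanishes with one that levels off at 30%.
for floor in (0.0, 0.30):
    start_value = reported_cfr(0, TRUE_RATE, case_floor=floor)
    long_run = reported_cfr(1_000, TRUE_RATE, case_floor=floor)
    print(f"case-undercount floor {floor:.0%}: "
          f"starts at {start_value:.2%}, settles at {long_run:.2%}")
```

Under these made-up numbers, the reported rate falls over time in both scenarios, but it settles at the true rate only when the case undercount decays all the way to zero. With a nonzero floor the limit stays above the true rate, which is the source of error the comment describes.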
Urvi Shah • 5 months ago

Hello,
This is a very insightful article shedding light on the nuances behind the fatality rate. Have you
considered trending the mortality rate? I wonder if/how any herd immunity may affect the mortality
rate.
Montsalvat • 5 months ago

This looks very woolly stuff indeed, based on nebulous assumptions. In UK they have only
recently started reporting Covid-19 on death certificates which in itself is unreliable because you
can have Covid-19 but die of something else. You would need to determine deaths - Covid-19
only (no underlying issues), Covid-19 with comorbidities and compare against excess deaths say
between 2015-20. I saw a lot of certificates last November with CAP and AKI for example, my
guess is many of these may have been Covid-19
Tom W • 5 months ago • edited

Interesting article. I appreciate the work that went into this. I hope you can look at the study done
recently by Rinaldi and Paradisi (https://www.medrxiv.org/con...), also not peer reviewed, but also a
very interesting use of modeling with the data they were able to acquire. They were able to
determine reasonable estimates of total infection rate in northern Italy using wide-scale serology
tests, and also used a 5-year history of all deaths to estimate "total COVID deaths". The extra
spike in 2020 deaths was remarkable.
Results were a 1.29% overall infection fatality rate, with substantially higher infection rates, and
substantially higher COVID death counts than what the confirmed case numbers would suggest.
Again, good job on your study and your article. It adds to the knowledge of our current situation.
Thank you.
Anirban Basu > Tom W • 5 months ago

Thanks. Will certainly take a look. There is a lot that is made out of the prevalence of
asymptomatic cases, and most measurements of that rate are cross-sectional, which
means many of those may actually go on to develop symptoms in the future. The truly
asymptomatic cases, who recover from the infection asymptomatically, do not contribute to
the number of covid deaths (they do contribute to the spread of infections and towards
achieving herd immunity). So for fatality, I think the rate among symptomatic cases, or
those who fall sick, makes more clinical sense. Also, it's directly comparable to the fatality
rate in flu, which is also measured only among those who fall sick with the virus.
Kathleen Fairman > Anirban Basu • 5 months ago

There are several flaws in the logic of your response here. First, you state that both
the influenza denominator and the Covid-19 denominator represent "those who fall
sick." However, people with influenza are encouraged in direct-to-consumer
advertising to seek care early, before they become very sick, in order to get
antivirals that will shorten the illness. In contrast, CDC guidance advises seeking
care for Covid-19 only if the patient develops trouble breathing, chest pain, is
confused, cannot stay awake, or is turning blue. You cannot compare the
denominators of these two patient populations as if they were clinically equivalent.
Second, it's not clear why you chose the Diamond Princess to represent the
proportion of people who are asymptomatic, as multiple studies based on antibody
testing have done this. See the Santa Clara study, the LA County study, and the
Gangelt study. All the antibody studies, using various methods, seem to be
triangulating around an infection fatality rate of about 0.1% to 0.4%. The only
exception of which I am aware is New York, where applying the results of antibody
testing would suggest an IFR of about 0.6%--likely because of the "perfect storm"
of comorbidities, poverty, and questionable policy decisions. Third, please do not
refer to a case fatality rate, especially when the denominator represents the very
most severe cases, as if it were an infection fatality rate. The public is confused
enough without adding to the confusion by suggesting that these are equivalent.
Anirban Basu > Kathleen Fairman • 5 months ago

I think you are confusing those who fall sick with those who
seek care. These are fundamentally different populations and numbers. The CDC
estimate of the fatality rate of flu is among those who fall sick, not those who
seek care. It is a modeled estimate. If you go with the reported numbers, the
reported flu case fatality rate would be six times lower than estimated (Faust
and Rio JAMA 2020). CDC takes into account false positives and negatives
and estimates the higher fatality rate. Similarly, I model the reported and
much higher case fatality rate to estimate a lower infection fatality rate
among those who fall sick, irrespective of whether they seek care or not. This
makes these rates an "apples-to-apples" comparison.
I relied on Diamond Princess data as it is the only data that is longitudinal in
the US to estimate the asymptomatic rate. Applying this rate and the estimated
IFR-S does give a total prevalence of the infection in the community close
to what a well-designed seroprevalence study finds (Sood et al. JAMA 2020).
At the end of the day it does not matter if 90% of the population has the
infection; what matters is that even if 10% fall sick with it, we are looking at
a large number of deaths. To put this in perspective, the estimated number of
flu deaths for the whole of last year was 35K, and we have nearly 100K deaths
in the first 2.5 months of the pandemic. And BTW, only 4% of the population
is infected.
I wish all good health going forward.

Kathleen Fairman > Anirban Basu • 5 months ago

Thank you for referring me to the JAMA article, which I had not seen. I was
already aware that CDC estimates influenza counts (morbidity,
hospitalizations, and mortality) based on samples and models, with
associated confidence intervals. Please note that according to the CDC
page cited in the JAMA article, the influenza surveillance system samples
data from multiple sources. These include “information on outpatient visits
to health care providers for influenza-like illness … collected through the
U.S. Outpatient Influenza-like Illness Surveillance Network,” laboratory data
collected through the National Respiratory and Enteric Virus Surveillance
System, and Behavioral Risk Factor Surveillance Survey data. To be tested
in a laboratory for influenza or be seen in a physician office and deemed
“sick” with influenza, a patient must present for care
(https://www.cdc.gov/flu/wee...). The BRFSS asks respondents “whether
they did or did not seek medical care for an influenza-like illness in the prior
influenza season.” https://www.cdc.gov/flu/abo... Thus, it does matter that
people with influenza have typically been encouraged to seek care when
they have only mild symptoms, in contrast to people with Covid-19, who
were encouraged to self-isolate if possible (particularly early in the
see more
7500 Old Georgetown Road, Suite 600
