The first case of COVID-19 in India was reported on 30 January 2020 in the state of Kerala, in a student who had returned from Wuhan, China. Since then, the infection has spread throughout the country, with every state reporting at least one positive case. However, the reported cases may not give the full picture of the extent of the infection, as testing coverage has not been complete. Data from different states suggest that the number of tests conducted was in the range of 29 to 182 per 1000 residents (as of 10th October 2020). Although patients hospitalized with symptoms are typically tested, those who develop mild symptoms at home and those who do not develop symptoms are unlikely to be tested. The testing protocols used in different states have also changed significantly over the duration of the pandemic. The current situation nonetheless poses a pertinent question: what fraction of the true infections within a state has been recorded so far? A recent article by the authors in the British medical journal BMJ Open addresses this question using a delay-adjusted case fatality ratio. Our study estimates the fraction of cases reported for different values of the case fatality ratio, using data up to 10th October 2020.
The study assumes that the deaths due to COVID-19 reported in different states are accurate. Although cases may be significantly under-reported, deaths are typically reported correctly. This is because patients with severe symptoms typically present themselves at a hospital. As a result, any patient who dies from COVID-19 is likely to have been tested. However, recent news reports point to inaccurate death reporting in many states, which is not considered in the study.
A naive computation of the ratio of deaths to date to cases to date gives an inaccurate estimate of the observed case fatality ratio (CFR) of the outbreak in a region. This is because the deaths used in the numerator undercount the deaths that will eventually arise from the cases observed to date: some of the patients counted as cases have not yet reached an outcome. This issue can be addressed by using the distribution of the delay from hospitalization to death for cases that are fatal. With this correction, one can compute an adjusted CFR for each region being studied.
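The correction described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's actual code: the function names, the discretized delay distribution `delay_pmf`, and all numbers are hypothetical and chosen only to make the arithmetic easy to follow. The idea is to count each day's cases only in proportion to the share of fatal outcomes that would already be known by the last day of data.

```python
def naive_cfr(daily_cases, daily_deaths):
    """Deaths to date divided by cases to date (no delay correction)."""
    return sum(daily_deaths) / sum(daily_cases)


def adjusted_cfr(daily_cases, daily_deaths, delay_pmf):
    """Delay-adjusted CFR.

    delay_pmf[k] is the assumed probability that a fatal case dies
    k days after hospitalization (hypothetical, must sum to 1).
    """
    # Cumulative probability that a fatal outcome is known within k days.
    cdf = []
    running = 0.0
    for p in delay_pmf:
        running += p
        cdf.append(running)

    last_day = len(daily_cases) - 1
    known_outcome_cases = 0.0
    for day, cases in enumerate(daily_cases):
        elapsed = last_day - day  # days between this report and the last day
        share_known = cdf[elapsed] if elapsed < len(cdf) else 1.0
        known_outcome_cases += cases * share_known

    return sum(daily_deaths) / known_outcome_cases


# Illustrative numbers only, not real data.
cases = [100, 100, 100, 100]
deaths = [0, 1, 2, 2]
delay = [0.0, 0.5, 0.5]  # assumed: deaths occur 1 or 2 days after admission

print("naive CFR:   ", naive_cfr(cases, deaths))
print("adjusted CFR:", adjusted_cfr(cases, deaths, delay))
```

In this toy example the adjusted CFR comes out higher than the naive one, because the most recent cases contribute little or nothing to the denominator until enough time has passed for their outcomes to be known.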

In a region where the cases and deaths have been fully reported, the adjusted CFR tends to match the true CFR of COVID-19. For example, a value of 1.4% for the true CFR has been reported in a study from China. A different published study based on data from China puts the estimate at 0.66%. More recent reports from the United States based on seroprevalence studies provide much lower estimates, as low as 0.1%. However, in regions where cases have been under-reported, we expect the adjusted CFR to be significantly higher than the true CFR, because the missing cases shrink the denominator while the deaths are still counted. Hence, the ratio of the true CFR to the adjusted CFR gives an estimate of the fraction of cases that have been reported. The study estimates the fraction of cases reported in Kerala for these three true CFR values (1.4%, 0.66%, and 0.1%) (Table 1).
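The ratio described above is simple enough to show as a short Python sketch. The three true CFR values are the ones quoted in the text; the adjusted CFR of 2% is a hypothetical placeholder, not a result from the study, and capping the estimate at 1 is a design choice made here so that the output always reads as a fraction.

```python
def fraction_reported(true_cfr, adjusted_cfr):
    """Estimate the fraction of cases reported as true CFR / adjusted CFR."""
    if adjusted_cfr <= 0:
        raise ValueError("adjusted_cfr must be positive")
    # Cap at 1.0: a region cannot report more than all of its cases.
    return min(1.0, true_cfr / adjusted_cfr)


# True CFR values quoted in the text.
true_cfr_values = [0.014, 0.0066, 0.001]

# Hypothetical adjusted CFR for one region (illustrative only).
example_adjusted_cfr = 0.02

for true_cfr in true_cfr_values:
    estimate = fraction_reported(true_cfr, example_adjusted_cfr)
    print(f"true CFR {true_cfr:.2%}: about {estimate:.0%} of cases reported")
```

The estimate is very sensitive to the assumed true CFR: for the same adjusted CFR, moving from the highest to the lowest of the three values changes the estimated reporting fraction by more than a factor of ten.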
Table 1. Estimates of fraction of cases reported in different states (As of 10th October 2020)
A more accurate estimation of the underreporting of cases in Kerala may be obtained through serological testing for COVID-19 antibodies among the general public. Randomized antibody testing in a general population may be used to estimate the fraction of the people who have the COVID-19 antibody in their system, which in turn serves as an estimate of the total population who have been exposed to the virus.
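As a rough illustration of how a serological survey could be turned into an under-reporting estimate, the sketch below scales the antibody-positive fraction of a random sample up to the whole population and compares it with the confirmed case count. The function names and all numbers are hypothetical, and the sketch ignores test sensitivity, specificity, and sampling error, which a real analysis would need to account for.

```python
def estimated_infections(positive_samples, total_samples, population):
    """Scale the sample seroprevalence up to the whole population."""
    if total_samples <= 0:
        raise ValueError("total_samples must be positive")
    seroprevalence = positive_samples / total_samples
    return seroprevalence * population


def fraction_reported_from_survey(confirmed_cases, positive_samples,
                                  total_samples, population):
    """Confirmed cases divided by infections implied by the survey."""
    infections = estimated_infections(positive_samples, total_samples, population)
    if infections <= 0:
        raise ValueError("survey implies no infections; ratio is undefined")
    return confirmed_cases / infections


# Illustrative numbers only, not real survey data.
population = 1_000_000
positive, sampled = 50, 1_000   # 5% of a random sample has antibodies
confirmed = 10_000

print("estimated infections:", estimated_infections(positive, sampled, population))
print("fraction reported:   ",
      fraction_reported_from_survey(confirmed, positive, sampled, population))
```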
Article information: https://bmjopen.bmj.com/content/11/1/e042584.full
(The author is a data scientist and machine learning expert. He received his Ph.D. from the Georgia Institute of Technology, United States. His alma maters include the University of Pavia, Italy, the Indian Institute of Technology Madras, and Kerala University.)