Mass General Brigham investigators have created an artificial intelligence tool that scans electronic health records to find cases of long COVID, an often mysterious and enduring mix of symptoms, such as fatigue, chronic cough, and brain fog, that follows infection with the SARS-CoV-2 virus.
The results, published in the journal Med, could help identify more people who should be receiving care for this potentially debilitating condition. The number of cases the tool identified also suggests that the prevalence of long COVID is likely greatly under-recognized.
“This AI tool can take the fog out of a diagnostic process that is foggy, sharpening it so that doctors have a focused way to make sense of a difficult condition,” said senior author Hossein Estiri, Ph.D., head of AI research at CAIBILS and an associate professor of medicine at Harvard Medical School. “With this work, we may finally be able to see long COVID for what it is and, more importantly, how to treat it.”
Long COVID, also called Post-Acute Sequelae of SARS-CoV-2 infection (PASC), can include a wide range of symptoms. For their study, Estiri and colleagues defined it as a diagnosis of exclusion that is also infection-associated: a condition that is linked to a COVID-19 infection and cannot otherwise be explained by the patient's medical record. The researchers developed the algorithm using de-identified patient data drawn from the clinical records of over 300,000 patients across 14 hospitals and 20 community health centers in the Mass General Brigham system.
Rather than relying on a single diagnosis code, the AI uses a new method called “precision phenotyping,” developed by Estiri and colleagues, that sifts through each record to identify symptoms and conditions associated with COVID-19 and follows those symptoms over time to distinguish them from other illnesses.
For instance, the algorithm can help determine whether shortness of breath is due to a pre-existing condition such as asthma or heart failure rather than long COVID. The tool flags a patient as having long COVID only when all other possibilities have been exhausted. Physicians often face the challenge of navigating a tangled web of symptoms and medical histories, unsure of which threads to pull while balancing busy caseloads. “A tool powered by AI that does it methodically for them could be a game changer,” said Alaleh Azhir, MD, co-lead author and an internal medicine resident at Brigham and Women’s Hospital, a founding member of the Mass General Brigham healthcare system.
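The exclusion logic described above can be illustrated with a toy sketch. This is not the researchers' algorithm, which works on full longitudinal records; it is a minimal example that assumes a hypothetical `PatientRecord` structure and a hand-written, illustrative table of alternative explanations (`ALTERNATIVE_EXPLANATIONS`), neither of which comes from the study. A symptom counts toward a long COVID flag only if it first appears after infection and nothing else in the record could account for it.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List

# Hypothetical lookup of conditions that could explain a symptom on their own.
# Illustrative values only; not taken from the study.
ALTERNATIVE_EXPLANATIONS = {
    "shortness of breath": {"asthma", "heart failure", "COPD"},
    "fatigue": {"anemia", "hypothyroidism"},
}


@dataclass
class PatientRecord:
    """Hypothetical, heavily simplified patient record."""
    infection_date: date
    # condition name -> date first recorded
    conditions: Dict[str, date] = field(default_factory=dict)
    # symptom name -> date first recorded
    symptoms: Dict[str, date] = field(default_factory=dict)


def unexplained_post_covid_symptoms(record: PatientRecord) -> List[str]:
    """Return symptoms that began after infection and have no other explanation."""
    flagged = []
    for symptom, onset in record.symptoms.items():
        # Symptoms recorded on or before the infection date are not post-acute.
        if onset <= record.infection_date:
            continue
        explanations = ALTERNATIVE_EXPLANATIONS.get(symptom, set())
        # Exclusion step: any recorded condition that could cause the symptom
        # rules it out as a long COVID indicator in this toy model.
        if any(condition in explanations for condition in record.conditions):
            continue
        flagged.append(symptom)
    return sorted(flagged)


def flag_possible_long_covid(record: PatientRecord) -> bool:
    """Flag the patient only if at least one symptom survives exclusion."""
    return bool(unexplained_post_covid_symptoms(record))
```

In this sketch, a patient with long-standing asthma who reports shortness of breath after infection is not flagged, while a patient with the same symptom and no competing condition is. The same rule also shows how a simple exclusion step can wrongly drop patients whose pre-existing condition was made worse by COVID-19.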
This new method, the researchers add, may also help reduce biases inherent in existing diagnostics for long COVID, since patients diagnosed with the official ICD-10 diagnostic code for long COVID tend to be those with easier access to health care.
Their tool turned out to be up to 3 percent more accurate than ICD-10 codes while also being less biased. For example, the study showed that the people the tool identified as having long COVID mirrored the broader demographic makeup of Massachusetts, unlike long COVID algorithms that rely on a single diagnostic code or a single clinical encounter, which skew results toward certain populations such as people with better access to care.
Limitations of the study and the AI tool include the fact that the health record data the algorithm uses to account for long COVID symptoms may be less detailed than what physicians capture in post-visit clinical notes.
The algorithm also did not capture the possible worsening of a prior condition, which could itself be a symptom of long COVID. For example, if a patient had COPD with episodes that worsened before they developed COVID-19, the algorithm might have excluded them even though their persisting symptoms could still be an indicator of long COVID.
The study examined 85,364 COVID-19 cases, along with 170,497 post-pandemic controls and 39,817 pre-pandemic controls. The mean age of the COVID-19 patients was 53.6 years, and 62.6% of participants were female. Future studies could examine the algorithm's performance in cohorts of patients with specific conditions, such as COPD or diabetes.
The researchers will also make the algorithm openly available so that physicians and health care systems everywhere can use it in their own patient populations. The work also opens the door to better clinical care and may help launch future research into the genetic and biochemical factors behind long COVID’s many subtypes. “Questions about the true burden of long COVID, questions that have so far been out of reach, now seem more answerable,” Estiri said.
Reference: Azhir A, Hügel J, Tian J, et al. Precision phenotyping for curating research cohorts of patients with unexplained post-acute sequelae of COVID-19. Med. Published online November 2024.
doi: https://doi.org/10.1016/j.medj.2024.10.009


