Meta-Analysis of Deep Learning in Medical Imaging Highlights Need for Standardized Reporting and Guidelines

Artificial intelligence (AI) and its subfield, deep learning, open new avenues for descriptive, predictive, and prescriptive analysis, making possible insights that would otherwise be unattainable through manual analysis. Deep learning algorithms, such as convolutional neural networks (CNNs), differ from typical machine learning in that, during training, they learn sophisticated representations directly from raw data to improve pattern recognition, rather than relying on prior human engineering and domain expertise to structure the data and design feature extractors.
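
The contrast with hand-engineered feature extractors can be illustrated by the convolution operation at the heart of a CNN: instead of a human choosing the filter (for example, an edge detector), the network learns the filter weights from the raw pixels. As a minimal, illustrative sketch (not from the reviewed studies), the operation itself looks like this:

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation — the core operation a CNN layer applies.

    In a CNN, the kernel weights are learned from raw data during training,
    rather than being hand-designed by a domain expert.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1          # output height ("valid" padding)
    ow = len(image[0]) - kw + 1       # output width
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw)
            )
    return out
```

With a hand-crafted kernel such as `[[1, -1]]` this behaves as a vertical-edge detector; in a trained CNN, many such kernels are optimized automatically against the training labels.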

Hence, the purpose of this meta-analysis, published in NPJ Digital Medicine, is to quantify the diagnostic accuracy of deep learning (DL) across specialty-specific radiologic imaging modalities for identifying or classifying disease, and to assess differences in methodology and reporting in DL-based radiological diagnosis, i.e., the most common pitfalls pervasive throughout the field.

The authors aimed to assess the diagnostic accuracy of DL algorithms for detecting pathology in medical imaging. They searched Medline and Embase for studies published up to January 2020, identifying 11,921 records, of which 503 were included in the systematic review. The meta-analysis comprised 82 studies in ophthalmology, 82 in breast disease, and 115 in respiratory disease; a further 224 studies from other specialties were reviewed qualitatively. Only peer-reviewed studies reporting the diagnostic accuracy of DL algorithms for identifying pathology on medical imaging were included. The primary outcome was diagnostic accuracy; the secondary outcomes were study design and quality of reporting. Estimates were pooled using random-effects meta-analysis.
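
Random-effects pooling combines study-level estimates while allowing the true effect to vary between studies. As an illustrative sketch (the paper does not publish its code; the DerSimonian-Laird estimator used here is the most common implementation of this model), the pooling step can be written as:

```python
import math

def dersimonian_laird(estimates, variances):
    """Pool study-level estimates with a DerSimonian-Laird random-effects model.

    estimates: per-study effect estimates (e.g., logit-transformed accuracies)
    variances: per-study within-study variances
    Returns (pooled_estimate, pooled_standard_error, tau_squared).
    """
    k = len(estimates)
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    # Cochran's Q statistic measures between-study heterogeneity
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                     # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]         # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

A nonzero tau-squared widens the confidence interval of the pooled estimate, which is exactly how the substantial between-study heterogeneity reported in this review propagates into uncertainty about the summary accuracy.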

In ophthalmology, the area under the curve (AUC) ranged from 0.933 to 1 for diagnosing diabetic retinopathy, age-related macular degeneration, and glaucoma using retinal fundus photographs and optical coherence tomography. In respiratory imaging, AUCs ranged from 0.864 to 0.937 for diagnosing lung nodules or lung cancer on chest X-rays or computed tomography (CT) scans. For breast imaging, AUCs ranged from 0.868 to 0.909 for diagnosing breast cancer using mammograms, ultrasound, magnetic resonance imaging (MRI), and digital breast tomosynthesis. There was significant heterogeneity across studies, with considerable variation in methodology, terminology, and outcome measures; this variation may lead to an overestimation of the diagnostic accuracy of DL algorithms in medical imaging. There is therefore an urgent need for AI-specific extensions of the Enhancing the QUAlity and Transparency Of health Research (EQUATOR) guidelines, particularly the Standards for Reporting of Diagnostic Accuracy Studies (STARD), to address key issues in this field.
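
The AUC figures above summarize how well an algorithm ranks diseased cases above non-diseased ones: an AUC of 1 means perfect separation, 0.5 means chance. As a rough illustration (not the paper's code), AUC can be computed from raw scores and labels via its rank-based, Mann-Whitney formulation:

```python
def auc_from_scores(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for diseased, 0 for non-diseased; scores: model outputs.
    AUC equals the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count half
    return wins / (len(pos) * len(neg))
```

Because AUC is threshold-free, studies can report it without fixing an operating point, which is one reason it is so prevalent in this literature despite the variation in other outcome measures.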

First, the authors are of the opinion that numerous studies suffer from methodological deficiencies or poor reporting and are therefore not reliable sources for estimating diagnostic accuracy. Hence, the pooled estimates of diagnostic performance in the meta-analysis are highly uncertain and likely overestimate true accuracy.

Second, the authors did not conduct a quality assessment of the transparency of reporting in this review, because the current guideline, STARD 2015, was not designed for DL studies and is not fully applicable to the specifics and nuances of DL research.

Third, given how data are typically reported in DL studies, it was not possible to perform classical statistical comparisons of diagnostic accuracy across diseases or imaging modalities. Furthermore, since the study was conducted as an overview of the literature for each specialty, splitting the imaging modalities into subsets for inter-subset comparisons, which would allow heterogeneity and variance to be partitioned, was beyond the scope of this review.

For the quality of DL research to flourish in the future, the authors believe that implementing the following recommendations is a necessary starting point:

  1. Availability of large, open-source, diverse anonymized datasets with annotations.
  2. Collaboration with academic centers to utilize their expertise in pragmatic trial design and methodology.
  3. Creation of AI-specific reporting standards.

References:

1) Hill BG, Koback FL, Schilling PL. The risk of shortcutting in deep learning algorithms for medical imaging research. Sci Rep. 2024;14:29224. doi:10.1038/s41598-024-79838-6

2) Aggarwal R, Sounderajah V, Martin G, et al. Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis. NPJ Digit Med. 2021;4(1):65. doi:10.1038/s41746-021-00438-z
