Comparative Analysis of GPT‑4o and Emergency Medicine Residents in Toxicologic Emergency Management

Poisoning remains a major cause of morbidity and mortality among young people. Effective patient management is often hampered by physicians' limited knowledge of which therapeutic interventions to select in toxicology cases. Drug databases and Poison Control Centers (PCCs) both support physicians in making informed clinical decisions. However, only 10.9% of emergency physicians at tertiary hospitals and 12.3% at primary hospitals contacted PCCs for poisoning-related advice. In such cases, decision support systems based on artificial intelligence (AI) may offer an effective solution.

The study assessed the usefulness of Generative Pre-trained Transformer 4 Omni (GPT-4o) in assisting emergency medicine residents with the management and treatment of toxicological cases. Conducted in September 2024, the study compared the performance of emergency medicine residents with that of GPT-4o in handling toxicology-related emergency department (ED) scenarios. The study protocol and consent procedures were approved by the Institutional Review Board.

A total of 30 emergency medicine residents, comprising 14 senior residents (SRs) and 16 junior residents (JRs), participated in the study along with the GPT-4o model. The sample size was determined by logistical constraints and by the number of qualified emergency medicine residents in the department at the time of the study. The residents were divided into two groups by training level to evaluate the effect of clinical experience on patient management. A medical toxicologist with extensive clinical and academic knowledge determined the gold standard (GS) responses. This specialist also monitored patients and contributed to the clinical management of complex toxicology cases in the dedicated toxicology inpatient unit.

The researchers used the commercial version of GPT-4o, accessed through the ChatGPT application, as a clinical decision support tool. The model's responses were based on training data up to October 2023. Each scenario included complete case information, and all patient identifiers were removed to protect privacy. The clinical cases were randomly assigned to residents using a custom Python-based randomization process.
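The article does not describe how the randomization script worked. As a rough illustration only, random assignment of cases to residents could be sketched as follows. The function name `assign_cases`, the number of cases, the number of cases per resident, and the seed are all assumptions, not details from the study.

```python
import random


def assign_cases(resident_ids, case_ids, cases_per_resident, seed=None):
    """Give each resident a random subset of cases.

    Hypothetical scheme: sampling is without replacement within a resident,
    so nobody receives the same case twice. A fixed seed makes the
    assignment reproducible for auditing.
    """
    if cases_per_resident > len(case_ids):
        raise ValueError("cases_per_resident exceeds the number of cases")
    rng = random.Random(seed)
    return {
        resident: rng.sample(case_ids, cases_per_resident)
        for resident in resident_ids
    }


# 30 residents as reported in the article; the case pool below is invented.
residents = [f"R{i:02d}" for i in range(1, 31)]
cases = [f"case_{i:02d}" for i in range(1, 11)]

assignment = assign_cases(residents, cases, cases_per_resident=3, seed=2024)
for resident in residents[:3]:
    print(resident, assignment[resident])
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the assignment independent of any other random calls in the same program.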

When GPT-4o gave both a brief response and an extended comment, only the first direct answer was considered for analysis. Cohen’s kappa coefficient was used to evaluate agreement among the responses of the residents, the GS, and GPT-4o. When the GS recommended antidotal therapy, GPT-4o and SRs showed favourable and comparable agreement with the GS (GPT-4o: p<0.001, κ=0.704; SRs: p<0.001, κ=0.710), while JRs showed a lower level of agreement (p<0.001, κ=0.451). In recommending appropriate elimination methods, GPT-4o outperformed residents: it showed higher agreement with the GS (p<0.001, κ=0.632) than either SRs (p<0.001, κ=0.551) or JRs (p<0.001, κ=0.293).
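For readers unfamiliar with the statistic, Cohen's kappa corrects the raw agreement between two raters for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the chance agreement. The sketch below computes it for two sets of binary decisions. The data are invented for illustration and are not the study's responses, and the function name `cohen_kappa` is our own.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items on which the raters match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1 - expected)


# Invented example: 1 = antidote recommended, 0 = not recommended.
gold_standard = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1]
model_answers = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1]

kappa = cohen_kappa(gold_standard, model_answers)
print(f"kappa = {kappa:.3f}")  # about 0.8 for this toy data
```

Here the raters agree on 9 of 10 items (p_o = 0.9) and chance agreement is 0.5, so κ is about 0.8. The p-values reported in the article come from a separate significance test that this sketch does not reproduce.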

The lower agreement among JRs may be attributed to limited hands-on toxicology experience and insufficient opportunities to apply theoretical knowledge in clinical settings. Even JRs in early specialization training demonstrated a better understanding of toxicological problems with no specialized training. Additionally, prior studies showed that Gemini AI and GPT-3.5 models performed similarly to emergency medicine residents in answering 20 multiple-choice toxicology questions. Although AI-based systems show promise as clinical support tools, they should be used to complement, not replace, human decision-making because of the risk of inaccurate or misleading recommendations.

Reference: Ozer V, Bulbul O, Pasli S, Karakullukcu S, Kazzi Z, Turedi S. Performance of GPT-4o in the management of toxicological exposures: a comparative analysis with emergency medicine residents. Hong Kong J Emerg Med. 2025;e70031. doi:10.1002/hkj2.70031
