Shortly after the artificial intelligence company OpenAI debuted its ChatGPT chatbot, the program became popular. Five days after its release, it had already amassed one million users. Since then, it has been referred to as world-changing, a turning point for artificial intelligence, and the start of a new technological revolution.
Like others, Stat News began examining potential medical uses for ChatGPT, which was trained on more than 570 gigabytes of online textual data taken from books, web texts, Wikipedia, and other online content, including some focusing on medicine and health care.
Although the potential of AI tools such as ChatGPT for medical applications excites us, their errors, confabulation, and bias make us reluctant to advocate their use outside of specific circumstances. These circumstances include streamlining education and administrative tasks and supporting clinical decision-making, though even that last use carries considerable problems and hazards.
In the United States, medical education continues to shift from a system based on memorization and retention of information to one that emphasizes the curation and application of medical knowledge.
AI systems such as ChatGPT could facilitate this transition by assisting medical students and physicians in learning more efficiently, from creating unique memory devices (“create a mnemonic for the names of the cranial nerves”) to explaining complex concepts in language of varying complexity (“explain tetralogy of Fallot to me like I’m a 10th grader, a first-year medical student, or a cardiology fellow”).
By querying ChatGPT, Stat News found that it could aid in preparing for standardized medical tests by generating high-quality practice questions with thorough explanations of the correct and incorrect answers. In a recent study published as a preprint, in which ChatGPT was listed as a co-author, the application passed the first two steps of the United States Medical Licensing Exam, the national exam that most U.S. medical students take to qualify for medical licensure.
ChatGPT’s responsive design also allows it to simulate a patient who can be asked for a medical history, physical exam findings, test results, and more. Its ability to respond to follow-up questions could create opportunities for physicians to improve their diagnostic skills and clinical acumen, provided they treat its answers with a high degree of skepticism.
Although ChatGPT can be useful to physicians, they must proceed with caution and not rely on it as a primary source without first verifying its accuracy.
In 2018, the most recent year for which there are reliable figures, 70% of physicians reported spending at least 10 hours per week on paperwork and administrative duties, with nearly one-third spending 20 hours or more.
ChatGPT could help health care professionals save time on nonclinical tasks, which contribute to burnout and take time away from patients. Stat News found that ChatGPT’s knowledge base includes the Current Procedural Terminology (CPT) code set, a standardized system for identifying medical procedures and services that most physicians use to bill for procedures or the care they deliver.
When Stat News requested several billing codes to evaluate this capability, ChatGPT provided the correct code for Covid vaccinations but incorrect ones for amniocentesis and x-ray of the sacrum. In other words, it is not yet reliable enough for billing without major improvement.
Clinicians spend an extraordinary amount of time drafting letters that advocate for patients’ needs to insurance companies and other third parties. ChatGPT could help with this time-consuming task. Stat News asked ChatGPT, “Can you create a letter of authorization for Blue Cross about the use of transesophageal echocardiography on a patient with valve disease? The insurance company does not cover the expense. Please integrate references to scientific literature.” Within seconds, Stat News received a customized email that could serve as a time-saving template for this request. It required some editing but generally conveyed the intended message.
The use of ChatGPT in clinical care should be treated with greater caution than its application in education and administration. In clinical practice, ChatGPT could facilitate documentation by generating medical charts, progress notes, and discharge instructions. Jeremy Faust, an emergency medicine physician at Brigham and Women’s Hospital in Boston, put ChatGPT to the test by requesting a chart for a fictitious patient with a cough. The system responded with a template that Faust described as “strangely accurate.”
The potential is evident: helping medical professionals sort through a list of symptoms, determine therapy dosages, suggest a course of action, and so on. However, the risk is serious. A key problem with ChatGPT is its tendency to provide erroneous or fabricated information. When asked to provide a differential diagnosis for postpartum hemorrhage, the application appeared to perform expertly and even cited supporting scientific evidence. But when Stat News investigated the sources, it found that none of them actually existed.
Faust found a similar problem when ChatGPT claimed that costochondritis, a common cause of chest pain, can be caused by oral contraceptive pills and fabricated a research paper to support the claim. This potential for fabrication is especially concerning in light of a recent preprint indicating that scientists have difficulty distinguishing between authentic research abstracts and those generated by ChatGPT.
The risk of receiving inaccurate information is even greater for patients who use ChatGPT to research their symptoms, as many already do with Google and other search engines. Indeed, ChatGPT created a horrifyingly compelling explanation for how “crushed porcelain added to breast milk can aid in the digestion of infants.”
The possibility of bias in ChatGPT’s responses compounds our concerns regarding clinically inaccurate information. When a user asked ChatGPT to write code to determine whether a person would be a good scientist based on their race and gender, the software defined a good scientist as a white man. While OpenAI may be able to filter out certain instances of explicit bias, Stat News is concerned about more subtle instances of bias that could contribute to the perpetuation of stigma and discrimination in health care.
Such biases can arise from small sample sizes and limited diversity in training data. But given that ChatGPT was trained on more than 570 gigabytes of online textual data, the program’s biases may instead reflect the pervasiveness of bias on the internet.
The use of artificial intelligence tools will continue. They are already employed as clinical decision support tools to help predict kidney disease, simplify radiology reports, and accurately estimate leukemia remission rates. The recent release of Google’s Med-PaLM, a comparable AI model optimized for medicine, and of OpenAI’s application programming interface, which can use ChatGPT to build applications for health care, only underscores the technological revolution that is altering health care.