From ELIZA to ChatGPT: AI vs. Human Therapists in Modern Therapy

After World War II, Alan Turing posed the question “Can machines think?” and proposed his imitation game, a test of how closely a machine's responses can imitate a human's. That question helped initiate artificial intelligence (AI) research.

ELIZA, one of the first chatbots, was introduced in 1966. It used methods drawn from Rogerian psychotherapy to simulate an understanding of users' problems. Since then, AI has become increasingly significant in mental health applications. Studies suggest that generative AI can augment traditional therapy; HAILEY is one example of a system that simulates empathic responses during therapeutic sessions.

In workplace collaboration, employees preferred machine-generated messages of appreciation over those written by humans. Meanwhile, ChatGPT's answers to questions from the r/AskDocs forum have been found to combine detailed information with compassionate emotional support.

Researchers found that relationship therapists could not reliably distinguish therapy sessions conducted by humans from those conducted by AI models, and study participants gave positive feedback on their AI interactions. Even so, many therapists question whether AI can form the genuine emotional connection that therapy depends on.

With approval from the Institutional Review Board at Brigham Young University, the study evaluated how well ChatGPT-4.0 performs in the context of couple therapy. The expert panel comprised thirteen clinicians: psychiatrists, clinical and counseling psychologists, and marriage and family therapists. In a Turing-test design, AI-generated responses were compared with therapist-written responses on factors such as therapeutic connection, cultural awareness, and empathy. The AI system provided thorough answers to structured questions, which might reduce dependency on therapist input.

The researchers also examined AI’s therapeutic effectiveness by analyzing how closely its responses aligned with recognized therapeutic techniques. Part of the motivation was to explore whether AI could improve therapeutic outcomes, given evidence that the results of existing evidence-based treatments are declining. The findings contribute to the ongoing conversation about implementing AI in mental healthcare systems.

Research Findings: AI vs. Human Therapists

The study aimed to explore three questions:

  • Can participants distinguish between therapist-written and AI-generated responses?
  • How do AI and human responses compare in therapeutic effectiveness?
  • Do responses from experts differ in sentiment and parts of speech compared to those generated by ChatGPT?

For the first aim, participants tried to distinguish therapist-written responses from ChatGPT-generated ones. They identified the author correctly in 56.1% of cases for therapist-written responses and in 51.2% of cases for ChatGPT-generated responses, so both kinds of author were misidentified at similar rates. Accuracy for therapist-written responses was only about 5 percentage points higher (56.1% versus 51.2%).
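The identification rates above can be put in context with a little arithmetic. The following is a minimal sketch, not part of the study: the 50% chance baseline is an assumption (it presumes a two-option guess between "therapist" and "ChatGPT"), and the function name `percentage_point_gap` is purely illustrative.

```python
def percentage_point_gap(rate_a: float, rate_b: float) -> float:
    """Difference between two percentages, in percentage points (1 decimal)."""
    return round(rate_a - rate_b, 1)


# Rates reported in the article (percent of correct identifications).
THERAPIST_RATE = 56.1
CHATGPT_RATE = 51.2

# ASSUMPTION: a coin-flip baseline for a two-option guess; not stated in the article.
CHANCE_RATE = 50.0

if __name__ == "__main__":
    print("therapist vs ChatGPT:", percentage_point_gap(THERAPIST_RATE, CHATGPT_RATE))
    print("therapist vs chance: ", percentage_point_gap(THERAPIST_RATE, CHANCE_RATE))
    print("ChatGPT vs chance:   ", percentage_point_gap(CHATGPT_RATE, CHANCE_RATE))
```

Under that assumed baseline, both rates sit only a few points above guessing, which is consistent with the article's point that participants could not reliably tell the two sources apart.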

The second aim analyzed how participants rated responses on their alignment with common therapeutic factors. Averaged across all scenarios, responses generated by ChatGPT (μ = 27.72, σ = 0.83) received higher ratings for incorporating these factors than those written by therapists (μ = 26.12, σ = 0.82). The difference was large (d = 1.63, 95% CI [1.49, 1.78]), showing a clear advantage for ChatGPT-generated responses.
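For readers unfamiliar with the effect size d, it expresses the gap between two group means in units of standard deviation. The sketch below shows the common pooled-standard-deviation form of Cohen's d. It is only an illustration: the ratings in it are invented, and the study's own estimation procedure is not described in this article and may differ from this textbook formula.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation of two groups."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


if __name__ == "__main__":
    # HYPOTHETICAL ratings, invented for illustration; not data from the study.
    ai_ratings = [2, 4, 6]
    therapist_ratings = [1, 3, 5]
    print(cohens_d(ai_ratings, therapist_ratings))
```

By a widely used rule of thumb, values around 0.8 or above are considered large, so a reported d of 1.63 indicates a sizeable difference in ratings.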

For the third aim, researchers compared part-of-speech composition (e.g., noun and verb frequency) and sentiment between therapist-written and AI-generated responses. Responses generated by AI were more positive (d = 0.92, 95% CI [0.32, 1.52]) and less negative in sentiment (d = -1.04, 95% CI [-1.61, -0.47]) than responses written by therapists. Responses generated by ChatGPT were also significantly longer (IRR = 1.91, 95% CI [1.81, 2.02]) and had greater noun frequency (IRR = 2.56, 95% CI [2.23, 2.96]), verb frequency (IRR = 2.56, 95% CI [2.23, 2.96]), adjective frequency (IRR = 2.78, 95% CI [2.23, 3.49]), adverb frequency (IRR = 1.64, 95% CI [1.31, 2.06]), and pronoun frequency (IRR = 1.64, 95% CI [1.43, 1.88]) than human-therapist responses.
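An incidence rate ratio (IRR) compares expected counts between two groups, so an IRR of 1.91 for length means ChatGPT responses were roughly 91% longer than therapist responses. The sketch below converts the point estimates reported above into percent differences; the function and dictionary names are illustrative only, and the confidence intervals are left out for brevity.

```python
def irr_to_percent_change(irr: float) -> float:
    """Convert an incidence rate ratio to a percent change versus the reference group."""
    return (irr - 1.0) * 100.0


# Point estimates as reported in the article (ChatGPT relative to therapists).
REPORTED_IRR = {
    "length": 1.91,
    "nouns": 2.56,
    "verbs": 2.56,
    "adjectives": 2.78,
    "adverbs": 1.64,
    "pronouns": 1.64,
}

if __name__ == "__main__":
    for feature, irr in REPORTED_IRR.items():
        print(f"{feature}: about {round(irr_to_percent_change(irr))}% more than therapists")
```

Read this way, the largest gap is in adjectives, with nouns and verbs close behind.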

The findings suggest that technology can make psychotherapy more widely available by giving patients additional treatment options. AI responses come across as empathic, yet they lack the nuance that characterizes human therapists. For now, AI is better used as a supporting tool than as an independent solution for mental health. As these systems advance, practitioners should remain aware of their expanding capabilities and limitations, ensuring ethical and effective implementation in mental healthcare.

Reference: Hatch SG, Goodman ZT, Vowels L, et al. When ELIZA meets therapists: A Turing test for the heart and mind. PLOS Ment Health. 2025;2(2):e0000145. doi:10.1371/journal.pmen.0000145
