In Harvard study, AI offered more accurate emergency room diagnoses than two human doctors

By Topline Newsroom

4 hours ago1 min readSource: techcrunch.com

In Harvard study, AI offered more accurate emergency room diagnoses than two human doctors

A new study examines how large language models perform in a variety of medical contexts, including real emergency room cases — where at least one model seemed to be more accurate than human doctors.

From the source

In Harvard study, AI offered more accurate emergency room diagnoses than two human doctors Anthony Ha 11:00 AM PDT · May 3, 2026 A new study examines how large language models perform in a variety of medical contexts, including real emergency room cases — where at least one model seemed to be more accurate than human doctors.

The study was published this week in Science and comes from a research team led by physicians and computer scientists at Harvard Medical School and Beth Israel Deaconess Medical Center. The researchers said they conducted a variety of experiments to measure how OpenAI’s models compared to human physicians.

In one experiment, researchers focused on 76 patients who came into the Beth Israel emergency room, comparing the diagnoses offered by two internal medicine attending physicians to those generated by OpenAI’s o1 and 4o models. These diagnoses were assessed by two other attending physicians, who did not know which ones came from humans and which came from AI.

Who and what

Key names and topics in this story: Harvard.

Where to follow next

Read the full piece at techcrunch.com
More from our AI & prompts coverage

#ai#harvard

A Coding Implementation to Explore and Analyze the TaskTrove Dataset with Streaming Parsing Visualization and Verifier Detection

In this tutorial, we take a deep dive into the TaskTrove dataset on Hugging Face and build a complete, practical workflow to efficiently explore it. Instead of downloading the full multi-gigabyte dataset, we stream it directly and work with individual samples in real time. We beg

‘This is fine’ creator says AI startup stole his art

The ad comes from Artisan, the AI startup behind billboards urging businesses to "stop hiring humans."

A Developer’s Guide to Systematic Prompting: Mastering Negative Constraints, Structured JSON Outputs, and Multi-Hypothesis Verbalized Sampling

Most developers treat prompting as an afterthought—write something reasonable, observe the output, and iterate if needed. That approach works until reliability becomes critical. As LLMs move into production systems, the difference between a prompt that usually works and one that

A New NVIDIA Research Shows Speculative Decoding in NeMo RL Achieves 1.8× Rollout Generation Speedup at 8B and Projects 2.5× End-to-End Speedup at 235B

A new paper from NVIDIA Research integrates speculative decoding directly into NeMo RL with a vLLM backend, delivering lossless rollout acceleration at both 8B and projected 235B model scales. The post A New NVIDIA Research Shows Speculative Decoding in NeMo RL Achieves 1.8× Roll