I used Claude Code to get a second opinion on my MRI

AI
Healthcare
Medical Imaging
Trust & Safety

The post walks through feeding raw shoulder MRI DICOM files into Claude Code to get a second opinion after a clinic recommended treatment the author no longer trusted. Claude’s read pushed back against parts of the clinic’s story, especially around calcification and whether the chosen therapies were indicated. The post resonated partly because the clinic had also injected Traumeel, a homeopathic product, which led many people to suspect that the bigger problem was questionable care rather than a breakthrough in AI radiology.

Use LLMs here as a prep tool, not as a radiologist. If a model flags something important, turn that into a concrete second-opinion consult with another human specialist and the original images, not a diagnosis you act on yourself.

June 28, 2026
antoine.fi

Key insights

Calcification claim rests on the wrong modality

The apparent contradiction that drove part of the post falls apart once you account for how imaging modalities differ in what they can detect. Ultrasound can miss small calcifications that a plain radiograph or MRI may catch, so a report saying "no calcifications" on ultrasound does not cleanly rule them out. That means Claude may have sounded incisive while anchoring on a false certainty that a radiologist would not grant.

Do not treat a negative finding from one imaging modality as a universal negative. If a treatment decision hinges on a feature like calcification, ask which modality is best suited to detect it and whether that specific study was done.

Attribution:

sxg #1
rylando #1
2ap #1

Image reconstruction AI is not diagnosis AI

People working in MRI drew a sharp line between AI that reconstructs scans and AI that interprets them. Tools like Siemens Deep Resolve can improve acquisition speed and visual quality through reconstruction methods, but that does not mean a frontier chatbot can reason correctly about pathology in the resulting images. Conflating those two categories is how people talk themselves into believing progress in scanner software implies progress in medical judgment.

When vendors or clinicians mention AI in imaging, pin down whether they mean reconstruction, workflow, or diagnosis. Those are different products with different failure modes and very different evidence bars.

Attribution:

lostlogin #1 #2
uecker #1 #2

Persuasion is the core failure mode

The most worrying property is not error alone but polished, adaptive confidence. People described getting contradictory medical answers from the same model across sessions, then watching each session drift toward whichever explanation the user began nudging it toward. That makes these systems especially dangerous in medicine: they reward motivated reasoning while sounding as if they are helping you think clearly.

If you use an LLM on a health question, rerun it in fresh sessions with neutral prompts and compare outputs before you trust anything. Big drift between runs is a warning sign to stop treating it as evidence and move to a human second opinion.
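The rerun-and-compare check above can be roughed out in a few lines. This is a minimal sketch, not a validated method: it assumes you have pasted the answers from several fresh sessions into a list by hand, it measures agreement with a crude word-overlap (Jaccard) score, and the names `min_agreement`, `runs`, and `DRIFT_THRESHOLD` as well as the 0.5 cut-off are illustrative assumptions rather than anything from the original post or discussion.

```python
from itertools import combinations
import re


def tokens(text: str) -> set[str]:
    """Lowercase word set; crude, but enough to spot large disagreements."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two answers, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def min_agreement(answers: list[str]) -> float:
    """Lowest pairwise similarity across all runs; 1.0 if fewer than two runs."""
    return min((jaccard(a, b) for a, b in combinations(answers, 2)), default=1.0)


# Hypothetical answers pasted from three fresh sessions given the same neutral prompt.
runs = [
    "Findings suggest supraspinatus tendinopathy without a full-thickness tear.",
    "Findings suggest supraspinatus tendinopathy, no full-thickness tear seen.",
    "Likely a labral tear; tendons appear normal.",
]

DRIFT_THRESHOLD = 0.5  # arbitrary cut-off chosen for illustration only
score = min_agreement(runs)
print(f"lowest pairwise agreement: {score:.2f}")
if score < DRIFT_THRESHOLD:
    print("Runs disagree substantially; treat none of them as evidence.")
```

Word overlap cannot tell whether two answers agree clinically, so a high score proves nothing. The sketch is only useful in one direction: a low score is a cheap signal that the runs diverged and that the question belongs with a human second opinion.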

Attribution:

Aurornis #1
scheme271 #1
XorNot #1
UltraSane #1

Patients use AI because the healthcare system often fails first

The pro-AI anecdotes were not really about AI being medically superior. They were stories about diagnostic funnels, contradictory specialists, rushed care, and clinicians who would not explain their reasoning. In that environment, even a flawed tool has value because it gives patients terms to search, questions to ask, and enough confidence to seek another specialist who finally catches the missed issue.

If you lead a healthcare product or insurer workflow, the demand signal here is not "replace doctors with chatbots." It is that patients want explanation, coordination, and easier access to second opinions before they feel forced to improvise with consumer AI.

Attribution:

kgeist #1
madrox #1
gaolei8888 #1
thewanderer1983 #1

Good use is report explanation, not image reading

Several commenters who were otherwise positive on LLMs narrowed the viable use case to text and process. They trusted models to summarize reports, extract structured facts, explain terminology, and help formulate questions. They did not trust them to read MRI slices directly. That split matches what people see in practice. Models are much stronger when the task is grounded in report text, guidelines, or literature than when it depends on subtle visual or spatial interpretation.

Build patient-facing AI around reports, instructions, and care navigation first. Leave primary image interpretation to specialized systems with clinical validation, or better yet to radiologists.

Attribution:

serial_dev #1
eqvinox #1
idopmstuff #1

More information can worsen a medical mystery

A recurring framing was that medical diagnosis is not a tidy puzzle where more data always helps. In real care, more opinions, more scans, and more plausible narratives can deepen confusion because each source comes with its own uncertainty, incentives, and blind spots. LLMs amplify that problem by making every branch of the differential sound coherent, even when the user lacks the judgment to prune the tree.

When you are already in a confusing diagnostic process, optimize for decision quality, not information volume. Ask each source to narrow the options, explain what would change management, and say what evidence would falsify its current view.

Attribution:

john-tells-all #1
rvnx #1
jongjong #1

Against the grain

Patient-side research is still worth doing

A smaller but credible camp argued that dismissing AI outright misses why people use it. Long waits, short visits, and inaccessible specialists leave patients to prepare on their own. In that context, a model can be useful for learning vocabulary, understanding likely pathways, and arriving with sharper questions. The useful comparison is not "LLM versus ideal doctor" but "LLM versus no help for four months."

If expert access is delayed, use the model to prepare for the visit, not to replace it. Ask for likely differentials, key tests, and questions to bring, then verify those with a clinician.

Attribution:

rvnx #1 #2
hectdev #1

LLMs sometimes catch what clinicians miss

Some medical anecdotes cut against the dominant caution by showing models surfacing the correct alternative diagnosis or prompting a retest that later proved justified. The point was not that models are dependable, but that human care already has enough variance and blind spots that a cheap, instant cross-check can occasionally create real value. That is especially true when the model is working from text, labs, or established diagnostic criteria rather than raw imaging.

Do not ignore a model simply because it disagrees with the first doctor. If the disagreement is specific and testable, use it to request a concrete re-review, retest, or specialist referral.

Attribution:

energy123 #1
margorczynski #1
fuomag9 #1

Experts can benefit because they can audit

Some clinicians and experienced users said LLMs are genuinely useful in expert hands. They can revive forgotten knowledge, summarize evidence, and broaden a differential, but only because the user can tell when the model is drifting. That undercuts the blanket claim that expert criticism means the tool is worthless. The real dividing line is not whether the model makes mistakes. It is whether the operator can reliably catch them.

Expect expert workflows to adopt LLMs sooner than patient self-diagnosis does. If you deploy them professionally, design the workflow around auditable outputs and clear points where a qualified person must overrule or discard the model.

Attribution:

tsoukase #1
GTP #1
baxtr #1

In plain English

Calcification

A deposit of calcium in tissue that can sometimes be seen on medical imaging and may affect diagnosis or treatment choice.

Deep Resolve

A Siemens Healthineers MRI reconstruction product that uses machine learning methods to improve scan speed or image quality.

DICOM

Digital Imaging and Communications in Medicine, the standard file format and data structure used for medical images and related information.

Homeopathic

Referring to an alternative medicine system based on highly diluted substances, generally not accepted as biologically plausible by mainstream medicine.

LLM

Large language model, a machine learning model trained on large text datasets that can generate and analyze text.

Modality

A specific type of medical imaging method, such as MRI, CT, ultrasound, or X-ray.

MRI

Magnetic Resonance Imaging, a scan that uses strong magnets and radio waves to create detailed images of structures inside the body.

Radiograph

An image produced by X-rays, often used interchangeably with X-ray image.

Radiologist

A physician specialized in interpreting medical images such as X-rays, CT scans, ultrasounds, and MRIs.

Traumeel

A branded product sold in some countries as a homeopathic or botanical medicine, often promoted for pain or inflammation despite disputed clinical benefit.

Ultrasound

An imaging method that uses high-frequency sound waves to visualize soft tissues and organs in real time.

X-ray

A medical imaging method that uses radiation to show dense structures like bones and some calcifications.

Reference links

Medical imaging and radiology references

Nature study on prompt sensitivity in diagnosis
Cited as evidence that subtle prompt changes can alter diagnosis from frontier models.
Radiopaedia article on calcific tendinitis
Shared to explain the shoulder condition under discussion and what imaging can show.
Siemens Healthineers Deep Resolve infographic
Used in the discussion about AI-assisted MRI reconstruction versus diagnosis.
Futurism summary of Stanford mirage reasoning paper
Referenced to argue that frontier models can fabricate image-based reasoning even when they did not truly inspect an image.

Medical AI evidence and adoption

Science paper on AI diagnostic performance
Cited to support the claim that newer reasoning models can match doctors on some text-based diagnostic benchmarks.
CNBC on OpenEvidence physician usage
Used in debate over whether physicians are already using LLM systems in clinical workflows.
OpenEvidence visits overview
Linked to clarify that OpenEvidence is used for both clinical support and visit documentation workflows.
Nature news article on AI use degrading detection rates
Shared to support concerns that physician reliance on AI can degrade performance.
ScienceDirect review on automation bias and decision support
Given as broader evidence that AI assistance can distort human clinical judgment.
Ars Technica on injecting medical misinformation into LLMs
Referenced in the discussion about data poisoning and model reliability in medicine.

Patient stories and healthcare trust

New York Times on trusting AI over a doctor for cancer care
Shared as a cautionary story about AI contributing to delayed cancer treatment.
Fox News story on ChatGPT helping detect cancer
Offered as a counterexample to the New York Times cautionary case.
Today.com story on a mother using ChatGPT for diagnosis
Another media example raised to show that some people report benefits from chatbot-assisted diagnosis.
Physicians Weekly on erosion of trust in healthcare
Used to frame the post as partly a story about declining trust in medical care.

Medication and dementia risk references

MyALZTeam article on Zyrtec and Alzheimer’s concerns
Linked in a side discussion about whether second-generation antihistamines carry dementia risk.
FDA warning on severe itching after stopping long-term cetirizine or levocetirizine
Shared to note a specific withdrawal-related warning for oral allergy medicines.
Harvard Health on dementia risk from antihistamines
Provided as a more grounded summary of uncertain dementia risk from antihistamine use.
Dementia Australia risk factors page
Added to redirect attention toward established dementia risk factors rather than speculative medication effects alone.

Rehab and general reference material

Shoulder mobility drill video
Shared as a practical exercise resource by someone who improved shoulder pain through training.
Wikipedia entry on Gell-Mann Amnesia effect
Used repeatedly to frame why non-experts overtrust plausible but flawed explanations outside their domain.
