[2603.00221] A medical coding language model trained on clinical narratives from a population-wide cohort of 1.8 million patients

arXiv - Machine Learning 4 min read

Computer Science > Machine Learning
arXiv:2603.00221 (cs)
[Submitted on 27 Feb 2026]

Title: A medical coding language model trained on clinical narratives from a population-wide cohort of 1.8 million patients

Authors: Joakim Edin, Sedrah Butt Balaganeshan, Annike Kjølby Kristensen, Lars Maaløe, Ioannis Louloudis, Søren Brunak

Abstract: Medical coding translates clinical documentation into standardized codes for billing, research, and public health, but manual coding is time-consuming and error-prone. Existing automation efforts rely on small datasets that poorly represent real-world patient heterogeneity. We trained a language model on 5.8 million electronic health records from 1.8 million patients across nearly all specialties in Eastern Denmark (2006-2016) to predict ICD-10 codes from clinical notes, medications, and laboratory results. Evaluated on 270,000 held-out patients, the model achieved a micro F1 of 71.8% and a top-10 recall of 95.5%. Performance varied by specialty (F1: 53-91%), with higher scores in specialties with well-defined diagnostic criteria. Codes appearing predominantly as secondary diagnoses had markedly lower F1 scores. For three such codes (suicide-related behaviors, weight disorders, and hypertension), the model identified thousands of uncoded cases, of which...
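For readers unfamiliar with the two headline metrics, the sketch below shows one common way micro F1 and top-k recall are computed for multi-label code prediction. It is an illustration only, not the authors' evaluation code: the function names, the toy records, and the pooled (micro) definition of top-k recall are all assumptions, since the abstract does not spell out how either metric was calculated.

```python
from typing import Dict, List, Set


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives and false negatives
    over every record and every code before computing F1."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def recall_at_k(gold: List[Set[str]],
                scores: List[Dict[str, float]],
                k: int = 10) -> float:
    """Share of gold codes found among each record's k highest-scored codes
    (pooled over records; an assumed definition)."""
    hits = 0
    total = 0
    for g, s in zip(gold, scores):
        top_k = set(sorted(s, key=s.get, reverse=True)[:k])
        hits += len(g & top_k)
        total += len(g)
    return hits / total if total else 0.0


# Hypothetical toy data: two records, three ICD-10 codes.
gold = [{"I10", "E66"}, {"X60"}]
pred = [{"I10"}, {"X60", "E66"}]
scores = [
    {"I10": 0.9, "E66": 0.2, "X60": 0.1},
    {"X60": 0.8, "E66": 0.6, "I10": 0.05},
]

f1 = micro_f1(gold, pred)              # TP=2, FP=1, FN=1
r_at_1 = recall_at_k(gold, scores, 1)  # 2 of 3 gold codes ranked first
r_at_2 = recall_at_k(gold, scores, 2)  # all 3 gold codes in the top two
```

Because micro averaging pools counts across all codes, frequent codes dominate the score. The gap between a high top-k recall and a lower F1 is what this kind of setup tends to produce: the correct code is usually ranked near the top even when it does not clear the decision threshold.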

Originally published on March 03, 2026. Curated by AI News.
