Clinically informed AI outperforms foundation models in spinal cord disease prediction
WashU study shows smaller, targeted models generalize better than large foundation models for early detection of cervical myelopathy
Cervical spondylotic myelopathy (CSM) refers to spinal cord compression from arthritis in the neck and is the leading cause of spinal cord dysfunction in older adults. CSM is a chronic, progressive condition that can cause neck pain, muscle weakness, difficulty walking and other debilitating symptoms. Although the diagnosis is sometimes clear, it often takes years because symptoms aren’t recognized until the later stages, when treatment options are limited.
A multidisciplinary team of surgeon-scientists, computer scientists and researchers at Washington University in St. Louis developed an artificial intelligence-based approach that could help clinicians screen for and diagnose CSM up to 30 months earlier, opening new opportunities for earlier treatment. The findings were published online Jan. 20, 2026, in npj Digital Medicine.
Salim Yakdan, MD, a postdoctoral research fellow in the Taylor Family Department of Neurosurgery at WashU Medicine, and Ben Warner, a doctoral student in computer science & engineering in the McKelvey School of Engineering, are co-first authors of the research. They used seven different AI models to analyze large datasets containing electronic health record data from more than 2 million people with and without CSM. The models examined patterns of health care interactions, such as tests and diagnoses, recorded in electronic health records to spot patients whose medical histories resemble those of patients already diagnosed with CSM, helping to flag individuals who may be at higher risk.
Jacob Greenberg, MD, assistant professor of neurosurgery and neurological spine surgeon in the Taylor Family Department of Neurosurgery at WashU Medicine, said CSM is difficult to predict.
“We wanted to know if we could use the information within the electronic health record to try to identify these patients early enough and at a clinically relevant interval where we could potentially intervene as appropriate at an earlier stage to lead to better outcomes,” said Greenberg, co-senior author of the research.
Using both a large external dataset and a smaller dataset from a St. Louis–based health system, the team trained models to predict CSM risk as early as 30 months before a clinical diagnosis, said Warner, who works in the lab of Chenyang Lu, the Fullgraf Professor and director of the AI for Health Institute and co-senior author of the study.
The team evaluated both large foundation models, or “out-of-the-box” systems pretrained on extensive clinical datasets, and smaller, specialized models that incorporate clinical insight and focus only on the most relevant variables.
The foundation models demonstrated superior performance during internal validation on a large, heterogeneous dataset, whereas the smaller, clinically derived model trained from scratch showed better generalizability and more consistent performance across external health care systems, Yakdan said. The two mid-scale models, by contrast, underperformed across all evaluated prediction time horizons.
“We were able to achieve at least comparable, if not superior performance with a much, much simpler model by focusing on existing clinical knowledge while still using a deep learning model,” Greenberg said. “AI clearly has emerging opportunities in medicine, but we often focus only on the areas where purely data-driven solutions excel. There’s still an important role for clinical knowledge, which is going to be true for a lot of applications in health care.”
Lu highlighted the power and efficiency of AI models that incorporate clinical knowledge.
“One of the biggest challenges for AI-based prediction models in clinical medicine is generalizability,” he said. “A model may perform well in one hospital system but fail in others. For complex conditions like CSM, we found that large models trained on millions of patients did not generalize as well as smaller, clinically tailored models. This underscores the importance of embedding clinical insight into AI solutions for healthcare. Clinical knowledge remains essential for developing robust and trustworthy AI tools.”
Yakdan S, Warner B, Ghogawala Z, Ray WZ, Bydon M, Steinmetz MP, Griffey RT, Foraker R, Wilcox A, Lu C, Greenberg JK. Clinically guided models or foundation models? Predicting cervical spondylotic myelopathy from electronic health records. npj Digital Medicine. Published online Jan. 20, 2026. DOI: 10.1038/s41746-026-02337-7.
This research was supported with funding from the U.S. Department of Defense; the Washington University/BJC HealthCare Big Ideas Competition; and the National Institute of Arthritis and Musculoskeletal and Skin Diseases of the National Institutes of Health (1K23AR082986-01A1).