ALTHOUGH THE NUMBER OF NEW CASES OF DIABETES HAS FALLEN MODESTLY IN RECENT YEARS, this decline comes only after a period of explosive growth. While 5.5 million Americans had been diagnosed with diabetes in 1980, that number had reached 22 million by 2014, according to the Centers for Disease Control and Prevention. And each new case puts a patient at risk of serious complications.

Diabetes is widely known for elevating glucose levels in the blood. But it can lead to many serious secondary diseases and conditions, known as comorbidities. Heart disease and kidney disease are common, and more than half of those with diabetes suffer some degree of nerve damage, which can lead to infected injuries and amputations.

It can be difficult to know which patients are at the highest risk for these complications. Physicians can use mathematical formulas called outcomes models as a sort of crystal ball. The UKPDS risk engine, for instance, can be used to calculate a patient’s chances of developing coronary artery disease. Input a patient’s age, time since diabetes diagnosis, cholesterol levels and other variables into the model, and it will estimate the risk that the patient will develop coronary artery disease within the next 10 years.
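
To make the idea concrete, an outcomes model of this kind can be sketched in a few lines of Python. The sketch assumes a simple logistic formula, and its weights, intercept and variable names are invented for illustration. They are not the UKPDS coefficients and should not be used for any clinical purpose.

```python
import math

# Hypothetical weights for illustration only; these are NOT the UKPDS coefficients.
ILLUSTRATIVE_WEIGHTS = {
    "age_years": 0.05,
    "years_since_diagnosis": 0.04,
    "total_cholesterol_mmol_l": 0.30,
}
ILLUSTRATIVE_INTERCEPT = -6.0


def ten_year_risk(patient: dict) -> float:
    """Return an illustrative probability between 0 and 1 from a logistic formula."""
    score = ILLUSTRATIVE_INTERCEPT + sum(
        weight * patient[name] for name, weight in ILLUSTRATIVE_WEIGHTS.items()
    )
    # The logistic function squeezes any score into the range 0 to 1.
    return 1.0 / (1.0 + math.exp(-score))


# Two made-up patients: the second is older, diagnosed longer ago,
# and has higher cholesterol, so the formula assigns a higher risk.
younger = {"age_years": 45, "years_since_diagnosis": 2, "total_cholesterol_mmol_l": 4.5}
older = {"age_years": 70, "years_since_diagnosis": 15, "total_cholesterol_mmol_l": 6.5}

print(f"Younger patient: {ten_year_risk(younger):.1%}")
print(f"Older patient:   {ten_year_risk(older):.1%}")
```

A real risk engine would rest on coefficients estimated from years of patient follow-up, which is where the time and labor go.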

But creating such a tool has historically been time- and labor-intensive. The initial UKPDS model took 20 years and 5,000 patients to develop. With the advent of electronic health records (EHRs) and new tools for mining the data in those records, however, it should be possible to build more comprehensive prediction models more quickly, provided researchers can perfect them.

A team at Duke University led by electrical and computer engineer Ricardo Henao, for instance, fed a computer EHR information from approximately 17,000 diabetes patients in the Duke University Health System. Using “deep models”—programs that ask computers to look for patterns among thousands of examples—researchers looked for new correlations. The resulting model can make more accurate predictions than the UKPDS risk calculator provides, researchers say, and it can predict whether the patients will develop any one of a number of diabetes comorbidities.

This new model can project whether a patient will require amputation within a year with almost 90% accuracy, and can correctly predict the risks of coronary artery disease, heart failure and kidney disease in four out of five cases. The model looks at what was typed into a patient’s chart—diagnosis codes, medications, laboratory tests—and picks up on which pieces of information in the EHR are correlated with the development of a comorbidity in the following year.
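
One plausible way to picture how chart entries become model input is sketched below in Python. It assumes each chart is reduced to a set of short text labels (the labels here are hypothetical), turns each chart into a row of ones and zeros, and shows how an accuracy figure such as "four out of five" is computed. It is an illustration only, not the Duke team's actual pipeline, whose details are not described here.

```python
def build_vocabulary(charts):
    """Collect every distinct chart entry, sorted so the column order is stable."""
    return sorted({entry for chart in charts for entry in chart})


def encode(chart, vocabulary):
    """Turn one chart into a row of 1s and 0s: 1 if the entry appears, else 0."""
    return [1 if entry in chart else 0 for entry in vocabulary]


def accuracy(predictions, outcomes):
    """Fraction of patients for whom the prediction matched what happened."""
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(outcomes)


# Hypothetical chart entries: a diagnosis code, medications and a lab flag.
charts = [
    {"dx:E11.9", "rx:metformin"},
    {"dx:E11.9", "lab:hba1c_high", "rx:insulin"},
]
vocabulary = build_vocabulary(charts)
rows = [encode(chart, vocabulary) for chart in charts]
print(vocabulary)
print(rows)

# Made-up predictions and outcomes for five patients (1 = developed the
# comorbidity within a year). Four of the five predictions match.
predicted = [1, 0, 1, 1, 0]
observed = [1, 0, 0, 1, 0]
print(accuracy(predicted, observed))
```

A learning algorithm would then search rows like these for the entries that most often precede a comorbidity.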

The Duke researchers hope to go a step further: discovering previously unknown predictors of comorbidities. “The advantage of our model is that you’re not tied to a single hypothesis,” says James Lu, a physician and scientist involved in the Duke project and co-founder of Helix, a personal genome company in California. “It might discover things that you might not have intuited.”

But relying exclusively on EHR data has several drawbacks, according to Lu. A health record might be incomplete, leaving out a treatment or comorbidity because of a diagnostic or clerical error. An EHR might also offer less than a patient’s whole health history if, for example, the patient hasn’t had consistent access to a doctor or the records haven’t been fully passed along between institutions.

“These kinds of machine learning algorithms offer an enormous promise to transform medicine as we know it,” says Michael Pencina, a professor of biostatistics and bioinformatics at Duke, who was not involved in the project. “But with large amounts of data, a machine learning model may identify correlations that don’t have biological or clinical meaning.” The Duke team aims to improve the model by expanding its use to diabetic patients beyond Durham, N.C., to see whether the initial findings hold.

One problem with this approach—and one reason that studies using EHRs remain relatively rare—is the difficulty of gaining access to medical records for research. To use those records for the Duke study, researchers worked closely with the hospital’s institutional review board—an ethics committee that reviews research involving humans—and ended up contacting all 17,000 patients for their informed consent.

But if the Duke diabetes model proves a success, it could pave the way to more cooperation between hospitals and researchers—and more models for catching disease before it does its worst.