The past decade has seen machine learning—finding patterns in vast piles of data—in a state of vibrant growth. Life science papers on machine learning numbered just under 600 in 2010; by 2019, there were more than 12,000. The applications in medicine are potentially lifesaving and include the ability to help physicians home in more quickly on the right diagnosis (“Doctors in the Machine,” Winter 2015).

The FDA has already approved 29 medical devices that use machine learning in some way, with dozens of others in the pipeline. Translational research teams are also looking to find room in clinical practice for an astonishing range of machine-learning insights, including predictions of which patients are most likely to miss an insulin dose and who might attempt suicide in the next six months.

But there is a downside. Human researchers are, by and large, unable to follow the logic behind many of these algorithms—including almost all of those used in FDA-approved technologies. The insights are created by passing the information through “hidden layers” of complex networks to develop predictive patterns—a black box approach where the logic becomes opaque.

“To say that an algorithm is a black box means that it wouldn’t be interpretable even by the people who designed it,” says Boris Babic, a professor of philosophy and statistics at the University of Toronto. The parameters and their relationships grow so complicated, he says, that it becomes mathematically impossible to piece together how the inputs lead to the outputs.
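To make that opacity concrete, here is a minimal sketch in Python of the kind of model at issue. The feature count, layer sizes and random weights are illustrative assumptions, not any particular medical device’s network; even in this toy version, the output is a nested, nonlinear combination of thousands of numbers, none of which carries a clinical meaning on its own.

```python
# A toy "black box": a tiny neural network with two hidden layers.
# All sizes and weights are hypothetical, chosen only for illustration.
import numpy as np

rng = np.random.default_rng(0)

# 30 patient features feed into two hidden layers of 64 units each.
W1, b1 = rng.normal(size=(30, 64)), np.zeros(64)
W2, b2 = rng.normal(size=(64, 64)), np.zeros(64)
W3, b3 = rng.normal(size=(64, 1)), np.zeros(1)

def predict(x):
    h1 = np.tanh(x @ W1 + b1)                 # hidden layer 1
    h2 = np.tanh(h1 @ W2 + b2)                # hidden layer 2
    return 1 / (1 + np.exp(-(h2 @ W3 + b3)))  # risk score between 0 and 1

x = rng.normal(size=30)  # one made-up patient record
n_params = sum(a.size for a in (W1, b1, W2, b2, W3, b3))
print(f"{n_params} parameters collapse into one risk score: {predict(x)[0]:.3f}")
```

A production model would have far more parameters than this toy’s few thousand, which is exactly why tracing any one prediction back to its inputs is so hard.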

Some might argue: So what? If the algorithms have predictive power, then let the black box be black. But others are concerned about dangerous assumptions that machines might cook up or “catch” from the data they import. Where a tool is learning from human example, for instance, it might perpetuate existing biases—clinicians’ tendencies to take women’s accounts of pain less seriously than men’s, for instance.

Researchers and policymakers have increasingly called for algorithms that can explain what they’re doing. The U.S. National Institute of Standards and Technology held a workshop earlier this year to lay out new benchmarks. The Royal Society issued a policy brief in favor of explanations, and the European Union, after it passed the General Data Protection Regulation in 2016, has increasingly advocated a “right to explanation” about the algorithms that affect people’s lives.

Will explanations of an algorithm really solve problems of bias and unintended consequences? “This notion of requiring an explanation is typically pitched as a way to allay ethical and legal concerns,” says Babic. “But it’s not going to be an effective one.”

In a recent editorial in Science, Babic and his colleagues—including legal researchers and computer scientists—issued a warning about so-called “explainable AI.” Efforts to increase transparency run the risk, at a minimum, of misleading an audience and potentially undermining these programs’ predictive power—their chief advantage over human analysis.

“Researchers pursue explainability in two primary tracks,” Babic explains. One is “interpretable AI,” which focuses on developing new predictive models built only from steps and connections that the program can report in an understandable way. This approach imposes constraints, but in some situations, he says, an interpretable program could equal or exceed a black box in accuracy. That will not be true in every case, however.
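By way of illustration, here is a rough sketch of what that first track can look like in practice. The breast-cancer dataset, the sparse logistic regression and the specific settings below are stand-ins chosen for the example, not the methods Babic’s editorial evaluates; the point is that the fitted model can be read directly, because each prediction is a short weighted sum of named features.

```python
# Sketch of the "interpretable AI" track: a sparse linear model whose
# reasoning can be read off its coefficients. Dataset and settings are
# illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An L1 penalty forces most coefficients to zero, so the model keeps only a
# handful of features a clinician could inspect one by one.
model = make_pipeline(StandardScaler(),
                      LogisticRegression(penalty="l1", solver="liblinear", C=0.1))
model.fit(X_train, y_train)

kept = [(name, round(w, 2))
        for name, w in zip(X.columns, model[-1].coef_[0]) if w != 0]
print("Features the model actually uses:", kept)
print("Held-out accuracy:", round(model.score(X_test, y_test), 3))
```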

“The other track,” Babic says, “is more worrisome.” It involves designing a separate algorithm that closely approximates the findings of a black box, except that this second program uses only functions its users can understand. In effect, it finds a pattern in the data that fits the black box’s conclusions after the fact. However comforting, Babic says, these approximations are not the same as the real thing: “A bad explanation is worse than no explanation at all.”
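In practice, this second track amounts to training a surrogate: a simple model fitted not to the data’s true outcomes but to the black box’s own predictions. The sketch below, again with illustrative dataset and model choices rather than anything from the Science editorial, shows both the appeal and the catch. The surrogate produces readable rules, yet its agreement with the black box is never perfect, and wherever the two disagree, the explanation describes something other than the model it claims to explain.

```python
# Sketch of the post-hoc track: fit an opaque model, then train a shallow
# surrogate to mimic its outputs. Dataset and model choices are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

black_box = RandomForestClassifier(n_estimators=300, random_state=0)
black_box.fit(X_train, y_train)

# The surrogate learns from the black box's predictions, not the true labels:
# it is a story about the model, told after the fact.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

fidelity = (surrogate.predict(X_test) == black_box.predict(X_test)).mean()
print(f"Surrogate matches the black box on {fidelity:.1%} of held-out cases")
print(export_text(surrogate, feature_names=list(X.columns)))
```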

Some scientists do see a potential limited role for such post hoc models. Matt Turek, who leads an explainable AI program at DARPA, the research and development arm of the Department of Defense, can imagine situations where a provisional explanation for a complex prediction would be useful, as long as users were warned about its limitations. “I think about it in terms of a spectrum of explanation technologies,” he says.

Turek also points to ongoing “interpretable AI” efforts at universities across the country to develop complex algorithms that can, in some sense, explain what they’re doing. A group at the University of California, Berkeley, for example, developed a type of “black box” algorithm in 2016 that could produce text-based explanations of why it made its choices.

And the current roadblocks could even lead to improvements. “In some cases, we’ve found that by using explainable AI techniques, we improved past performance,” he says, explaining that certain algorithms seem to “get better” when they’re forced to explain themselves.

Medicine is not the only field grappling with this issue—AI has also been pulled into other sensitive arenas, including criminal justice, self-driving cars and decisions about who is likely to repay a loan. In all of these cases, people can be harmed by AI. “We need explanations—especially in domains where serious outcomes happen when things go wrong,” Turek says.