Genomic research has led to one of medicine’s golden eras, a time of discovering gene variants that lead to disease and ushering in powerful, tailored treatments. But this progress largely comes from research on people of European ancestry, who account for about 80% of participants in genomic studies though they make up only 16% of the global population. Those medical miracles, then, come with a dangerous asterisk for most of the planet.

That problem may be headed for a correction. Genomes with significant non-European ancestry have often been discarded in the search for genetic variants of medical interest, because of longstanding protocols. But through new techniques and programs, and the passion of a new generation of researchers, those excluded genomes are now being gathered, analyzed and integrated, creating insights that benefit people of all races.

Including data only from people with European heritage has become almost routine. Simon Gravel, assistant professor of human genetics at McGill University in Montreal, published a 2020 comment in Nature in which he and his colleagues looked at several dozen genome-wide association studies (GWAS) that used data from the UK Biobank or the U.S. Health and Retirement Study, a longitudinal study following the health of 20,000 adults over age 50. Of the 58 GWAS Gravel and his colleagues considered, 45 excluded data from participants whose ancestry was non-European.

Many study authors gave no reason for this, according to Gravel’s research. Among those who did, some cited concerns about having too few study participants of non-European ancestry to get meaningful results, while others wrote that they had followed procedures established in prior studies whose methods relied on European ancestry data. Gravel noted that such routine exclusion must inevitably lead to long-term effects: “Excluding or not including populations for any reason leads to suboptimal care in the under-studied population.”

This knock-on damage from being excluded has been borne out in other research. Consider a genetics-based tool created in 2008 that can help determine an optimal dosage of warfarin, a drug used to prevent blood clots. Later studies found that the tool was less effective for Black people. Polygenic risk scores, which use a constellation of genes to predict risk of cancer, high cholesterol levels and other diseases, also work less well for those with non-European ancestry, according to a large 2019 study.

One structural reason for ignoring data from non-Europeans is the sheer number of records needed to conduct meaningful GWAS research. Thousands, if not tens of thousands, of participants are needed, and because most early studies focused on people of European ancestry, data on non-Europeans was in short supply. Clinical tools developed in the course of that work didn’t map onto non-European populations as well, says Alicia Martin, a researcher in the Analytic and Translational Genetics Unit of Massachusetts General Hospital and a scientist at the Broad Institute of MIT and Harvard. “That’s a huge problem,” she says.

One way to broaden current research is to look at genomes of people whose ancestors came from multiple continents, including Europe, and find ways to incorporate their “admixed” data into studies. In 2021, Nature Genetics published a new statistical framework and software package created by Elizabeth Atkinson, assistant professor in the molecular and human genetics department at Baylor College of Medicine. It flags genomic data in a way that shows which stretches of DNA belong to which ancestral origins and allows scientists to analyze such admixed genomes alongside more homogeneous ones. This method can also identify which gene variants might be particularly influential for certain traits in people of varied ancestries.

“Everybody’s genomes are going to be a mosaic—chunks of chromosomes from different ancestries,” says Atkinson. With newer data methods like hers, “you don’t need to assign somebody a single ancestry label.”

Even without improved analytical tools, some research teams are publishing studies that use non-European ancestry data, despite the smaller sample sizes. “These datasets can still provide the scientific community with valuable statistics and findings,” says Gravel.

The Pan-UK Biobank initiative, from researchers at Massachusetts General Hospital and the Broad Institute, has done some of this legwork, analyzing all DNA from both non-European and European ancestry participants in the UK Biobank. They’ve published analyses of genetic associations with thousands of traits or phenotypes on a public website, enabling other scientists to easily use information from underrepresented ancestry groups.

“We wanted to facilitate research on diverse ancestries more readily,” says Atkinson, who is participating in the initiative with Martin and other researchers. This new data has already revealed some chromosome locations that differ from those seen in European ancestry samples. “By including more diverse ancestries, we have more opportunities to discover genetic loci that influence health for everybody,” she says.

At Yale University, Janitza Montalvo-Ortiz, assistant professor in the environmental psychiatry division of human genetics, has been working with the Latin American genomes often cast aside in GWAS. She and her colleagues have been combining data from Latin America, the United States and the UK Biobank. Montalvo-Ortiz cofounded the Latin American Genomics Consortium, of which Atkinson is a coleader, and the group is aggregating 100,000 DNA samples from Latin Americans to find gene variants associated with psychiatric conditions. The researchers are currently looking for genes that influence smoking and alcohol dependence.

The work poses challenges. One major hurdle for Montalvo-Ortiz’s team is cobbling together a wide range of data spanning two continents and then soliciting the access and permissions needed for such a shared resource to be useful to other research teams. For now, the scientists are doing the work on their own time, unsupported by grants. But the team is driven by a passionate conviction. “We want to provide accurate tools for doctors and prescribers that could benefit people like us,” says Montalvo-Ortiz.