ON A SEPTEMBER NIGHT in 2004, in the woods on the outskirts of Boston, a 19-year-old girl later known as J.G. in court documents tucked a rumpled condom into her bra. She had been the victim of sexual assault only minutes before, and one of her two assailants had used the condom during the attack. J.G. knew enough about crime scene investigation to understand that if she got the condom to the police, it might help identify her assailant.

She was at least half right. Genetic material found on the condom was matched in 2008 to the DNA of a suspect in another case. But there was a hitch: The suspect had a twin brother. The two are monozygotic twins, with virtually identical DNA. At the time, forensic labs had no way of telling their DNA apart and so could not determine which of the twins had allegedly used the condom. It’s a mystery that might have remained unsolved but for a revolution taking place in forensic genetics, a field that uses DNA evidence to advance criminal investigations.

For the past 20 years, forensic laboratories that examine crime scene DNA have relied almost exclusively on a technique called short tandem repeat (STR) analysis. This approach identifies repeating sequences of nucleotides, which are the building blocks of DNA molecules. Individuals can be distinguished based on the number of repeating segments they possess across various regions of their DNA molecules. Finding a match is a little bit like comparing two books by counting the number of times a particular word, such as “very,” repeats in a particular sentence on the same page of both books.
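The repeat-counting idea can be illustrated with a short Python sketch. It is a toy model, not a forensic pipeline: the locus names, motifs and sequences are invented for illustration, and each locus is reduced to a single repeat count, whereas a real STR profile records two alleles per locus.

```python
import re

def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)

def str_profile(loci: dict[str, str], motifs: dict[str, str]) -> dict[str, int]:
    """Map each locus name to its repeat count (one count per locus, a simplification)."""
    return {name: longest_repeat_run(seq, motifs[name]) for name, seq in loci.items()}

# Hypothetical loci and repeat motifs, chosen only for the example.
MOTIFS = {"locus_1": "GATA", "locus_2": "TCTA"}

crime_scene = {
    "locus_1": "CC" + "GATA" * 7 + "TT",
    "locus_2": "A" + "TCTA" * 11 + "G",
}
suspect = {
    "locus_1": "CC" + "GATA" * 7 + "TT",
    "locus_2": "A" + "TCTA" * 9 + "G",
}

scene_profile = str_profile(crime_scene, MOTIFS)
suspect_profile = str_profile(suspect, MOTIFS)
profiles_match = scene_profile == suspect_profile
```

Here the two samples agree at the first locus but differ at the second, so the profiles do not match. A single mismatched locus is enough to exclude a suspect in this simplified model.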


An alternate method of DNA analysis was developed in the 1970s, when Nobel Prize–winning biochemist Frederick Sanger pioneered the DNA sequencing techniques that eventually led to the groundbreaking Human Genome Project. But the cost and time required for that approach put it off limits for most forensic labs. Then, in 2005, a new technology called next-generation sequencing, or NGS, emerged. It can distill vast amounts of genetic sequence data from tiny amounts of biological material. If STR analysis is akin to flagging repeated words in one sentence in a book, next-generation sequencing is like identifying every single letter of every word in the entire book.

During the past decade, the amount of information NGS technologies can produce about an individual genome has exploded, while cost has collapsed. Whereas it took 15 years and $3 billion to sequence the first human genome, newer technology released in 2014 can sequence more than 45 human genomes in one day for approximately $1,000 each. Further advances in the past couple of years now enable scientists to make targeted NGS explorations of specific genetic regions—say, the part of DNA that is responsible for eye color or ancestry.

Each aspect of this evolution in DNA analysis has also opened up possibilities for forensic labs. As costs have come down and targeted tests have become possible, a number of biotechnology companies have started developing forensic kits that will allow crime labs to run NGS tests on their samples. The data they collect can help the criminal justice system move forward in formerly impossible cases, including the rape case of J.G., the first in the United States in which next-generation sequencing of nuclear genetic material has been proposed as evidence in a criminal trial.

DNA is built from so-called base pairs, pairings of the complementary nucleotides adenine with thymine and guanine with cytosine, with each pair making up a single step in the ladder of DNA’s twisted double helix. Human DNA contains more than 3 billion of these base pairs, arranged in discrete sequences that make up the approximately 19,000 genes that carry the codes required to make and direct a life. (Genes make up only part of DNA, which also contains large amounts of noncoding genetic material.) DNA strands, in turn, are packed into 23 pairs of chromosomes residing in the nuclei of human cells.

Evolution itself is driven by mutations to this DNA. As cells are copied, mistakes may result in permanent changes, and environmental factors can also modify the code. Mutations may alter a single nucleotide, in which case they are known as single nucleotide polymorphisms, or SNPs, or they may alter a larger piece of the chromosome.

Next-generation sequencing can see the individual nucleotide letters in DNA, which means that it can also pick up on these tiny SNP mutations. Researchers had long theorized that the DNA of identical twins might harbor such tiny differences, rare SNP mutations that occurred either in the womb, after a fertilized egg split into two identical embryos, or perhaps even after birth. But until a couple of years ago, no one had been able to identify such mutations.
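To make the idea of a single-letter difference concrete, here is a minimal Python sketch that compares two already-aligned sequences position by position. The sequences are invented, and the comparison ignores everything that makes the real task hard, such as sequencing errors and the need to read each position many times before trusting a difference.

```python
def snp_differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """List (position, letter_in_a, letter_in_b) wherever two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

# Hypothetical 12-letter stretches from two twins; real comparisons span
# millions of positions.
twin_a = "ACGTTGCAAGCT"
twin_b = "ACGTTGCAGGCT"

differences = snp_differences(twin_a, twin_b)  # [(8, 'A', 'G')]
```

A single differing position, like the one at index 8 here, is the kind of signal that could in principle tell one twin's DNA from the other's.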

In April 2014, as assistant district attorney David A. Deakin prepared to meet a Commonwealth of Massachusetts DNA expert about J.G.’s case, he learned about a breakthrough by Eurofins, a German genomics testing firm, that let scientists detect such genetic differences between identical twins. Eurofins had sequenced the genomes of a pair of identical twins from their sperm samples using “ultra deep” next-generation sequencing tools. Researchers found five SNPs, among the roughly 10 million nucleotides they identified, that differed between the two. The researchers laid out their methodology in a paper published in March 2014 in Forensic Science International: Genetics. Deakin got Eurofins to agree to conduct testing in J.G.’s case, and then petitioned the court to delay an upcoming trial so that the evidence could be submitted. A hearing to determine whether the Eurofins evidence is admissible is still pending.


DNA ANALYSIS HAS LONG been considered the gold standard in forensics—a field that has come under fire on other fronts for lacking scientific rigor. Many approaches pioneered by forensic scientists—hair analysis, fingerprinting, techniques of arson investigation—have been found to rely on methodologies that are highly subject to bias if not totally fraudulent. Last year, for instance, the FBI admitted that in more than 95% of cases of hair analysis, analysts overstated their findings to favor prosecutors.

DNA forensics, too, has taken heat from critics who say that current STR methods are vulnerable to bias and guesswork. But when high-quality samples of genetic material are available—and when investigators are very certain about how the DNA came to be at the crime scene—STR analysis has an objectively high rate of accuracy. For example, if a sample of DNA is retrieved from a trail of blood leading away from the scene or from a weapon that was clearly used against the victim, testing can be conclusive, with only about one chance in a billion of a sample matching the DNA of someone who wasn’t there.
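A figure like one in a billion typically comes from multiplying together how common the observed genotype is at each tested region, on the assumption that the regions are inherited independently. The Python sketch below shows only that arithmetic. The per-region frequencies are made up for illustration and do not come from any real population database.

```python
from math import prod

def random_match_probability(genotype_frequencies: list[float]) -> float:
    """Multiply per-region genotype frequencies, assuming the regions are independent."""
    return prod(genotype_frequencies)

# Hypothetical: the observed genotype is shared by 1 person in 10 at each of
# nine independent regions.
hypothetical_frequencies = [0.1] * 9

rmp = random_match_probability(hypothetical_frequencies)
one_in = round(1 / rmp)  # roughly one in a billion
```

The same arithmetic shows why weak samples are a problem: every region that fails to yield a result drops out of the product, and the chance of a coincidental match rises accordingly.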

Crime scene DNA is typically compared to that of suspects in the case or to the more than 14 million DNA profiles, collected since the 1990s, that make up a federal database known as the Combined DNA Index System, or CODIS. Those profiles include people who have been convicted of certain crimes as well as some who have been arrested. All 50 states require convicted felons to provide DNA samples, and 29 of them also require DNA collection upon arrest for certain felonies.

The problem is that crime scene DNA often contains complex mixtures, which can occur when there’s more than one perpetrator, or when trace amounts of so-called touch DNA—samples that may contain genetic material deposited days before the crime occurred—are recovered from the scene of the crime.

So while it works well in many cases, this use of STR analysis has inherent shortcomings. The method can’t differentiate between identical twins. And it has limitations for extracting information from DNA samples that are small, degraded or mixed—all common problems in forensics. With partial and mixed samples, it’s difficult to retrieve enough data for a high-quality STR profile. Weak samples also increase the chance that someone will be identified in error. (The work of the Innocence Project, which aims to free people who have been wrongly convicted, has highlighted the prevalence of such problems.)

Indeed, tens of thousands of forensic cases are currently hamstrung because of mixtures of DNA that can’t be separated with the most widely used technologies, according to Ted Hunt, chief trial attorney and DNA cold case project director in the Jackson County Prosecutor’s Office in Kansas City, Mo., and a member of the U.S. Department of Justice’s National Commission on Forensic Science. “That’s the problem today—how to get more and more information from less and less genetic material.”

AS PART OF THE SOLUTION, next-generation sequencing can sometimes work wonders. “You are able to look at discrete chains of nucleotides from different contributors in low-level and degraded DNA and separate them,” Hunt says. For example, new NGS applications from Illumina, a biotechnology firm in San Diego that is a powerhouse in next-generation sequencing research, can work out the relative concentrations of DNA from different contributors—30% from person A, 20% from person B and 50% from person C. That lets technicians tease out distinct individual DNA profiles. The California Department of Justice and the Department of Defense are now evaluating Illumina’s technology for use, says Hunt.
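The arithmetic behind such a breakdown can be sketched in a few lines of Python. This is not Illumina's method, which the article does not describe. It assumes the hard part is already done, namely that each sequencing read has been attributed to one contributor, and the read counts are invented to reproduce the 30/20/50 split above.

```python
def contributor_fractions(read_counts: dict[str, int]) -> dict[str, float]:
    """Convert per-contributor read counts into each contributor's share of the mixture."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads to apportion")
    return {name: count / total for name, count in read_counts.items()}

# Hypothetical counts of reads carrying sequence unique to each contributor.
reads = {"person_A": 300, "person_B": 200, "person_C": 500}

fractions = contributor_fractions(reads)
# {'person_A': 0.3, 'person_B': 0.2, 'person_C': 0.5}
```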

NGS technologies can also help investigators get more meaningful results from small or damaged DNA samples. Unlike STR analysis, which is typically limited to examining 20 to 25 regions of the DNA molecule, the NGS technology can look at more than 200 regions in a single pass, and potentially many more. That helps in getting more information from a limited or compromised sample, according to Richard Guerrieri, forensic DNA research leader at Battelle, a large nonprofit research and development organization that manages several U.S. government laboratories. Guerrieri has been in the field of forensic science since 1981 and spent almost two decades doing forensic DNA work with the FBI.


Yet another potential benefit of NGS comes when investigators find no matches for the DNA at a crime scene. Until recently, if a CODIS database search turned up nothing, that was pretty much that. But now it is sometimes possible to create a kind of digital “wanted” poster with NGS by extracting information from the crime scene sample that points to a suspect’s phenotype—skin, hair and eye color, ancestry, shape of face. In this fairly new area of genetic research, scientists seek to identify specific locations on the genome that are associated with particular three-dimensional approximations of facial features, such as the tip of the nose or the distance between the eyes.

One pioneer is Parabon Nanolabs, which has created a program called Snapshot that can create a visual profile of a suspect from crime scene DNA. Using a database of phenotypes and genotypes collected from more than 10,000 people around the globe, Parabon has identified the genetic markers that correspond to certain physical traits common to people of a particular ancestry.

“There are cases where investigators might have a suspect list that’s incredibly long—hundreds of people,” says Steven Armentrout, CEO of Parabon. “It might be everyone who used a certain cell phone tower during a period of time. With Snapshot, we might be able to say, ‘Look, this person has blue eyes’ with 80% confidence. We might be able to say with 99% confidence, ‘They don’t have black eyes, brown eyes or hazel eyes.’ As investigators look at a list of suspects, this gives them very strong exclusionary information that can help them prioritize their investigation.”

Police departments are already using Snapshot reports to rule out some potential suspects and search for others, and cases in which the technology has been used are making their way through the court system. But currently the technology is used only as an investigative tool, for narrowing suspect lists or generating new leads.

Perhaps the most fascinating area in which NGS DNA analysis could help is the “typing” of tissue or body fluids. Knowing the source of DNA swabbed from a crime scene could be used to uncover not just the who, but the what and the when of crimes. Being able to differentiate between menstrual blood and blood from trauma, for example, could help reconstruct what happened in sexual assault cases. Or it could help detectives determine that the DNA recovered from a victim’s shoulder came not from saliva from a bite mark but rather from skin cells left when a friendly neighbor hugged her some time well before a crime occurred, says Peter Gunn, professor at the Centre for Forensic Science at the University of Technology Sydney in Australia. A paper he wrote, in the March 2014 issue of the journal Frontiers in Genetics, examines how advances in molecular biology will expand the kinds of questions forensic investigators can ask of biological material.

To perform tissue typing, most forensic labs currently use tests that look for enzymes or antibodies associated with certain tissue types. But those results aren’t always definitive and sometimes can’t identify saliva, skin, vaginal secretions or menstrual blood. NGS technology, in contrast, can quantify tissue-specific genes in biological stains that are up to two years old.


AS IS OFTEN THE CASE with frontier technologies, the military has led the way on NGS. The very first forensic lab in the United States to use an NGS protocol (an established, scientifically validated procedural design) was the Armed Forces Medical Examiner System’s Armed Forces DNA Identification Laboratory (AFMES-AFDIL) in Dover, Del. In 2011, according to Timothy McMahon, deputy director of forensic sciences for AFMES-AFDIL, the lab invested in new NGS instruments in an attempt to advance a decades-old project: DNA-testing the remains of hundreds of American soldiers killed and unaccounted for in the Korean War.

The bodies of the service members, returned to the United States in the early 1950s, had been interred at the National Memorial Cemetery of the Pacific in Hawaii. AFDIL later recovered the remains and was able to identify many of them, but 859 could not be matched to DNA profiles supplied by relatives of the missing. It didn’t help that the bodies had been preserved in formaldehyde, which is highly destructive to DNA.

NGS succeeded, however, in doing what had been presumed impossible. “This is allowing us to make identifications in which the DNA has been chemically treated and highly damaged,” says McMahon. Related techniques could likely be applied to missing persons cases and humanitarian efforts to identify victims of mass disasters, such as tsunamis, in which recovered samples of DNA are typically degraded and difficult to analyze.

Despite NGS’s promise, it will take a few years before the broader forensics industry and the court system are ready for it. “When you introduce a change, a lot of T’s have to be crossed before you can implement it,” says McMahon. “There’s training, learning the technology, understanding the limits of the tests, acquiring new instruments, redesigning your laboratory and managing the information.” The FBI and other organizations that accredit laboratories will also require them to validate the techniques they use and show a high level of proficiency.

But government and law enforcement agencies are making an effort to push NGS forward. AFMES-AFDIL is one of several institutions participating in a National Institute of Justice–funded program to evaluate how NGS might be used in forensic laboratories. AFMES-AFDIL was asked by Battelle to partner with the National Institute of Standards and Technology; the Bureau of Alcohol, Tobacco, Firearms and Explosives; the California Department of Justice; a Texas county lab; and the Office of the Chief Medical Examiner of the City of New York.

And J.G.’s case should speed the pace, giving investigators and prosecutors greater confidence that NGS tools can solve tough forensic problems and convincing courts and judges that the science is sound. “There are untold thousands of cases across this country that currently cannot be solved with conventional DNA technology,” says Hunt. “They are waiting for something like next-generation sequencing to come along.”