Five years after an interlaboratory study showed that more than a hundred forensic laboratories could not agree on the answer to a single forensic problem, and that a majority of them mistakenly implicated an innocent person in a hypothetical felony, the federal agency that produced the findings has officially published them in a peer-reviewed journal.

“NIST Interlaboratory Studies Involving DNA Mixtures (MIX05 and MIX13): Variation Observed and Lessons Learned” was published open-access online by Forensic Science International: Genetics this week, potentially making the problems with DNA mixture interpretation easier for defense experts, and even prosecutors, to cite.

NIST’s scientists write that the MIX13 results in particular have already pushed the DNA forensics community toward probabilistic genotyping software such as TrueAllele and STRmix. But the peer-reviewed publication should also highlight the need for further improvements, the agency added in a statement.

The publication comes after critics blasted the agency’s delay in publishing the paper, as reported by Forensic Magazine in April. The critics, including Greg Hampikian of Boise State University, have argued that although MIX13 may have been generally known in the forensic community, the results were not readily admitted in some courtrooms: because the PowerPoint slides had never appeared in a peer-reviewed scientific journal, some judges discounted them as evidence at trial.

An annotated chart included in a PowerPoint about Case 5 of NIST's MIX13 study. (Credit: Courtesy of NIST)

Hampikian and others pointed to court cases that, they argue, could have been affected had the MIX05 and MIX13 studies been published over the last decade.

“Shouldn’t this have been an urgent matter, when it was discovered that the vast majority of laboratories are trying to answer questions that they will answer incorrectly?” said Hampikian.

The new paper presents the results of the two interlaboratory studies, which NIST says were previously available on the agency’s website.

MIX05, undertaken in 2005, asked 69 laboratories to interpret DNA data from two-person mixtures drawn from four hypothetical sexual assaults.

MIX13, in 2013, posed five increasingly difficult mock crimes involving up to four contributors and related persons of interest to 108 laboratories.

“Case Five” from MIX13 was the flash point for most critics. A ski mask left behind after a mock bank robbery carried a touch DNA mixture from four people, but because of its complexity, it initially appeared to be a mixture of only two. The labs were given profiles for two of the four likely contributors, along with a fifth person.

But that fifth person was not in the mixture and had never touched the ski mask. The four-person mixture, with equal amounts of genetic material from each contributor, was deliberately made difficult. Seventy-four of the 108 laboratories got it wrong by including the fifth person in their interpretation. Most were using the combined probability of inclusion, or CPI, an FBI-approved statistic for interpreting mixtures.
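The underlying arithmetic helps explain why CPI can sweep in a non-contributor. At each locus, CPI squares the summed population frequencies of all alleles detected in the mixture, then multiplies across loci; a person is "included" if every one of their alleles appears in the mixture. A minimal sketch, using two hypothetical loci with made-up allele frequencies (not real population data or the actual MIX13 profiles), shows how a four-person mixture can display so many alleles that a random person's genotype fits entirely inside it:

```python
from functools import reduce

# Hypothetical allele-frequency tables: locus -> {allele: population frequency}.
# These values are illustrative only, not real population statistics.
FREQS = {
    "D8S1179": {"12": 0.15, "13": 0.30, "14": 0.20, "15": 0.10, "16": 0.05},
    "D21S11":  {"28": 0.16, "29": 0.22, "30": 0.25, "31": 0.07, "32": 0.09},
}

# Alleles observed in a hypothetical four-person mixture: many alleles per locus.
MIXTURE = {
    "D8S1179": {"12", "13", "14", "15"},
    "D21S11":  {"28", "29", "30", "31"},
}

def cpi(mixture, freqs):
    """CPI: product over loci of (sum of observed-allele frequencies) squared."""
    per_locus = []
    for locus, alleles in mixture.items():
        p = sum(freqs[locus][a] for a in alleles)
        per_locus.append(p * p)
    return reduce(lambda x, y: x * y, per_locus, 1.0)

def is_included(genotype, mixture):
    """Under CPI, a person is 'included' if all of their alleles appear
    among the mixture's alleles at every tested locus."""
    return all(set(genotype[locus]) <= mixture[locus] for locus in mixture)

# A person who never touched the item, but whose (common) alleles all
# happen to be present in the crowded mixture:
non_contributor = {"D8S1179": ("13", "14"), "D21S11": ("29", "30")}
print(is_included(non_contributor, MIXTURE))   # True: falsely included
print(round(cpi(MIXTURE, FREQS), 4))           # 0.2756
```

With only two loci the toy CPI is 0.2756, i.e. more than a quarter of the population would be "included"; real casework uses many more loci, but the same mechanism operates at each one, which is why a crowded mixture inflates the pool of people CPI cannot exclude.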

Twenty-three labs deemed the results “inconclusive” for all three suspects. Three more labs found them inconclusive for the fifth, innocent person, while still correctly including the other two.

Only seven laboratories got the hard problem entirely “right” by correctly excluding the fifth, innocent person from the four-person mixture. But even their reasons differed. Four of the laboratories cited a missing allele at a key locus. Two more, using data from Identifiler Plus (a Thermo Fisher PCR amplification kit), showed that the fifth person’s profile could not fit.

A PowerPoint slide showing results from the MIX13 study conducted by NIST researchers in 2013. (Credit: Courtesy of NIST)

John Butler, the special assistant to the director for forensic science in NIST’s Special Programs Office, told Forensic Magazine in an interview in April that “Case Five” and the other mock case histories were not only a way to gauge how labs were handling their mixtures; they were also meant to provide a “teaching moment.”

“The mixture itself was designed to not show too many alleles,” Butler said. “People would be tricked into thinking there are only two or three people there, instead of the four people that were really there.

“The way that it was designed was on purpose, to kind of help people realize that CPI can falsely include people—that was its purpose,” he added. “And it demonstrated that really nicely.”

One cannot draw real-world conclusions from Case Five, Butler added in the interview earlier this year.

“We asked specific questions of labs, and part of it was a teaching moment,” he added. “It wasn’t to say, here’s your error rate—because that’s not what the purpose of that was. This was a teaching moment to realize you can falsely include somebody with CPI.”

Hampikian and the other critics took it differently, however.

“Out of the (labs) that got this wrong, don’t you think some of them over the past five years—or even before—were doing the same things with actual casework?” said Hampikian. “Is there any reason to believe they were not doing the same thing with casework?"

NIST writes that the study, although it does not establish an error rate, has done its job in alerting the forensic community to shortcomings in many laboratories’ systems, and particularly in the CPI mixture-interpretation technique.

“The interlaboratory studies described in this paper were conceived and conducted with the goal of better understanding the ‘lay of the land’ regarding analysis of DNA mixtures at the time,” they write. “Findings from both studies have brought awareness of difference in approaches to DNA mixture interpretation and have highlighted the need for improved training and validation, which have hopefully led to improved protocols over the years.”