DNA Matters: Why Forensic Genotypes are Probabilistic

DNA Matters is Forensic's newest column, which will discuss cases that have been aided by the power of computer software in DNA analysis. It is authored by Dr. Mark Perlin, M.D., Ph.D., chief scientist, executive and founder at Cybergenetics. Twenty years ago, Perlin invented TrueAllele probabilistic genotyping for automated human identification from DNA mixtures. His company helped identify victim remains in the World Trade Center disaster, and has helped exonerate 10 innocent men. He is a Scholar in Residence at Duquesne University’s Forensic Science and Law program, and a Fellow of the American Academy of Forensic Sciences. 

On June 15, 2011, a woman in Elmira, New York awoke at 4 am to find a knife-wielding man in her apartment. The intruder forced her onto her bed and then raped her. From his voice and height, she recognized the rapist as Casey Wilson.

On Sept. 29, 2013, another Elmira woman was awakened in her apartment at 7 am by a man with a knife. He grabbed her face from behind, forced her upstairs, and raped her. She also recognized him—by his voice and height—as Casey Wilson. 

Police recovered a pair of purple gloves from a park near the crime scene, consistent with video footage of a masked cyclist leaving the scene. The New York State Police (NYSP) crime laboratory swabbed the insides of the purple gloves. They found DNA mixtures of three people. The DNA on the purple gloves came from the victim, her boyfriend and rapist Casey Wilson. The genotypes of all three people were present on each purple glove. But forensic DNA science would be needed to reveal those identities. 

The people on the gloves left very small quantities of their DNA. To detect this minuscule DNA amount, the crime lab had to amplify the glove genotypes. They used polymerase chain reaction (PCR) at 15 short tandem repeat (STR) genetic loci. The PCR lab process amplified the alleles in the genotypes of Wilson and the others. From a few dozen Wilson cells in the lab’s test tubes, PCR synthesized his alleles into billions of fluorescently labeled DNA fragments.

But PCR is not always a faithful DNA amplifier. At each copying step, there is a chance that DNA synthesis may fail. PCR is a random copying process that injects variation into the number of alleles detected on an electropherogram (EPG) readout. The proportions of EPG alleles differed from the DNA amounts Wilson and the others had left on the purple gloves. This artificial PCR allele distortion is why STR genotyping data are “probabilistic.”
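The random copying described above can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions, not a model of any laboratory protocol or of TrueAllele: each DNA molecule is copied with a fixed probability per cycle, and the starting copy numbers, cycle count, and copying efficiency are hypothetical values chosen so the example runs quickly. The function names `pcr_amplify` and `minor_fraction` are made up for this illustration.

```python
import random


def pcr_amplify(start_copies, cycles, efficiency, rng):
    """Toy PCR: in every cycle, each molecule is copied with probability `efficiency`."""
    copies = start_copies
    for _ in range(cycles):
        # Count how many of the current molecules were successfully copied.
        copies += sum(1 for _ in range(copies) if rng.random() < efficiency)
    return copies


def minor_fraction(minor_start, major_start, cycles, efficiency, rng):
    """Amplify two templates independently and return the minor share afterward."""
    minor = pcr_amplify(minor_start, cycles, efficiency, rng)
    major = pcr_amplify(major_start, cycles, efficiency, rng)
    return minor / (minor + major)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical input: 3 minor copies and 17 major copies (a 15% minor share).
true_fraction = 3 / (3 + 17)
fractions = [minor_fraction(3, 17, 10, 0.8, rng) for _ in range(20)]

print(f"true minor fraction:     {true_fraction:.2f}")
print(f"observed minor fraction: {min(fractions):.2f} to {max(fractions):.2f}")
```

Every replicate starts from the same input, yet the amplified proportions differ from run to run. Most of the spread arises in the earliest cycles, when only a handful of molecules are present and each failed copy matters.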

Accurate genotyping software figures out how much the PCR distorts an evidence item’s DNA. From the EPG data, the computer calculates a statistical variance that describes the allele distortion. It then uses this computed variance in its probability model to quantify genotype uncertainty. With more PCR variation, there is more uncertainty. Having more genotypes that can explain the STR data means less identification information, as measured by lower DNA likelihood ratio (LR) match statistics [1].
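The link between genotype uncertainty and the LR can be shown with a small worked example. The sketch below uses the textbook definition of a likelihood ratio at a single locus: the probability of the data if the reference genotype is the contributor, divided by the probability of the data if a random person from the population is the contributor. It is not the TrueAllele model. The genotype labels, population frequencies, and data likelihoods are all hypothetical numbers invented for illustration.

```python
import math


def likelihood_ratio(likelihoods, priors, reference):
    """LR = P(data | reference genotype) / P(data | genotype of a random person)."""
    # Denominator: average the data likelihood over the population genotype frequencies.
    random_person = sum(likelihoods[g] * priors[g] for g in priors)
    return likelihoods[reference] / random_person


# Hypothetical population frequencies for three genotypes at one locus.
priors = {"A,A": 0.01, "A,B": 0.18, "B,B": 0.81}

# Hypothetical data likelihoods P(data | genotype).
# Low PCR variation: essentially one genotype explains the data.
sharp = {"A,A": 0.98, "A,B": 0.01, "B,B": 0.01}
# High PCR variation: several genotypes explain the data about as well.
diffuse = {"A,A": 0.50, "A,B": 0.30, "B,B": 0.20}

lr_sharp = likelihood_ratio(sharp, priors, "A,A")
lr_diffuse = likelihood_ratio(diffuse, priors, "A,A")

print(f"low uncertainty:  LR = {lr_sharp:.1f}, log10(LR) = {math.log10(lr_sharp):.2f}")
print(f"high uncertainty: LR = {lr_diffuse:.1f}, log10(LR) = {math.log10(lr_diffuse):.2f}")
```

When several genotypes explain the data almost equally well, the reference genotype stands out less from a random person's, and the LR falls toward 1. Under the usual assumption that loci are independent, per-locus LRs multiply, so their logarithms add.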

Cybergenetics TrueAllele computing unmixed the purple glove DNA mixtures into probabilistic genotypes. Comparing Wilson’s genotypes from the gloves with his known reference, the computer connected him to the crime. The LR match statistics were 817 thousand for the left glove, and 31.3 million for the right glove. The computer identified suspect Wilson as a 15% minor contributor to the DNA mixtures.

The computer’s LR match statistics also placed the victim and her boyfriend’s DNA on the gloves. Identifying those three people in each mixture made the abandoned gloves probative evidence. If Wilson was not the rapist, how did the victims’ DNA get on his gloves?

I testified about the DNA match statistics before a Chemung County grand jury on Dec. 19, 2013, and again at Wilson's trial on Sept. 11, 2014. Wilson was found guilty of two counts each of first-degree rape, first-degree burglary, and other charges. He was sentenced to 25 years in prison for the 2011 and 2013 crimes, and 15 years for burglary.

On March 26, 2019, the New York trial court held a rare post-trial Frye hearing on the admissibility of TrueAllele evidence in the Wilson matter. The court found that “the TrueAllele Casework system, as of 2013, was reliable and generally accepted within the relevant scientific community.”  

Older human review methods cannot handle DNA mixtures [2]. They usually give inaccurate match statistics, or no answer at all. For the left purple glove in this serial rape case, the crime lab reported that “due to insufficient genetic information, no comparisons were made to the minor contributors to this profile.” And, for the right glove, “due to the complexity of the genetic information, no comparisons were made to this profile.” The forensic lab found no information.

Limited manual interpretation has failed crime labs in hundreds of thousands of DNA mixtures. The information loss is just the tip of the DNA iceberg in America. Powerful evidence that could implicate the guilty, or exonerate the innocent [3], sits unused in crime lab data files. Perhaps, one day, automated probabilistic genotyping may shed light on these old cases, with computers bringing better justice through better science.

References

1. Perlin, M.W., Legler, M.M., Spencer, C.E., Smith, J.L., Allan, W.P., Belrose, J.L., and Duceman, B.W. Validating TrueAllele® DNA mixture interpretation. Journal of Forensic Sciences, 56(6):1430-47, 2011.
2. Perlin, M.W. When DNA is not a gold standard: failing to interpret mixture evidence. The Champion, 42(4):50-56, May 2018.
3. Perlin, M.W. Hidden DNA evidence: exonerating the innocent. Forensic Magazine, 15(1):10-12, 2018.
