What’s Next for Probabilistic Genotyping Software?


 

In less than a decade, probabilistic genotyping (PG) software has gone from a new technology to the current, accepted standard for DNA mixture interpretation.

PG software allows forensic labs to use significantly more of the available DNA profile to determine whether a person of interest might have contributed to evidence discovered at a crime scene. As a result, samples previously regarded as too complex or degraded are now being interpreted and used in court proceedings.

That’s not to say that prior methods of DNA interpretation didn’t work, particularly for single-source or two-person mixtures. But applying fixed stochastic thresholds and other biological parameters to manually analyze DNA samples made interpretation slow and tedious. Moreover, the subjectivity inherent in interpreting more challenging profiles raised concerns about inconsistency between laboratories.

By comparison, PG software is extremely fast, assessing thousands of proposed profiles almost instantaneously. As a result, numerous forensic labs have made PG software their preferred interpretation method. By running DNA test data through a wide range of probability models and using more of the DNA profile to assign a likelihood ratio (LR), which weighs the evidence under competing propositions rather than attributing a match to coincidence, PG software has been used to successfully resolve more than 300,000 cases worldwide. It has proven particularly effective in contributing to the resolution of violent crime and sexual assault cases, as well as cold cases in which low-grade or mixture evidence originally dismissed as inconclusive can now be reexamined.
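For readers unfamiliar with the LR, a minimal sketch of the simplest possible case may help. It assumes a clean single-source profile that matches the person of interest, Hardy-Weinberg genotype frequencies, and no co-ancestry correction. It is not how any PG software package computes an LR for a mixture; the function name and the allele frequencies are hypothetical and chosen only for illustration.

```python
from math import prod


def single_source_lr(p, q=None):
    """LR at one locus for a single-source profile matching the person of interest.

    Numerator: probability of the evidence if the person of interest is the
    source (taken as 1 in this simplified case).
    Denominator: probability of the evidence if an unknown, unrelated person
    is the source, i.e. the genotype frequency in the population.
    """
    if q is None:
        genotype_freq = p * p      # homozygous genotype: p^2
    else:
        genotype_freq = 2 * p * q  # heterozygous genotype: 2pq
    return 1.0 / genotype_freq


# Hypothetical allele frequencies at three loci (illustrative only).
# A tuple of (p, q) is a heterozygote; (p, None) is a homozygote.
loci = [(0.10, 0.20), (0.25, None), (0.05, 0.10)]

# Assuming the loci are independent, the per-locus LRs multiply.
per_locus = [single_source_lr(p, q) for p, q in loci]
combined_lr = prod(per_locus)

print([round(lr, 2) for lr in per_locus])
print(round(combined_lr))
```

With these made-up frequencies, the evidence is about 40,000 times more probable if the person of interest is the source than if an unknown, unrelated person is. PG software extends the same ratio to mixed, low-level, and degraded profiles, where neither probability can be worked out by hand.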

Given that kind of track record, the obvious question is what’s next? In the near term, it is safe to assume that more forensic labs will make PG software part of their DNA interpretation toolkit, although in truth, actual deployment has been slower than anticipated in some labs. This likely has less to do with the software itself and more to do with the heavy caseload and intense pace of work many labs are experiencing, both of which make it difficult to find time for the required installation, internal validation, training, and implementation.

Training can be particularly time-consuming. Forensic analysts must be well-versed in the principles and practices of the software being deployed, in formulating meaningful LR propositions, and in reading the diagnostics produced by the software. They must also be trained to interpret DNA evidence accurately and to convey the intricacies of that evidence in legal proceedings.

That ability to discuss PG software’s complexities in everyday language with judges, prosecutors, defense attorneys, and jurors is particularly important. While PG software has held up well against numerous legal challenges, it is safe to assume such challenges will continue, particularly in light of lingering misconceptions about how DNA interpretations are made.

Attorneys, for example, have argued that PG software offers a “black box” approach to DNA analysis, capable of convicting people on the basis of a secret computer code about which little is known. They contend the software does what it has been told to do, influenced by any bugs or biases programmed into it by its less-than-perfect human developers.

PG software has also been mischaracterized as artificial intelligence (AI). A recent report by the Law Commission of Ontario, for example, defines PG software as “the use of AI algorithms to analyze DNA samples collected in police investigations or criminal prosecutions.” That same study questions whether “AI-driven technologies like PG” can meet the high standards of due process, accountability, and transparency demanded by the Canadian Charter of Rights and Freedoms.

Both assertions are wrong and have been refuted by the developers of PG software, as well as the forensic analysts using it, who argue that PG software is not machine learning and makes no decisions on its own. The biological models and mathematical processes of many PG software packages have been published and are publicly accessible. They are available for interrogation by defense experts or other interested parties.

Perhaps such questioning is the nature of anything new. Since short tandem repeat (STR) DNA testing became the mainstay of forensic laboratories, new developments in the analytical phase of DNA testing—among them, improvements to DNA extraction chemistries, the accuracy of quantification kits, and the sensitivity of capillary electrophoresis instruments—have been regularly met by challenges in the scientific community and the judicial system.

Experience also suggests that PG software will continue to be improved. The next logical step is a continuum that covers the full workflow, from analysis to interpretation and database matching. Given forensic labs’ pressing need to reduce backlogs by doing more analyses faster and more accurately, that step is likely to involve applications that simplify profile analysis and enable extremely fast database searches and LR calculations.

At least one PG software package has already introduced an application that expedites the analysis of raw data generated by genetic analyzers and standard profiling kits by combining an intuitive, user-friendly graphical interface with easily understandable and laboratory-customizable rules. In addition to rapid DNA profile analysis, this application can accurately assign an estimate of the number of contributors and is fully integrated with its sister PG software. 

A second commercial application uses efficient algorithms to get more value from DNA evidence by visualizing the value of DNA mixture evidence, undertaking mixture-to-mixture comparisons, and calculating millions of LRs in seconds. Users can also assign LRs for paternity and kinship scenarios, including complex pedigrees involving incest. Recent improvements to the kinship functionality allow the user to consider mutation, co-ancestry, and linkage. It also provides increased accessibility since it can run on a user’s PC.

All of this brings us back to the basic question: what’s next? Undoubtedly, challenges will remain. Beyond anticipated legal challenges, reports such as a recent National Institute of Standards and Technology review will continue to insist that “there is not enough publicly available data to enable an external and independent assessment of the degree of reliability of DNA mixture interpretation practices, including the use of probabilistic genotyping software systems.” 

While challenges will continue, so too will improvements designed to increase the speed and simplicity of analyses, and the accuracy and reliability of the results those analyses generate. As the forensic community embraces new analytical techniques, such as massively parallel sequencing (also referred to as “next generation sequencing”), PG software developers will need to formulate suitable models for interpreting such data. PG software for the interpretation of Y-STR (male specific) profiles, which are routinely developed during sexual assault investigations, would also be a welcome advancement.

As more forensic labs use PG software and the software itself continues to be refined, it undoubtedly will have an even greater impact on criminal and civil investigations, providing well-founded data from a broader range of DNA evidence. To make certain that progress continues, it will be incumbent on developers to fine-tune their work while addressing those issues that arise and have merit. For their part, forensic labs need to ensure that their scientists receive extensive, ongoing training and that their software is properly validated, while implementing effective protocols to present the strength of PG results accurately and effectively.

 About the author: Zane Kerr is a senior scientist with the STRmix team of ESR, New Zealand, and a senior scientific officer with the Forensic and Analytical Science Service in Sydney, Australia. Kerr has a wealth of casework experience in the fields of forensic biology and DNA analysis, including more than 10 years of experience as a court reporting officer. He also has a keen interest in training and education, and has been instrumental in expanding the eLearning course catalogue offered by the STRmix team.

 
