Global DNA Databases Expand Loci Sets to Optimize Capabilities
Nearly 18 years have passed since the United Kingdom’s Forensic Science Service announced it had started the world’s first nationwide DNA database of criminal offenders. The UK database program became an instant crime fighting success and by the turn of the millennium, the United States, Australia, New Zealand, and multiple countries in Western Europe had also moved forward with establishing national DNA database programs.
Like the United Kingdom, most of these early adopters built their programs primarily to solve crimes within their own borders, using limited loci and with limited focus on international loci interoperability. To compound this, they designed their DNA databases to search relatively small subsets of the criminal populations, which negated the need for larger loci sets.
As these countries implemented their programs, forensic scientists and police officials began to question why crime scene samples were not also being searched internationally, especially in countries within the same region. Several studies have shown that cross-comparing international databases significantly increases hit rates in certain circumstances. However, the lack of international loci standards among the early adopters made international sharing onerous, as most of the programs were not compatible.
The European Union, where criminals were able to move freely across borders to commit crimes, was the first governmental body to try to resolve this issue. In 2005, the Prüm Treaty was signed, which eventually resulted in the establishment of infrastructure to support an EU-wide network through which member countries could exchange data obtained by law enforcement officers, including DNA. In tandem with this effort, the European Network of Forensic Science Institutes (ENFSI) recommended that member countries use an expanded set of loci for DNA testing to minimize adventitious matches [1,2].
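The link between loci count and adventitious matches can be made concrete with a little arithmetic. The sketch below is only illustrative: it assumes the loci are statistically independent, uses a hypothetical per-locus match probability of 0.1 rather than real allele-frequency data, and borrows the article's figure of roughly 40 million profiles as the database size. The function names are made up for this example.

```python
import math


def random_match_probability(per_locus_match_probs):
    """Chance that two unrelated people share a full profile,
    assuming the loci are statistically independent."""
    return math.prod(per_locus_match_probs)


def expected_adventitious_matches(database_size, rmp):
    """Expected number of chance matches when one profile is
    searched against `database_size` unrelated profiles."""
    return database_size * rmp


PER_LOCUS_MATCH_PROB = 0.1    # hypothetical value, for illustration only
DATABASE_SIZE = 40_000_000    # approximate worldwide total cited in the article

for n_loci in (6, 10, 16):
    rmp = random_match_probability([PER_LOCUS_MATCH_PROB] * n_loci)
    expected = expected_adventitious_matches(DATABASE_SIZE, rmp)
    print(f"{n_loci} loci: RMP ~ {rmp:.1e}, "
          f"expected chance matches per search ~ {expected:.1e}")
```

Under these assumptions, a six-locus search against 40 million profiles would be expected to produce roughly 40 chance matches, while each additional locus cuts that expectation by a further factor of ten. Real per-locus probabilities vary by locus and population, so the numbers indicate the trend only.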
As the established countries continued to expand the tools needed to increase the utility of their DNA databases, additional countries pursuing national DNA database programs began to emerge in other regions, such as Eastern Europe, Asia, Latin America, and the Middle East. Fortunately, these new regions applied the lessons learned from the Prüm Treaty implementation and included larger loci sets as a requirement prior to establishing their programs. In many of these regions, representatives from DNA laboratories have worked together to determine loci standardization recommendations to ensure they create databases that can share information seamlessly within their region.
To date, 44 countries have implemented nationwide criminal offender DNA database programs (Figure 1). The number of offender samples throughout the world is now over 40 million and is expanding rapidly. Even with many countries utilizing well-established DNA databases, a significant increase in global samples is yet to come. It is estimated that by 2020, 75 countries will have national DNA database programs, and many of the existing programs will be expanding legislative mandates to collect DNA from all convicted offenders and arrestees [3]. As these expansions are made, worldwide offender samples could number in the hundreds of millions.
Figure 1: Countries with active nationwide offender DNA database programs (as of August, 2012). There are 44 countries worldwide, with over 40 million total profiles estimated to be uploaded. (Source: www.dnaresource.com)
The solution to these concerns is for the world’s DNA community to adopt large sets of loci that are shared throughout the world. Most recently, the FBI formed the CODIS Core Loci Working Group, which has published a set of recommendations to expand the STR loci set used in the United States (and many other countries) to include the world’s most commonly used loci (Figure 2). These recommendations include 20 loci that are required and three loci that are highly recommended for inclusion by kit manufacturers in a single next-generation multiplex [5,6].
Figure 2: Loci most commonly used in global forensic DNA databases, including the current CODIS set in the United States, which is also used by many other countries, as well as the set now used most commonly in Europe, and the proposed expanded CODIS set. Loci marked with an “*” are not currently required in the CODIS “core loci” but are commonly analyzed and uploaded as part of the most widely used multiplexes and are designated as “required” in the proposed expanded CODIS set. Loci marked with “**” are “highly recommended” but not required for inclusion by manufacturers according to the new CODIS recommendations.
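A short sketch shows why differing loci sets limit cross-database searching: two profiles can only be compared at the loci both databases type. The loci lists below are the 13 original CODIS core loci and the 12-locus European Standard Set as generally published, not a transcription of Figure 2. The function name and the minimum-overlap threshold are hypothetical and chosen only for illustration.

```python
# The 13 original CODIS core loci (as generally published).
CODIS_CORE_13 = frozenset({
    "CSF1PO", "FGA", "TH01", "TPOX", "vWA", "D3S1358", "D5S818",
    "D7S820", "D8S1179", "D13S317", "D16S539", "D18S51", "D21S11",
})

# The 12-locus European Standard Set (as generally published).
EUROPEAN_STANDARD_SET_12 = frozenset({
    "FGA", "TH01", "vWA", "D3S1358", "D8S1179", "D18S51", "D21S11",
    "D1S1656", "D2S441", "D10S1248", "D12S391", "D22S1045",
})


def shared_loci(set_a, set_b):
    """Loci typed by both databases, i.e. the only loci usable
    when comparing a profile from one against the other."""
    return sorted(set_a & set_b)


MIN_SHARED_FOR_SEARCH = 10  # hypothetical policy threshold

common = shared_loci(CODIS_CORE_13, EUROPEAN_STANDARD_SET_12)
print(f"{len(common)} shared loci: {', '.join(common)}")
print("Meets hypothetical threshold:", len(common) >= MIN_SHARED_FOR_SEARCH)
```

With these two sets only seven loci overlap, so a cross-database comparison has far less discriminating power than a search within either database. An expanded common set closes that gap.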
Technology Races to Meet the Needs of the Forensic DNA Community
The recent initiatives to expand international locus recommendations have crystallised the need for a new global multiplex system; however, the design of such a system, which must include all the necessary loci without sacrificing performance and processing capabilities, poses significant challenges.
The first challenge relates to size and spacing. The new CODIS recommendations specify 20 “required” markers and an additional three classed as “highly recommended” for inclusion in a single multiplex. Accommodating all 23 markers with the current maximum of five available dye labels would be impossible without significantly increasing the size of some of the amplicons. Technology such as mobility modifiers, which adjust the mobility of a locus and relocate it into a higher size bracket, can be useful to optimize spacing between loci; however, only a limited number of these molecules can be employed before the multiplex’s performance suffers, so they are not a viable option for significant repositioning of loci within the system.
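The size-and-spacing pressure can be sketched with rough arithmetic. The example below assumes that one dye in each system is reserved for the internal size standard (typical of capillary electrophoresis kits, but not stated in the text), and it uses a hypothetical average marker range of 45 bp and a hypothetical 10 bp gap between adjacent markers. Only the total of 23 loci comes from the recommendations described above.

```python
import math


def max_loci_per_channel(total_loci, dye_channels_for_loci):
    """Loci in the most crowded channel when loci are spread
    as evenly as possible across the available dye channels."""
    return math.ceil(total_loci / dye_channels_for_loci)


def required_span_bp(loci_in_channel, locus_width_bp, gap_bp):
    """Size window one channel needs: every marker range plus
    one gap between each pair of neighbouring markers."""
    return loci_in_channel * locus_width_bp + (loci_in_channel - 1) * gap_bp


TOTAL_LOCI = 23        # 20 required + 3 highly recommended
LOCUS_WIDTH_BP = 45    # hypothetical average marker range
GAP_BP = 10            # hypothetical spacing between adjacent markers

for total_dyes in (5, 6):
    channels = total_dyes - 1  # assumption: one dye carries the size standard
    crowded = max_loci_per_channel(TOTAL_LOCI, channels)
    span = required_span_bp(crowded, LOCUS_WIDTH_BP, GAP_BP)
    print(f"{total_dyes}-dye system: up to {crowded} loci per channel, "
          f"needing a span of about {span} bp")
```

Under these assumptions, the most crowded channel holds six loci in a 5-dye layout but five in a 6-dye layout, which shortens the size window that channel has to cover by roughly one marker range plus one gap.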
The only way to achieve major relocation of loci with the restrictions imposed by a 5-dye system would be to redesign the primers to physically lengthen the amplicons. Changing the primer sequences in this way has a number of disadvantages, however, including a loss of concordance with data generated using the original primer sequences and the vulnerability of longer amplicons to the effects of degradation.
An added complication that puts pressure on the 5-dye model comes from the desire expressed by the forensic community to widen the marker ranges for established loci to better reflect the allele distributions revealed through the extensive databasing efforts of recent years. Wider marker ranges, when supported by an increase in the number of alleles within the associated allelic ladder, would enhance genotyping efficiency with fewer alleles being designated as “off-ladder.” However, maintaining sufficient space between adjacent markers to facilitate clear and unambiguous genotyping in the majority of cases becomes increasingly difficult as each range is extended.
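The trade-off between wider marker ranges and clear spacing can be illustrated with a toy dye channel. The marker names and base-pair ranges below are hypothetical and do not describe any real kit. The sketch simply widens each range symmetrically and reports the smallest remaining gap between neighbours, where a negative gap means two ranges overlap.

```python
from typing import NamedTuple


class MarkerRange(NamedTuple):
    name: str
    start_bp: int
    end_bp: int


def widen(marker, extra_bp):
    """Extend a marker range by `extra_bp` on each side."""
    return MarkerRange(marker.name, marker.start_bp - extra_bp,
                       marker.end_bp + extra_bp)


def smallest_gap(markers):
    """Smallest distance between neighbouring marker ranges in one
    dye channel; a negative value means two ranges overlap."""
    if len(markers) < 2:
        raise ValueError("need at least two markers to measure a gap")
    ordered = sorted(markers, key=lambda m: m.start_bp)
    return min(nxt.start_bp - cur.end_bp
               for cur, nxt in zip(ordered, ordered[1:]))


# Hypothetical channel layout, for illustration only.
channel = [
    MarkerRange("LocusA", 100, 150),
    MarkerRange("LocusB", 165, 215),
    MarkerRange("LocusC", 230, 290),
]

for extra in (0, 5, 10):
    widened = [widen(m, extra) for m in channel]
    print(f"widened by {extra} bp per side: "
          f"smallest gap = {smallest_gap(widened)} bp")
```

In this toy layout a 15 bp gap shrinks to 5 bp after a modest extension and becomes an overlap after a larger one, which is the kind of ambiguity the spacing constraint is meant to prevent.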
The second challenge revolves around developing a system that meets the fundamentally different needs of single-source and casework evidence samples. Single-source samples, such as those collected for DNA databasing or as casework reference samples, are generally of a predictable nature in that they are collected on standardised substrates (usually paper or swabs), yield high quantity, high quality DNA, are generally processed close to the time of collection, and are collected from a single individual. Casework samples are in contrast often complex and unpredictable in nature, present on highly variable substrates, yield low quantity and/or low quality DNA, are aged or environmentally exposed, and can contain more than one contributor.
Added to this are the very different processing environments required for these types of samples. Single-source samples are generally received in higher numbers and as such, efficiency and amenability to automation of processing and analysis are important if laboratories are to maximize throughput while minimizing laboratory personnel requirements. For casework evidence samples, performance is paramount. The multiplex must be able to yield robust results for a wide range of inhibited and degraded samples.
To address these requirements and challenges, Life Technologies has developed a new, expanded STR multiplex configuration. To overcome the issue of size and spacing, the introduction of a novel, 6-dye chemistry enables all 23 of the loci listed in the CODIS recommendations to be incorporated into a single multiplex configuration. The availability of an additional dye channel together with the strategic use of mobility modifiers has enabled all 23 loci to be included with minimal adjustment to existing primer sequences, preserving amplicon length and maximizing concordance with existing datasets as much as possible (Figure 3).
Figure 3: Locus configuration of the GlobalFiler Kits. The colored bars indicate the mobility of the amplicons while the grey boxes indicate the physical amplicon length before the application of mobility modifiers for seven of the loci. For ease of interpretation, all three gender determination markers are located in the green dye channel. For performance on degraded samples, ten STR loci reside completely below 220 base pairs.
The only loci for which the core primer sequences have changed compared to current kits are TPOX and DYS391. In both cases, only the reverse primer has been re-engineered, repositioning the loci in the longer size range of the multiplex while minimizing any impact on concordance. The enlargement of the TPOX locus was considered an acceptable compromise given that this locus has been downgraded from a “required” to a “highly recommended” locus on the CODIS marker list. In addition, the relatively low discrimination power associated with this locus means that any failure to perform under degraded DNA conditions would have minimal impact on overall information recovery.
The DYS391 locus is included to provide gender determination redundancy for Amelogenin Y-null individuals. An obvious concern is that positioning this locus in the longer size range may result in drop-out under degraded DNA conditions. To compensate for this possibility, an additional Y-indel locus has been included below Amelogenin to ensure accurate gender assignment, even for degraded samples.
Over the next two years, as the United States and other countries around the world begin to adopt advanced STR multiplexes with expanded loci sets for forensic testing, a new era of DNA analysis will ensue. This should enable forensic analysts to achieve optimal results with all types of forensic samples and will allow more crimes to be solved, more quickly, with fewer adventitious database hits. Ultimately, this will enable the global law enforcement community to maximize the use of DNA as a crime-fighting tool and continue to reaffirm DNA as the gold standard and most effective forensic tool available to law enforcement.
- P. Gill, et al., The evolution of DNA databases—recommendations for new European STR loci, Forensic Sci. Int. 156 (2006) 242–244.
- P. Gill, et al., New multiplexes for Europe—amendments and clarification of strategic development, Forensic Sci. Int. 163 (2006) 155–157.
- DNA Resource.com (Gordon Thomas Honeywell Web site)
- C. Davis, et al., Variants observed for STR locus SE33: A concordance study, Forensic Sci. Int. Genet. 6 (2012) 494–497.
- D.R. Hares, Expanding the CODIS core loci in the United States, Forensic Sci. Int. Genet. (2011), doi:10.1016/j.fsigen.2011.04.01
- D.R. Hares, Addendum to expanding the CODIS core loci in the United States, Forensic Sci. Int. Genet. (2012), doi:10.1016/j.fsigen.2012.01.003
Tim Schellberg is the President of Gordon Thomas Honeywell Governmental Affairs. He is a global expert on forensic DNA policy, law, and legislation.
Nicola Oldroyd is a Consultant Forensic Scientist at Life Technologies. She is an experienced forensic science professional with nearly 20 years in the forensic science industry. Life Technologies, 850 Lincoln Centre Drive, Foster City, CA, 94404
Lisa Lane Schade, MHR is the Director of Global Marketing for the Human Identification Business at Life Technologies. She is responsible for working with forensic DNA laboratories around the world to help define and execute the strategic direction for the Human Identification business.