The future of forensic genomics is here. Next-generation sequencing (NGS) is “now generation” and is transforming the capabilities of human identification laboratories. Massively parallel sequencing (MPS) systems enable simultaneous analysis of forensically relevant genetic markers, improving efficiency, capacity, and resolution, and replacing outmoded DNA fingerprinting with modern science.

Background and Limitations of CE-Based DNA Typing

Length-based genotyping of short tandem repeat (STR) loci has been used in forensic biology for over 20 years.1-5 This method amplifies STRs by polymerase chain reaction (PCR), separates the amplicons by capillary electrophoresis (CE), and types alleles by fluorescent fragment sizing.6 STRs on autosomal, X, and Y chromosomes, and different classes of single nucleotide polymorphisms (SNPs), may be genotyped in separate CE assays and workflows,7 with commercially available solutions or “home brew” approaches.

While CE-based assays have served the forensic community well for the past two decades, they have limitations in throughput, scalability, and allelic resolution. First, CE limits the total number and classes of loci that can be multiplexed together. This is due in part to the need to avoid size overlap among fluorescently tagged amplicons, which restricts the number of loci that can be amplified and detected simultaneously. The maximum to date is ~24 loci using CE.8-9 Second, CE-based assays do not accommodate sample multiplexing, so samples must be processed individually. Third, CE-based analysis with length-based typing cannot capture sequence differences present in alleles of the same length, which impedes mixture detection and resolution. Fourth, CE-based typing may not provide complete results on forensic casework samples containing low amounts of template DNA that is degraded, inhibited, or both. Analysts may therefore have to decide a priori which assay or workflow is likely to be most effective with these limited, degraded samples. Fifth, iterative testing in search of a “full profile” consumes more input DNA with each kit or method attempted. Sixth, in the event of exclusionary results or no DNA database hit, CE-based STR results provide no investigative lead information and end in a genetic dead end.

Advantages of Next-Generation Sequencing

Next-generation sequencing, also known as massively parallel sequencing, has been around for over a decade.10 Major advances in NGS now provide low cost, large-volume sequencing capabilities that are actively applied to various questions in areas such as evolutionary biology,11 oncology,12 microbial genomics,13 agrigenomics,14 and disease genomics15 worldwide.

These advances enable NGS to address limitations of capillary electrophoresis in forensic testing.16-17 NGS-based forensic typing18, 30 based on sequencing by synthesis (SBS) technology uses fluorescent, reversible terminator chemistry.19 One advantage of SBS, with light-based detection, is that it minimizes incorporation bias, nearly eliminating errors and missed base calls associated with homopolymeric regions and repetitive DNA elements.20 The first fully validated, complete forensic NGS system18, 36-38 has many additional advantages. DNA primer mix A (DPMA) amplifies and sequences amelogenin, 27 autosomal STRs, 24 Y-STRs, 7 X-STRs, and 94 identity-informative single nucleotide polymorphisms (iiSNPs).21-24, 30 Alternatively, laboratories can choose DNA primer mix B (DPMB), which targets each of the loci in DPMA plus an additional 78 SNPs: biogeographical ancestry informative SNPs (aiSNPs)25 and phenotype informative SNPs (piSNPs) for hair and eye color estimation.26-27

Software was developed for the system in collaboration with forensic DNA experts around the world, and it provides investigative lead tools for probabilistic estimation of biogeographical ancestry, hair color, and eye color.38 Biogeographical ancestry is estimated using a refined set of 56 aiSNPs for major population group association.28 The biogeographical ancestry within major population groups is displayed as a principal component analysis (PCA) plot.29 Eye and hair colors are reported along a color gradient. The terminal poles of the gradient are absolutes, where scores of 100% may be seen for the extremes of blue eyes and brown/black eyes, with other colors falling in the intermediate range to create a spectrum. If desired, rapid, sensitive, highly resolved allele calling of up to 231 loci from a single sample aliquot is possible.35-38 Alternatively, DPMA targets identity markers only.
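
The locus counts quoted for the two primer mixes can be tallied as a quick check. The sketch below uses only the numbers stated in the text; the variable names are illustrative and not part of any vendor software.

```python
# Marker counts as stated in the text for DNA primer mix A (DPMA).
dpma = {
    "amelogenin": 1,
    "autosomal STRs": 27,
    "Y-STRs": 24,
    "X-STRs": 7,
    "iiSNPs": 94,
}

# DPMB targets everything in DPMA plus 78 additional ancestry and phenotype SNPs.
additional_dpmb_snps = 78

dpma_total = sum(dpma.values())
dpmb_total = dpma_total + additional_dpmb_snps

print(f"DPMA loci: {dpma_total}")  # 153
print(f"DPMB loci: {dpmb_total}")  # 231
```

The DPMB total agrees with the "up to 231 loci" figure, and 231 loci against the ~24-locus CE ceiling is roughly a 10-fold increase.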

In addition to the ability to multiplex ~10-fold more loci than current CE-based systems, other advantages include:

  • Sample-specific barcoding for multiplexing up to 96 samples per analysis using 1 ng or less of template DNA. The combination of sample and locus multiplexing can replace a minimum of 6 separate CE-based kits.
  • Sequence variant detection reveals additional STR alleles of the same size that are not detected by CE.32-33 These “isometric heterozygotes,” or isoalleles, can be visually displayed in the software.38 This nucleotide-based genotyping can tease out the number of contributors to aid mixture resolution. Sequence differences can distinguish an allele of a minor contributor from the stutter of a major contributor, even if they are of the same amplicon length. In CE, these data appear as one peak. In some mixtures, CE data indicate a single source sample or a two-person mixture, whereas NGS detects three or more DNA contributors.34 Genetic data from mixed sample studies, with major:minor ratios from 99.9:0.1% to 50:50%, indicate the ability to detect shared and unshared (unique, obligate) minor contributor alleles at less than 5% of the major donor.35
  • Enhanced results on degraded and inhibited DNA samples, as the targets of the majority of the loci are <200 bp, generated in a single reaction.35
  • Compatibility with worldwide STR DNA databases created with CE methods, and
  • Ability to investigate familial relationships and personal identification using X and Y STRs, without iterative testing. This reduces consumption of DNA samples and the need for deciding which assay to implement, as all classes of forensically significant loci are amplified in one forensic NGS multiplex.35 
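
The isoallele point in the list above can be illustrated with a toy comparison. This is a minimal sketch: the two sequences are hypothetical and do not represent a real locus, and `length_based_call` is an illustrative helper rather than any instrument's algorithm.

```python
def length_based_call(allele_seq: str, repeat_len: int = 4) -> int:
    """Mimic length-based typing: only the size in repeat units is visible."""
    return len(allele_seq) // repeat_len


# Two hypothetical alleles, each 10 repeat units long (illustrative only).
allele_1 = "TCTA" * 10
allele_2 = "TCTA" * 4 + "TCTG" + "TCTA" * 5

# Length-based typing collapses both alleles into the same call.
ce_calls = {length_based_call(allele_1), length_based_call(allele_2)}

# Sequence-based typing keeps them apart.
ngs_calls = {allele_1, allele_2}

print(f"Distinct alleles seen by length: {len(ce_calls)}")
print(f"Distinct alleles seen by sequence: {len(ngs_calls)}")
```

By length the sample looks homozygous for allele 10, while by sequence it is an isometric heterozygote.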

Forensic scientists around the globe are actively investigating and implementing NGS systems for forensic genomics because of their ability to improve DNA typing.18, 30, 39 With targeted data on forensically significant loci, forensic scientists can answer a wider range of questions in a single assay.

Developmental Validation

The first complete, targeted forensic NGS system18, 30, 39 was subjected to Developmental Validation35 according to the 2012 Revised SWGDAM Validation Guidelines.40 Approximately 1736 PCR 1 reactions were run and analyzed.35  Studies included DNA sample collection substrate testing (cotton swabs, FTA cards, filter paper), species cross reactivity testing from a range of non-human organisms, DNA input sensitivity studies from 1 ng down to 7.8 pg, two-person human DNA mixture testing with three genotype combinations, stability analysis using highly degraded DNA templates, 23 mock case type samples, and effects of five commonly encountered PCR inhibitors.

Results from four validation guideline sections35 are summarized below.

Accuracy and Reproducibility

Accuracy and precision statistics are shown for autosomal STRs, X-STRs, Y-STRs, and iiSNPs, and for aiSNPs and piSNPs, targeted by DNA Primer Mixes A and B, respectively (Table 1). Accuracy was determined by concordance of the NGS system with orthogonal CE, bead array, and/or whole-genome sequencing methods. Precision was calculated by determining the most frequently observed genotype or haplotype at each locus and totaling the percentage of replicates in which that genotype was observed. Calculations from STR and SNP repeatability and reproducibility studies (1 ng template) indicate 100.0% accuracy in allele calling relative to CE for STRs (n=1260 samples), and 99.5% accuracy relative to bead array typing for SNPs (n=1260 samples for iiSNPs, 310 samples for aiSNPs and piSNPs), with >99.0% and >97.8% precision, respectively. Call rates of >99.0% were observed for all STRs and SNPs with both DNA primer mixes (DPMA, DPMB).
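
One way to read the precision calculation described above is sketched below. The locus names and replicate calls are hypothetical, and the validation study may have aggregated results across loci differently.

```python
from collections import Counter


def locus_precision(replicate_calls):
    """Return (modal genotype, percent of replicates showing it) for one locus."""
    counts = Counter(replicate_calls)
    modal_genotype, modal_count = counts.most_common(1)[0]
    return modal_genotype, 100.0 * modal_count / len(replicate_calls)


# Hypothetical replicate genotype calls; names and values are illustrative only.
replicates = {
    "LOCUS_A": ["12,14", "12,14", "12,14", "12,14"],
    "LOCUS_B": ["9,9", "9,9", "9,10", "9,9"],
}

for locus, calls in replicates.items():
    genotype, pct = locus_precision(calls)
    print(f"{locus}: modal genotype {genotype}, precision {pct:.1f}%")
```

Here LOCUS_A is fully reproducible (100.0%), while one discordant replicate at LOCUS_B lowers its precision to 75.0%.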

Varied numbers of DNA libraries per run were assessed. With the default software analytical and interpretation thresholds on read counts (modifiable per locus in routine laboratory use), data suggest a maximum of 96 single-source DNA samples per run using DPMA, and a maximum of 32 casework-type samples that could contain a DNA mixture using DPMB. Laboratories may determine the desired number of reads and the analytical and interpretation thresholds based on internal validation or policy per area of inquiry.38

Sensitivity

Sensitivity studies were conducted using serial dilutions of the 2800M positive control with DPMB (231 loci) to evaluate the ability to generate reliable genotypes and haplotypes at various gDNA template amounts. Input DNA amounts of 1 ng, 500 pg, 250 pg, 125 pg, 62.5 pg, 31.25 pg, 15.625 pg, and 7.81 pg were amplified in quadruplicate. Using default software settings, complete, accurate, and reproducible genotypes were observed from input DNA ranging from 62.5 pg to 1 ng (Figure 1), and partial profiles were observed at lower template amounts.35
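
The listed input amounts follow a two-fold dilution series, which can be reproduced as a short sketch. The two-fold factor is inferred from the values in the text, and the variable names are illustrative.

```python
# Two-fold serial dilution starting from 1 ng (1000 pg), eight levels as listed.
start_pg = 1000.0
levels = 8
replicates_per_level = 4  # "amplified in quadruplicate"

inputs_pg = [start_pg / (2 ** i) for i in range(levels)]
total_reactions = levels * replicates_per_level

for amount in inputs_pg:
    print(f"{amount:g} pg")
print(f"Sensitivity reactions: {total_reactions}")
```

The lowest level works out to 7.8125 pg, and eight levels in quadruplicate give 32 amplifications for this study.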

Stability

Recovery of actionable genetic information from partially degraded DNA samples is possible as >50% of the 231 targeted amplicons are less than 205 nt in length.36 Partial profiles from template input as low as 7 pg produced random match probabilities as rare as, or rarer than, those from the Combined DNA Index System STR core loci.
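
The text does not state how random match probabilities were computed. As a rough illustration only, the sketch below applies the textbook product rule under Hardy-Weinberg assumptions, with hypothetical allele frequencies and no correction for population substructure. It shows why each additional locus recovered from a partial profile makes a coincidental match rarer.

```python
from math import prod


def genotype_frequency(freq_a: float, freq_b: float, homozygous: bool) -> float:
    """Hardy-Weinberg expectation: p^2 for homozygotes, 2pq for heterozygotes."""
    return freq_a ** 2 if homozygous else 2 * freq_a * freq_b


# Hypothetical partial profile: (allele freq 1, allele freq 2, homozygous?) per locus.
partial_profile = [
    (0.10, 0.20, False),
    (0.15, 0.15, True),
    (0.05, 0.30, False),
]

# Product rule: multiply genotype frequencies across independent loci.
rmp = prod(genotype_frequency(*locus) for locus in partial_profile)

print(f"Random match probability: {rmp:.2e}")
print(f"Approximately 1 in {round(1 / rmp):,}")
```

Even three loci with fairly common alleles give a probability on the order of 1 in tens of thousands.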

Mock Case Samples

Twenty-three mock case type samples were analyzed by CE and NGS. Case type sample analyses, possibly the ultimate test of a forensic typing system, verified 100% concordance of NGS STR and SNP genotypes with results generated by SNP genotyping arrays and conventional capillary electrophoresis.

Summary

NGS gives the forensic DNA community a revolutionary tool for aiding investigations, with advantages that include increased information potential, scalability, and allelic resolution. Developmental validation data from a targeted forensic NGS system35 have met forensic DNA quality assurance guidelines with robust, reliable, and reproducible performance on samples of various quantities and qualities.

References

1. Holland MM, Fisher DL, Lee DA, et al. Short tandem repeat loci: application to forensic and human remains identification. EXS. 1993;67:267-274.

2. Frégeau CJ, Fourney RM. DNA typing with fluorescently tagged short tandem repeats: a sensitive and accurate approach to human identification. Biotechniques. 1993;15(1):100-119.

3. Kimpton CP, Gill P, Walton A, et al. Automated DNA profiling employing multiplex amplification of short tandem repeat loci. PCR Methods Appl. 1993;3(1):13-22.

4. Hammond HA, Jin L, Zhong Y, et al. Evaluation of 13 short tandem repeat loci for use in personal identification applications. Am J Hum Genet. 1994;55(1):175-189.

5. Gill P, Kimpton C, D'Aloja E, et al. Report of the European DNA profiling group (EDNAP)--towards standardisation of short tandem repeat (STR) loci. Forensic Sci Int. 1994;65(1):51-59.

6. Lazaruk K, Walsh PS, Oaks F, et al. Genotyping of forensic short tandem repeat (STR) systems based on sizing precision in a capillary electrophoresis instrument. Electrophoresis. 1998;19(1):86-93.

7. Jobling M, Gill P. Encoded evidence: DNA in forensic analysis. Nat Rev Genet. 2004;5(10):739-752.

8. Wang DY, Gopinath S, Lagacé RE, et al. Developmental validation of the GlobalFiler® Express PCR Amplification Kit: A 6-dye multiplex assay for the direct amplification of reference samples. Forensic Sci Int Genet. 2015;19:148-155.

9. Oostdik K, Lenz J, Nye K, et al. Developmental validation of the PowerPlex® Fusion System for analysis of casework and reference samples: A 24-locus multiplex for new database standards. Forensic Sci Int Genet. 2014;12:69-76.

10. van Dijk EL, Auger H, Jaszczyszyn Y, et al. Ten years of next-generation sequencing technology. Trends Genet. 2014;30(9):418-426.

11. Berglund EC, Kiialainen A, Syvänen C. Next-generation sequencing technologies and applications for human genetic history and forensics. Invest Genet. 2011;(24)2:23-37.

12. Kothari N, Schell MJ, Teer JK, et al. Comparison of KRAS mutation analysis of colorectal cancer samples by standard testing and next-generation sequencing. Clin Pathol. 2014;67(9):764-767.

13. Budowle B, Connell ND, Bielecka-Oder A, et al. Validation of high throughput sequencing and microbial forensics applications. Investig Genet. 2014;5:9.

14. Kawahara Y, de la Bastide M, Hamilton JP, et al. Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice (N Y). 2013;6(1):4.

15. Pihlstrøm L, Rengmark A, Bjørnarå KA, et al. Effective variant detection by targeted deep sequencing of DNA pools: an example from Parkinson’s disease. Ann Hum Genet. 2014;78(3):243-252.

16. Zascavage RR, Shewale SJ, Planz JV. Deep sequencing technologies and potential applications in forensic DNA testing. Forensic Sci Rev. 2013;25(1-2):79-105.

17. Børsting C, Morling N. Next generation sequencing and its applications in forensic genetics. Forensic Sci Int Genet. 2015;18:78-89.

18. Caratti S, Turrina S, Ferrian M, et al. MiSeq FGx sequencing system: A new platform for forensic genetics. Forensic Science International: Genetics Supplement Series. 2015(5):e98-e100.

19. Bentley DR, Balasubramanian S, Swerdlow HP, et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456(7218):53-59.

20. Bragg LM, Stone G, Butler MK, et al. Shining a Light on Dark Sequencing: Characterising Errors in Ion Torrent PGM Data. PLoS Comput Biol. 2013;9(4):e1003031.

21. Sanchez JJ, Phillips C, Børsting C, et al. A multiplex assay with 52 single nucleotide polymorphisms for human identification. Electrophoresis. 2006;27(9):1713-1724.

22. Dixon LA, Murray CM, Archer EJ, et al. Validation of a 21-locus autosomal SNP multiplex for forensic identification purposes.  Forensic Sci Int. 2005;154(1):62-77.

23. Pakstis AJ, Speed WC, Fang R, et al. SNPs for a universal individual identification panel. Hum Genet. 2010;127(3):315-324.

24. Kidd KK, Kidd JR, Speed WC, et al. Expanding data and resources for forensic use of SNPs in individual identification. Forensic Sci Int Genet. 2012;6(5):646-652.

25. Kidd KK, Speed WC, Pakstis AJ, et al. Progress toward an efficient panel of SNPs for ancestry inference. Forensic Sci Int Genet. 2014;10:23-32.

26. Branicki W, Liu F, van Duijn K, et al. Model-based prediction of human hair color using DNA variants. Hum Genet. 2011;129(4):443-454.

27. Walsh S, Liu F, Ballantyne K, et al. IrisPlex: a sensitive DNA tool for accurate prediction of blue and brown eye colour in the absence of ancestry information. Forensic Sci Int Genet. 2011;5(3):170-180.

28. Kidd KK, Kidd JR, Pakstis AJ, et al. Better SNPs for Better Forensics: Ancestry, Phenotype, and Family Identification. 2012. NIJ Poster: http://medicine.yale.edu/lab/kidd/publications/NIJposter2012_Minihaps_237328_174718_29491.pdf

29. Phillips C. Forensic genetic analysis of bio-geographical ancestry. Forensic Sci Int Genet. 2015;18:49-65.

30. Churchill JD, Schmedes SE, King JL, et al. Evaluation of the Illumina® Beta Version ForenSeq™ DNA Signature Prep Kit for use in genetic profiling. Forensic Sci Int Genet. 2016;20:20-29.

31. Davis C, Warhauser DH, Budowle B. DNA Profiling of Database Reference Samples Using Second Generation Sequencing. 23rd International Symposium on Human identification. Oral Presentation, October 2012.

32. Gettings KB, Aponte RA, Vallone PM, et al. STR allele sequence variation: current knowledge and future issues. Forensic Sci Int Genet. 2015;18:118-130.

33. Gettings KB, Aponte RA, Kiesler KM, et al. The next dimension in STR sequencing: polymorphisms in flanking regions and their allelic associations. Forensic Sci Int Genet. 2015;Suppl. Ser. 5:e121-e123. http://www.sciencedirect.com/science/article/pii/S1875176815301219

34. Dr. David Ballard, King’s College, personal communication (2015).

35. Jager A, et al. Developmental Validation of the MiSeq FGx Forensic Genomics System for Targeted Next Generation Sequencing in Forensic DNA Casework and Database Laboratories. Submitted for publication, 2016.

36. Illumina Inc., ForenSeq™ DNA Signature Prep Reference Guide, 2015.

37. Illumina Inc., MiSeq FGx™ Instrument Reference Guide, 2015.

38. Illumina Inc., ForenSeq™ Universal Analysis Software User Guide, 2015.

39. Hussing C, Børsting C, Mogensen HS, et al. Testing of the Illumina® ForenSeq™ kit, Forensic Science International: Genetics Supplement Series, Volume 5, December 2015, Pages e449-e450.

40. Scientific Working Group on DNA Analysis Methods Validation Guidelines for DNA Analysis Methods. 2004;1-13.

Steven B. Lee, PhD, holds a Molecular Biology PhD from UC Berkeley, served as Director of R&D at the CA DOJ DNA Lab and Hitachi Genetics Systems, and on the FBI Technical Working Group on DNA Analysis Methods.

Joe Varlaro holds a Cell and Developmental BS from University of Rochester, served as Scientist at Cetus and Roche Molecular Systems, and as a Senior Criminalist and DNA Technical Leader at the Boston PD Crime Laboratory.

Cydne L. Holt, PhD, holds a Cell and Molecular Biology PhD from Baylor College of Medicine, served as Director of the Forensic Services Division of the SFPD Crime Laboratory, and as a Criminalist at the Santa Clara County Crime Laboratory.
