Study: Forensic STRs May Reveal Medical Information

A core assumption in forensic DNA analysis is that the loci used for identification do not reveal any medical information about the owner of the sample. In a legal context, this assumption is critical: if it did not hold, current laws authorizing the collection of DNA from certain persons, such as arrestees and convicted felons, would immediately come into conflict with established state and federal health privacy laws and regulations.

In fact, the 20 short tandem repeats (STRs) known as the CODIS core loci were purposely selected because they are not associated with any known physical or medical characteristics. And while that has always been the conventional wisdom, a new study from researchers at San Francisco State University indicates that may not be true anymore.

Thirteen of the CODIS core loci were established by the FBI in 1998, with another seven STRs added in 2017. Thus, the majority of the loci were selected before the human genome was sequenced. As our knowledge of DNA and genetics has continued to grow thanks to that project and advancing technology, studies have questioned whether CODIS loci could impact medically relevant traits.

A 2013 review of phenotypic associations with genetic loci concluded there were no significant associations with the CODIS STRs; however, the authors did report that some CODIS loci fall within predicted sites for genomic regulation, and all CODIS loci are within 1 kb of at least one genetic variant associated with a phenotype. Then, a 2020 review identified 84 significant published associations between traits and STRs for 18 of the 20 CODIS loci.

In the new study, published last week in PNAS, researchers investigated whether genotypes at the CODIS loci could directly reveal information about the expression levels of neighboring genes. To do this, they examined STR length variation and gene-expression values from lymphoblastoid cell lines from 421 individuals in the 1000 Genomes Project. Since the 1000 Genomes Project relied on short-read sequencing, the team had to impute STR genotypes based on the linkage disequilibrium between STRs and the surrounding SNPs.

According to the study, for each CODIS STR–gene pair, the researchers tested for correlation between the CODIS genotype and the expression level of the neighboring gene. Of the 39 CODIS STR–gene pairs tested, six showed significant correlations. The team classified three of those as “striking”: CSF1R, LARS2, and KDSR.
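To make the idea of a genotype–expression correlation concrete, here is a minimal Python sketch. It is not the study's method: the statistic (Pearson correlation with a permutation test), the choice to summarize each person's genotype as the sum of their two allele repeat counts, and every number in it are assumptions invented purely for illustration.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided p-value: how often shuffled data correlates as strongly."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data for ten people (NOT from the study):
# summed repeat counts of both alleles at one STR, and the
# expression level of a neighboring gene in arbitrary units.
repeat_sum = [18, 20, 21, 22, 22, 23, 24, 24, 25, 26]
expression = [9.1, 8.7, 8.8, 8.2, 8.4, 7.9, 7.6, 7.8, 7.1, 6.9]

r = pearson_r(repeat_sum, expression)
p = permutation_p_value(repeat_sum, expression)
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

In this made-up example, longer alleles go with lower expression, the same direction of effect the study reports for CSF1PO and D3S1358. A real analysis would also need to correct for testing 39 pairs and account for population structure.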

CSF1R expression has a significant negative correlation with the genotype of the intronic CODIS locus CSF1PO. CSF1R encodes a cytokine receptor that plays a key role in microglial regulation, and variation in the expression and splicing of the gene is associated with psychiatric conditions, including depression and schizophrenia. (CSF1R has also been linked to neural conditions like epilepsy, Alzheimer’s disease and spinal cord injury recovery, although the forensic relevance there is limited.)

LARS2 is well established as an essential gene, as mutations that reduce or knock out its function have been associated with Perrault syndrome, MELAS syndrome and other conditions. In the PNAS study by Banuelos et al., the researchers observed a significant negative correlation between D3S1358 allele length and LARS2 expression levels.

“There is strong evidence that D3S1358 is in linkage disequilibrium with both a variant that putatively impacts LARS2 expression and DNase I hypersensitivity sites active in lymphoblasts,” the team explains in the paper.

Meanwhile, KDSR—which is near D18S51—encodes an enzyme involved in synthesis of the lipid ceramide. Mutations in KDSR that eliminate or decrease enzyme function have been associated with a number of severe skin and platelet conditions.

“The fact that dramatically reduced function leads to severe phenotypes raises the question of whether marginally lowered expression may lead to intermediate conditions. The association between CODISeSTRs and those genes’ expression means that the CODIS genotype may be informative about risk of those conditions or other intermediate phenotypes,” write Banuelos and co-authors.

As with all studies, there are limitations, including the use of imputed STR genotypes for the 1000 Genomes Project data, as well as the heavily European slant of the population data drawn from that project. Still, even though the analysis was “limited in scope and underpowered,” the research team says it produced significant results, and they ask whether stronger correlations would be identified with a larger, more representative sample and direct STR genotyping.

“These results join a growing body of work showing that CODIS genotypes may contain more information than purely identity. CODIS profiles have been found to provide information about the surrounding haplotype, as well as genetic ancestry. Together, these findings raise concerns about the medical privacy of individuals whose CODIS profiles are seized, databased, and accessed, as well as the genetic relatives of those persons,” conclude the study authors.

 
