CODIS, the FBI’s Combined DNA Index System, has collected millions of genetic profiles of criminals, unknown offenders, missing persons, and unidentified human remains in a single database since the 1990s. The pattern of an individual's DNA on file was believed to provide only unique individual identification.

But a new study indicates that CODIS contains a lot more than was originally believed. Ancestry information and potential phenotyping data are also in there, amid the billions upon billions of DNA variables.

The scientists found ancestry information is inherent in the current 13 loci used in CODIS, according to the paper published last week in the journal Current Biology.

“Everyone knows these markers are great for ‘fingerprinting’ people – we just didn’t know how much ancestry information came along with it,” Jun Li, a professor in the Department of Human Genetics at the University of Michigan and one of the authors, told Forensic Magazine.

CODIS has long been considered a “pure” source of genetic identification, without other data such as skin or hair color. But those markers on file also hold information showing whether a person is likely of European, African, Asian, or other descent, Li said.

The authors, who include Noah Rosenberg, Bridget Algee-Hewitt, Michael Edge and Jaehee Kim from Stanford University, analyzed 978 people sampled from 53 different populations across the world. The data set included 792 markers overall.

They found that even though the markers were selected for individual identification, population information is still embedded in them.

“That the CODIS loci possess similar ancestry information as non-CODIS sets is surprising given arguments from forensic genetics, which have claimed that loci selected for heterozygosity and individual identification encode little ancestry information, and because forensic loci are selected in this manner, they are particular ancestry-uninformative,” the authors write.

The FBI’s national press office did not respond to a request for comment.

CODIS was originally started as a pilot software project in 1990. But what started with 14 state and local laboratories ballooned with the DNA Identification Act of 1994. Currently, more than 190 public law enforcement agencies take part in the database within the U.S., and 90 labs in 50 countries abroad also use and contribute to it.

The amount of data on file has skyrocketed since the turn of the century. What was once fewer than half a million profiles now includes nearly 12 million offenders, 2 million arrestees, and other profiles – and it grows every day.

The FBI is mandating that labs increase the number of loci they use for identification from 13 to 20 early next year. Increasing the number of loci could also increase the ancestry information, Li said.