Study questions the medical privacy of forensic samples
SF State researchers say databases used by law enforcement could contain private information about individuals, including crime victims
Watch any episode of “CSI,” and a character will use forensic DNA profiling to identify a criminal. A new study from San Francisco State University suggests that these forensic profiles may indirectly reveal medical information, perhaps even that of crime victims, contrary to what the legal field has believed for nearly 30 years. The findings could have ethical and legal implications.
“The central assumption when choosing those [forensic] markers was that there wouldn’t be any information about the individuals whatsoever aside from identification. Our paper challenges that assumption,” said first author Mayra Bañuelos (B.S., ’19), who started working on the project as a San Francisco State undergraduate and is now a Ph.D. student at Brown University.
Law enforcement uses the Combined DNA Index System (CODIS), a system that organizes criminal justice DNA databases and relies on specific genetic markers to identify individuals. Crime labs at the national, state and local levels contribute to these databases, providing profiles from samples collected from crime scene evidence, convicted offenders, felony arrestees, missing persons and more. Law enforcement officials can use the database to try to match samples found in an investigation to profiles already stored there.
CODIS profiles record an individual’s genetic variants at a set of short tandem repeats (STRs), short DNA sequences repeated a number of times that varies among individuals. Since the ’90s, 20 STRs have been chosen for forensic CODIS profiling specifically because it was believed they did not relay medical information. If these profiles contained any trait information, there could be concerns about medical privacy.
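To make the idea of profile matching concrete, here is a minimal Python sketch. It assumes a profile can be modeled as a mapping from marker name to a pair of repeat counts, one per chromosome copy. The marker names, repeat counts, profile IDs and the exact-match rule are all invented for illustration; actual CODIS searches follow their own matching rules, which this article does not describe.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical model: marker name -> sorted pair of repeat counts.
Genotype = Tuple[float, float]
Profile = Dict[str, Genotype]


def make_profile(raw: Dict[str, Iterable[float]]) -> Profile:
    """Normalize each genotype so that allele order does not matter."""
    profile: Profile = {}
    for marker, alleles in raw.items():
        low, high = sorted(alleles)  # expects exactly two repeat counts
        profile[marker] = (low, high)
    return profile


def profiles_match(a: Profile, b: Profile) -> bool:
    """Illustrative rule: every marker typed in both profiles must agree."""
    shared = a.keys() & b.keys()
    return bool(shared) and all(a[m] == b[m] for m in shared)


def find_match(sample: Profile, database: Dict[str, Profile]) -> Optional[str]:
    """Return the ID of the first stored profile matching the sample, if any."""
    for profile_id, stored in database.items():
        if profiles_match(sample, stored):
            return profile_id
    return None


# Invented data: two stored profiles and one crime scene sample.
database = {
    "profile_A": make_profile({"MARKER_1": (15, 17), "MARKER_2": (6, 9)}),
    "profile_B": make_profile({"MARKER_1": (14, 14), "MARKER_2": (7, 8)}),
}
sample = make_profile({"MARKER_1": (17, 15), "MARKER_2": (9, 6)})
matched_id = find_match(sample, database)
```

Note that nothing in such a profile is a medical record. The study's question is whether these repeat counts nonetheless correlate with something medically meaningful.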
“But that assumption hasn’t had much investigation in a long time, and we know a lot more about the genome now than we did back then,” explained SF State Associate Professor of Biology Rori Rohlfs, who led this project.
The assumption that only criminals are sampled is also not completely accurate. “It actually also includes victims of crime and people that may have been at crime scenes. You have these huge databases including a lot of people that are not necessarily criminals,” Bañuelos said. “I believe also that accessibility to these databases varies a lot according to a jurisdiction.”
The researchers explained that other papers have found associations between other (non-CODIS) STRs and disease or gene expression. With that in mind, the SF State team wanted to understand the relationship between the CODIS STR markers and gene expression.
Rohlfs’ lab used publicly available data from the 1000 Genomes Project and genetic models to investigate the relationship between CODIS markers and gene expression. Across the 20 CODIS markers, they found six associations between a marker and the expression of nearby genes in white blood cell lines from more than 400 unrelated individuals in the database.
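As a rough illustration of what an association between an STR genotype and gene expression means, the Python sketch below correlates each individual's total repeat count at one marker with an expression level for a nearby gene. This is a simplified stand-in, not the authors' method: the article says only that they used genetic models. The `repeat_dosage` helper, the choice of a Pearson correlation and all numbers are assumptions made up for this example.

```python
from math import sqrt
from typing import Sequence, Tuple


def repeat_dosage(genotype: Tuple[float, float]) -> float:
    """Hypothetical summary of a genotype: sum of the two repeat counts."""
    return genotype[0] + genotype[1]


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Invented toy data: genotypes at one marker and expression of one nearby gene.
genotypes = [(6, 7), (7, 7), (7, 9), (9, 9), (6, 9)]
expression = [1.1, 1.3, 1.8, 2.2, 1.6]

dosages = [repeat_dosage(g) for g in genotypes]
r = pearson_r(dosages, expression)
```

A value of `r` far from zero in a large enough sample would suggest that knowing the STR genotype tells you something about expression of the gene. A real analysis would also need to account for population structure and for testing many marker-gene pairs.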
“In some genes, gene expression change has been associated with medical conditions,” Bañuelos explained, citing prior research. “[In this study,] we indirectly know there is an association between these CODIS genotypes and some change in genes that can lead to illness.”
The authors note three associations with genes (CSF1R, LARS2, KDSR) that were particularly interesting. Prior literature shows that mutations and changes in gene expression of CSF1R can be tied to psychiatric conditions (depression and schizophrenia). Mutations and gene expression changes in the other genes have been connected to Perrault syndrome, MELAS syndrome, severe skin and platelet conditions and more, the scientists note in the PNAS (Proceedings of the National Academy of Sciences) paper. If CODIS markers can be connected to the expression of genes linked to disease and health, then the data in the CODIS database could compromise an individual’s medical privacy.
“Our paper in some ways is like the tip of the iceberg,” Rohlfs said, admitting that she was surprised to find associations in a relatively small sample size. The project itself simply started as an undergraduate exploration project. Eight of the 11 authors were, like Bañuelos, undergraduates at SF State when the project began.
“It raises the question: If we did a more expansive [genetic] study, would we find even more information that would be revealed by CODIS profiles?” Rohlfs asked.
Bañuelos and Rohlfs are curious to know what they’d find if they looked at a larger dataset of more diverse populations — their current dataset is predominantly European. Their analysis was also limited to white blood cells. What relationships would they find if they looked in other tissues?
These are important lines of inquiry because the current dataset doesn’t represent the general population. Furthermore, Latino and African American communities are overrepresented in these CODIS databases, Bañuelos explained.
Additional studies are needed to better flesh out the relationship between CODIS and medical information. However, the researchers point out that if CODIS profiles contain medical information, there could be major implications.
“If [these CODIS profiles] contain medical information, then their treatment would need to be consistent with the way we protect medical information in the United States. We would have to have policies that regulate the seizure, storage and sharing of these profiles,” Rohlfs added.