By Julie Burger
For years, scientists have raced to unlock the mysteries hidden in the human genome. Relatively recent research methods, used in genome-wide association studies (GWAS), are allowing scientists to uncover genes that may be linked to diseases much more rapidly. In GWAS, genetic and health information from thousands of people is compared to locate mutations or gene variants associated with diseases like breast cancer, diabetes and heart disease. (Some scientists question whether this research approach will succeed.) Because this type of research calls for genetic and other information from thousands, or even tens of thousands, of people, researchers are increasingly trying to tap into existing tissue samples and private medical data from blood or biopsy samples taken at a physician’s office or hospital. Researchers are also asking to share information from other researchers’ studies. What many people don’t realize is that information about them and their genes could be taken and shared among researchers or even posted on the internet.
Researchers’ quest to correlate genes with disease has encouraged government agencies to implement policies that increase the sharing of genetic samples and of genetic and health information, even where the individual has not explicitly consented to the secondary use. The National Institutes of Health (NIH) has implemented a plan to increase access to genetic and associated health information. Starting in January 2008, researchers who received government funding for GWAS were required at the end of the study to submit the genetic profiles of the people whose tissue was used, along with associated information about their health. The health information might be blood pressure or weight, or it might concern drug use or mental health, and it could include information about family relationships.
This wasn’t a policy the NIH implemented lightly. Before implementing the data sharing plan, the NIH requested comments from the public and held a public meeting to consider the ethical and legal implications of the data sharing. It decided that the required data sharing and any subsequent research on the data did not constitute research on humans because names, addresses, and other “identifying information” would be removed from the dataset and because it was a secondary use of the information. (Researchers and institutions frequently refer to this as “anonymized” data or samples: they argue that if information that could be used to identify a person isn’t associated with the sample, the sample can’t be linked to a person.) This meant the NIH did not have to seek specific consent from the people who provided the genetic and other information. Rather, the NIH said that data should be shared if the sharing was “consistent with” the original consent of those people. Civil liberties groups and patient rights advocates objected to the lack of informed consent and raised privacy concerns. Commentators noted that studies published several years earlier had already demonstrated that it was becoming increasingly easy to identify people using less and less genetic information.
The newest research (published in August 2008) demonstrates that it is possible to link a person to specific genetic information about them even if that information is aggregated with information from other people or reported only at a summary (instead of individual) level. (This means that even if your DNA were mixed with samples from 1,000 other people, it might still be possible to identify it as yours.) The study authors conclude that it’s possible to tell whether a person or their relative participated in a genome-wide association study for which data is available and to access the findings with respect to that individual.
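The article does not describe how the August 2008 study worked, so the following is only a toy simulation, under stated assumptions, of one way summary-level data can reveal participation. It assumes an observer knows one person's genotype, typical population frequencies for each variant, and the published per-variant averages for a study group. All names, group sizes, and frequencies below are hypothetical and the data is randomly generated.

```python
import random

def simulate(num_variants=10000, group_size=50, seed=0):
    """Toy model: does a study's summary data 'lean toward' a given person?"""
    rng = random.Random(seed)

    # Assumed (made-up) population frequency of each gene variant.
    pop_freq = [rng.uniform(0.1, 0.9) for _ in range(num_variants)]

    def random_person():
        # Each person carries 0, 1 or 2 copies of each variant;
        # store that as a fraction: 0.0, 0.5 or 1.0.
        return [((rng.random() < p) + (rng.random() < p)) / 2 for p in pop_freq]

    study_group = [random_person() for _ in range(group_size)]
    outsider = random_person()  # same population, not in the study

    # The only thing "published": the per-variant average across the group.
    group_freq = [
        sum(person[j] for person in study_group) / group_size
        for j in range(num_variants)
    ]

    def score(genotype):
        # For each variant, ask whether the person is closer to the study
        # average than to the general population, and add up the differences.
        return sum(
            abs(y - p) - abs(y - g)
            for y, p, g in zip(genotype, pop_freq, group_freq)
        )

    return score(study_group[0]), score(outsider)

member_score, outsider_score = simulate()
print("participant:", round(member_score, 1))
print("non-participant:", round(outsider_score, 1))
```

In this sketch the participant's score should come out clearly higher than the non-participant's, because each participant nudges the group averages slightly toward their own genotype, and thousands of tiny nudges add up. Real genetic data is far messier, so this illustrates the idea rather than reproducing the published analysis.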
Fortunately, in response to this research, the NIH has reversed part of the policy: it has removed certain data from public access and is requiring other agencies it has control over (such as the National Cancer Institute) to do the same. It will still offer to share this information with researchers or biotech companies for use in their research.
It is admirable that the NIH has taken the unusual step of reversing its policy. But some groups still have questions about how people's DNA is regarded and used. As one group wrote to the NIH, "it is questionable as to whether a database containing genotype information can ever be considered truly anonymous, or 'de-identified' when DNA itself is considered to be a 'unique identifier.'" Genetic information is always traceable to the person from whom it came. The new finding, that even a genetic sequence with no name attached and mixed in with sequences from other individuals can be identifiable, was probably predictable to a degree. It might be time to acknowledge that technology is going to keep moving forward, and that it will continue to become easier and easier to identify people through their DNA. Research shows it may be possible to predict the last name of the man a genetic sample came from by looking at the Y chromosome, since related men tend to share both Y chromosome similarities and last names. The idea that DNA can be somehow "anonymized" by taking a person's name or address off it no longer holds true. The only certainty is that newer and better ways of linking up information will make it impossible to protect people's privacy. That is one reason why, if we want to continue to use people's information, we should make sure to ask permission first.