New File Type Improves Genomic Data Sharing While Maintaining Participant Privacy

For Immediate Release
Wednesday, October 17, 2018
9:45 a.m. U.S. Pacific Time

Media Contact:
Ann Klinck

Proposed Format Presented at ASHG 2018 Annual Meeting

Gamze Gursoy, PhD, Yale University
(courtesy Dr. Gursoy)

SAN DIEGO, Calif. – Based on an analysis of data leakages and opportunities to prevent the potential misuse of genetic information, researchers have developed a new file format for functional genomics data that enables data sharing while protecting the personal information of research participants. The findings were presented at the American Society of Human Genetics (ASHG) 2018 Annual Meeting in San Diego, Calif.

Functional genomics is the study of how the genome functions in the body, such as how genes are regulated, how they are expressed as proteins, and how they interact with proteins to affect cellular functions in health and disease. Gamze Gursoy, PhD, postdoctoral research associate at the Yale University Computational Biology and Bioinformatics Program, and her colleagues set out to identify weaknesses in current functional genomics data files and processes and to find practical fixes.

“As functional genomics technology is still emerging, the data resulting from this research has not been well-studied by privacy researchers,” said Dr. Gursoy. Previous analyses have shown that in certain cases, it is possible to trace de-identified functional genomics data back to the individual participant, a concept known as data leakage. Through a series of tests in the past few years, Dr. Gursoy and her colleagues measured the amount of variant information leaked in gene expression and functional genomics experiments involving different data types, and the extent to which this information could be mapped back to individuals.

“Just like genetic data, this data comes from real individuals, and we wanted to raise awareness that there could be leakages. At the same time, we want to democratize access to data and avoid bureaucratic hurdles,” she said. To accomplish this goal, the researchers developed ways to measure leakage from raw functional genomics data and a file format to reduce the leakage in a targeted way.

Notably, the format they developed is easily layered onto genetic data file types already in common use, such as Sequence Alignment/Map (SAM) and Binary Alignment/Map (BAM). Dr. Gursoy hopes its ease of use encourages more researchers to make their findings available through the proper channels.
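For readers unfamiliar with alignment files, the sketch below shows in general terms what layering a privacy step onto a SAM record could look like. It is not the format presented at ASHG 2018, whose mechanism this release does not describe. It assumes, purely for illustration, that leakage is reduced by dropping the read bases, base qualities, and mismatch tags from each record while keeping the alignment coordinates. The function name `mask_sam_record` and the example record are hypothetical.

```python
# Illustrative only: NOT the file format presented at ASHG 2018.
# Assumption: leakage is approximated here by removing the fields of a SAM
# alignment record that can carry an individual's sequence variants.

# Optional tags that describe mismatches against the reference.
LEAKY_TAGS = ("MD:", "NM:")


def mask_sam_record(line: str) -> str:
    """Return a SAM line with read bases, qualities, and mismatch tags removed."""
    if line.startswith("@"):
        return line  # header lines carry no read-level data
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has 11 mandatory fields")
    fields[9] = "*"   # SEQ: '*' means the sequence is not stored
    fields[10] = "*"  # QUAL: '*' means the qualities are not stored
    mandatory, optional = fields[:11], fields[11:]
    kept = [tag for tag in optional if not tag.startswith(LEAKY_TAGS)]
    return "\t".join(mandatory + kept)


# Hypothetical record: one read aligned to chr1 with a single mismatch.
record = (
    "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\t"
    "ACGTACGT\tIIIIIIII\tNM:i:1\tMD:Z:3A4\tAS:i:7"
)
masked = mask_sam_record(record)
print(masked)
```

The masked record keeps the read name, position, mapping quality, and CIGAR string, so read coverage can still be computed from it. The CIGAR string can itself reveal insertions and deletions, which this sketch does not address.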

“We want to balance participant privacy with flow of scientific information,” said Dr. Gursoy. “If researchers restrict their data completely, scientific discovery stops.”

Dr. Gursoy is now working with existing data repositories, such as ENCODE. She emphasized that privacy protection is a continuous effort that does not stop with this one file format; it’s also about educating the public.

“Genomic privacy is very unique,” said Dr. Gursoy. “Genetic data can be used to link people to their disease status in certain databases. While there are laws in place like the Genetic Information Nondiscrimination Act, people are unaware that insurance companies cannot use your genetic information to refuse coverage.”

Dr. Gursoy hopes that this file type will be adopted more widely, leading to more collaboration in the field and fewer hurdles to reproducing research. She continues to work on methods to provide research data in a timely manner while keeping information secure.

This work is funded by the National Institutes of Health’s Big Data to Knowledge Program.


Dr. Gursoy will present this research on Wednesday, October 17, 2018, from 9:45-10:00 a.m., in Ballroom 6C, Upper Level, San Diego Convention Center.

Press Availability:

Dr. Gursoy will be available to discuss this research with interested media on Wednesday, October 17, 2018, from 12-12:45 p.m. in the ASHG 2018 Press Office (Room 22).


Gursoy G et al. (2018 Oct 17). Abstract: Private information leakage from raw functional genomics data: Theoretical quantifications & practical privacy-aware file formats. Presented at the American Society of Human Genetics 2018 Annual Meeting. San Diego, California.

About the American Society of Human Genetics (ASHG)

Founded in 1948, the American Society of Human Genetics is the primary professional membership organization for human genetics specialists worldwide. Its nearly 8,000 members include researchers, academicians, clinicians, laboratory practice professionals, genetic counselors, nurses, and others with an interest in human genetics. The Society serves scientists, health professionals, and the public by providing forums to: (1) share research results through the ASHG Annual Meeting and in The American Journal of Human Genetics; (2) advance genetic research by advocating for research support; (3) educate current and future genetics professionals, health care providers, advocates, policymakers, educators, students, and the public about all aspects of human genetics; and (4) promote genetic services and support responsible social and scientific policies. For more information, visit:

6120 Executive Blvd, Suite 500 | Rockville, MD 20852 | 301.634.7300
