Uncovering the Genetic Causes of Unexplained Rare Diseases
Ernest Turro, PhD, Tachyon’s Co-Founder and Chief Scientific Officer and an Associate Professor at Mount Sinai in New York, has more than 10 years of experience working with genome sequencing data from patients with rare diseases. He has just published a major paper in Nature Medicine that identifies the genetic causes of previously unexplained rare diseases using a computational approach developed by his team at Mount Sinai.
Rare diseases affect 1 in 20 people but often go genetically undiagnosed. While over 10,000 rare diseases have been cataloged, fewer than half have been tied to a genetic cause. Major initiatives around the world are attempting to gather genetic and clinical data from enough patients to enable the discovery of the genetic causes of the remaining unresolved rare diseases. Although the cost of sequencing a human genome is now only a few hundred dollars (down from $1B 23 years ago), large studies still require significant logistical effort and coordination between many clinicians and researchers. As a result, few large genetic datasets from rare disease patients currently exist, and those that do exist are cumbersome and difficult to work with. Genomic data are typically stored in unmodifiable files that together occupy many terabytes, and they need to be reconciled with pedigree data, phenotypes and external data sources before statistical analyses can generate discoveries. The discovery process is therefore technically challenging.
"While rare diseases are individually rare, collectively they are quite common. It is important for our understanding of human biology and for the development of diagnostics and therapeutics that the remaining causes are found…"
Ernest Turro, Ph.D.
The research performed by Ernest’s team and his international collaborators led to the discovery of three new genetic causes of rare diseases. The team used a large genetic dataset from rare disease patients who participated in the 100,000 Genomes Project (100KGP). This UK initiative sequenced the genomes and collected clinical data of 34,523 patients and 43,016 unaffected relatives across 29,741 families. Together with Daniel Greene, PhD, a member of Ernest’s research group, the team developed a computational framework, named the Rareservoir, that greatly simplifies the distillation of the key rare genetic and phenotypic data from large datasets. Occupying only 5.5GB, the 100KGP Rareservoir allowed the team to identify 260 genetic associations with rare diseases using their previously developed statistical approaches. This large reduction in data size was made possible by the rarity of the genetic variants responsible for genetic diseases. Because natural selection keeps these variants rare over time (affected individuals tend to have fewer children), the team could discard the large part of the genetic information that corresponds to common variants and ultimately analyze only 1% of the genetic data.
“We took advantage of the fact that the genetic variants responsible for rare diseases are typically kept rare in the human population by natural selection, because affected individuals tend to have few children, if any,” says Ernest Turro. “This meant that we could discard the genetic information corresponding to common variants in the human population without throwing away the key disease-causing variants.”
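The filtering idea described above can be illustrated with a short Python sketch. It is only a toy illustration under stated assumptions, not the Rareservoir's actual implementation: the `Variant` record, the `keep_rare` helper, the example data and the 0.1% allele frequency cutoff are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not the Rareservoir schema)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    allele_count: int   # copies of the alternate allele seen in the cohort
    allele_number: int  # total number of alleles genotyped at this site


def allele_frequency(variant: Variant) -> float:
    """Fraction of genotyped alleles that carry the alternate allele."""
    if variant.allele_number == 0:
        return 0.0
    return variant.allele_count / variant.allele_number


def keep_rare(variants, max_af=0.001):
    """Keep variants whose cohort allele frequency is below max_af.

    The default cutoff of 0.1% is an assumed value for illustration only.
    """
    return [v for v in variants if allele_frequency(v) < max_af]


# Toy data: two rare variants and two common ones (positions are made up).
variants = [
    Variant("1", 100, "A", "G", allele_count=2, allele_number=20000),
    Variant("1", 200, "C", "T", allele_count=5000, allele_number=20000),
    Variant("2", 300, "G", "A", allele_count=1, allele_number=20000),
    Variant("2", 400, "T", "C", allele_count=40, allele_number=20000),
]

rare = keep_rare(variants)
for v in rare:
    print(v.chrom, v.pos, v.ref, v.alt, allele_frequency(v))
```

In this toy example, only the variants at positions 100 and 300 pass the filter. Dropping common variants before any downstream analysis is what allows the retained data to be far smaller than the original files.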
Ernest and his colleagues confirmed 241 associations between genes and rare disease classes in the 100KGP and identified 19 associations not present in the literature. Of the 19 novel associations, three candidates were prioritized for further validation. Working with international collaborators in Saudi Arabia, Japan, Belgium, the UK and the USA, the team validated all three associations:
- Firstly, they demonstrated that loss-of-function variants in the transcription factor-encoding gene ERG result in primary lymphoedema, a rare disease that manifests as tissue swelling caused by an accumulation of fluid that the body’s lymphatic system does not adequately drain.
- Secondly, they reported that truncating variants in the last exon of PMEPA1 result in a familial thoracic aortic aneurysm disease. A thoracic aortic aneurysm is a weakened area in the aorta that is at elevated risk of rupture, which can cut off the supply of blood to the rest of the body.
- Thirdly, they observed that loss-of-function variants in the G-protein-coupled receptor-encoding gene GPR156 give rise to recessive congenital hearing impairment or deafness.
With these discoveries published, patients with any of these three rare diseases can obtain a genetic diagnosis for the first time and identify affected relatives. This is because discoveries such as those made by Ernest and colleagues are fed into diagnostic panels used internationally for interpreting the genetic variants carried by patients. The remaining 16 associations are being explored to confirm which might be causal. Importantly, the Rareservoir will remain an invaluable tool for working with genetic data from large collections of rare disease patients as those collections continue to grow. For more depth, read the freely accessible paper published in Nature Medicine.
"Many people with a rare disease struggle for many years to obtain a genetic diagnosis. By developing and applying statistical methods and computational approaches to find new causes of rare diseases, we hope to expand knowledge of the underlying causes of these diseases, hasten the time to diagnosis for patients, and pave the way for the development of treatments."
Ernest Turro, Ph.D.