January 2019 – All The Coolest Genomics

Updates to the human germline editing saga

News that various US-based academics knew about He’s plans. The New York Times outlines that various American academics knew what He was planning but kept silent, mostly because they didn’t know what to do about research happening in China, and/or because they thought they had dissuaded He from continuing. Meanwhile, STAT reports that Deem, He’s US-based thesis advisor, was listed as the last author of the manuscript He submitted to Nature (which declined to review it). The extent of his involvement is unclear. And the Associated Press broke the news that He had informed Nobel-winning geneticist Mello about the pregnancy in April.
The NASEM report, which I covered here, stopped short of calling for a moratorium. He seemed to think that he was not contravening it. This has led to renewed calls for a clearer stance. Debate continues over whether a moratorium is the correct response. The New York Times editorial team issued a statement against a moratorium, and in favor of diversifying the deciders and engaging the public. “It may be impossible to prevent truly rogue actors, but it is possible to slow them down without stopping everyone else.” Jennifer Doudna opposes a moratorium; Feng Zhang supports one.
A larger question looms about the relevant role of scientists in this debate. In an opinion piece in STAT, Are scientists’ reactions to ‘CRISPR babies’ about ethics or self-governance?, the authors make a strong case for the latter: “We believe that the alarm being sounded by the scientific community isn’t really about ethics. It’s about protecting a particular form of scientific self-governance, which the “ethics” discourse supports.” “Scientists articulated more concern about maintaining their authority to unilaterally transform human biology than a willingness to have a public debate about the ethics of whether — and under what conditions — such transformation should take place.” A Hastings Report article outlines the differences in opinion being expressed by scientists about the relevant roles of scientists and society in the future of the technology.

Controversy

In a long piece in the New York Times Magazine on paleogenomics, takehomes here, Gideon Lewis-Kraus points to the field’s major and recent successes, but also highlights how some archeologists fear that it traffics in “grand intellectual narratives” that history warns us against. This is an indication of culture wars between geneticists and others.

Science

A study that doubled the number of microbial genomes available (150,000) by producing data from metagenomic studies including those covering non-Westernized countries. They grouped their ~150,000 new sequences into 5000 “species bins”, 77% of which were novel. The new sequences dramatically increase the mappability of samples to 87%.
A fine-grained (682 bp) map of where crossovers occur on chromosomes, from Decode. The CEO of Decode, and author, Dr Stefansson (source): “The classic premise of evolution is that it is powered first by random genetic change. But we see here in great detail how this process is in fact systematically regulated – by the genome itself and by the fact that recombination and de novo mutation are linked. We have identified 35 sequence variants affecting recombination rate and location, and show that de novo mutations are more than fifty times more likely at recombination sites than elsewhere in the genome. Furthermore, women contribute far more to recombination and men to de novo mutation, and it is the latter that comprise a major source of rare diseases of childhood. What we see here is that the genome is an engine for generating diversity within certain bounds. This is clearly beneficial to the success of our species but at great cost to some individuals with rare diseases, which are therefore a collective responsibility we must strive to address.”
Polygenic score for lifespan, explaining 1% of the phenotypic variance, which is 5% of the heritability (implying a heritability of roughly 20%). Those in the top 10% of the score can expect, on average, to live 5 years longer than those in the bottom 10%.
GWAS for risk tolerance and risky behaviors, with genetic overlaps found between different “risky” phenotypes, and with various personality traits.

Applications

Following the successful completion of Genomics England’s 100,000 Genomes Project, healthy patients will soon be able to pay for genetic tests through the NHS. The NHS has, up until this point, been free at the point of service. Fears of a “two tier system” are weighed against the idea that “Every genome sequenced moves us a step closer to unlocking life-saving treatments.”
Ongoing issues with patients getting access to their own data from testing companies.
Data sharing issues are also a hot topic in the research space; see this opinion piece arguing that all publicly available data should be completely open access. Advances will be made easier by improvements in tracking, in a fixed format, the ways in which data can be used. That is the purpose of the GA4GH’s Data Use Ontology, now open for comment.
Identical twins tried out various ancestry kits and received somewhat different results.
A Silicon Valley startup is offering tumor genetic testing and personalized therapy recommendations, for dogs. The idea is that the dogs could be treated with therapies for humans, under something akin to compassionate use programs.
An editorial in Nature calls on the WHO to integrate genomic testing of cholera to help fight outbreaks. This new-ish field is called genomic phylogeography.

Regulation

More powers for DNA forensics. As of the beginning of January, the Rapid DNA Act comes into force. Rapid DNA machines sitting inside police stations allow police officers to obtain sequence results in 90 minutes. The Act allows for police to upload this data to CODIS, the National DNA database, and look for matches to e.g. previous crime scenes.

In other news: The preprint server for biology, BioRxiv, just turned five. In 2018, about 1711 preprints were posted per month, and in October there were over 1 million downloads. A project called Rxivist allows users to see which preprints are generating the most Twitter attention. I have added this to my bookmarks, and will be using it to help inform this round-up henceforth!

Focus on polygenic risk scores

First, let’s look at a paradigmatic example of a polygenic score publication. Inouye et al constructed a Polygenic Risk Score for CAD, gaining a hazard ratio of 4.17 for those in the top 20% compared to bottom 20% of their score. It is better as a single predictor (based on Area Under ROC curve, also known as C-statistic) than any one of smoking, diabetes, hypertension, body mass index, self-reported high cholesterol, and family history (it does not do as well as all of them put together). They conclude that their score “strengthens the concept of using genomic information to stratify individuals for CAD risk in general populations and demonstrates the potential for genomic screening in early life to complement conventional risk prediction.”
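
For readers less familiar with the C-statistic: it can be read as the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counting as half. Here is a minimal Python sketch of that definition. The function name and the scores are made up for illustration and have nothing to do with the Inouye data.

```python
from itertools import product


def c_statistic(case_scores, control_scores):
    """Fraction of case/control pairs in which the case has the higher score.

    Ties count as half a win. This is the pairwise definition of the area
    under the ROC curve; it is quadratic in sample size, so it only suits
    small illustrative inputs.
    """
    pairs = 0
    wins = 0.0
    for case, control in product(case_scores, control_scores):
        pairs += 1
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    if pairs == 0:
        raise ValueError("need at least one case and one control")
    return wins / pairs


if __name__ == "__main__":
    # Hypothetical risk scores, invented for this example.
    cases = [0.9, 0.4, 0.7]
    controls = [0.1, 0.4, 0.8, 0.3]
    print(f"C-statistic: {c_statistic(cases, controls):.3f}")
```

A value of 0.5 means the score does no better than a coin toss at ranking cases above controls, and 1.0 means perfect separation.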

In reaction to articles such as this, several clear lines of criticism have emerged:

The scores are only applicable in the ancestral population that they were developed in. Combine this with the well publicised fact that almost all studies are on Caucasian populations (reviewed here), and that the assays used are SNP chips whose genetic variants were chosen based on frequencies of variants within European populations, and several issues are immediately apparent. As an alternative to producing scores separately for each ancestral population, one suggestion is that studies based on African populations would be less biased and more generalizable to other populations. It is based on the simple fact that non-African populations have been subject to more genetic drift – i.e. change in genetic variant frequency because of small population sizes. It is also the case that there are hazards aplenty in using differences between populations to infer anything about genetics, and particularly about natural selection (see e.g. this article).
Hazard ratios have to be very high to be useful as screening tools. In an article that has been well circulated on Twitter, “The illusion of polygenic disease risk prediction”, the authors point out that “the paradox is largely explained by the fact that odds ratios or hazard ratios typically compare risks in the tails of a single risk distribution, but these ratios ignore the proportions of individuals who will or will not develop the disease that fall in the region between the tails of the distribution”. The first author, Nicholas Wald, has been pointing this out for a long time. Note that this does not just apply to genomics: his 1999 paper takes cholesterol levels for heart disease as its case study, and shows how poor a screening test this is. (They state that no future polygenic risk score will produce a high enough relative risk; it would be good to check this, based on a score that captured the full heritability estimates for a given trait.) This argument ought not to be news. I enjoyed this slide deck by epidemiologist Cecile Janssens that traces the history of the prospect of predictive genomic tests, and some of the known pitfalls.
The role of the environment means that genetics is often not as useful as these scores would suggest. If the environment changes, e.g. all people stop smoking, then the polygenic scores change too. If the genetics is mediated by an independently measurable and modifiable intermediate phenotype, e.g. cholesterol levels, then it is much less useful to know the genetics. Though see the Inouye paper showing that their score is relatively independent of other known risk factors.
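
To get a feel for the hazard ratio criticism above, here is a rough Python sketch under assumptions that are mine and not taken from the paper: scores are normally distributed with the same spread in affected and unaffected people, quintiles are defined on the unaffected distribution, and an odds ratio stands in for a hazard ratio. All function names are hypothetical. It converts a top-versus-bottom quintile odds ratio into the detection rate at a 5% false-positive rate.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed score distribution in unaffected people


def quintile_odds_ratio(shift):
    """Odds ratio of disease, top vs bottom quintile of the score.

    Assumes scores are N(0, 1) in unaffected and N(shift, 1) in affected
    people, with quintiles defined on the unaffected distribution, so the
    disease prevalence cancels out of the ratio.
    """
    top_cut = STD_NORMAL.inv_cdf(0.8)
    bottom_cut = STD_NORMAL.inv_cdf(0.2)
    affected = NormalDist(mu=shift)
    affected_in_top = 1.0 - affected.cdf(top_cut)
    affected_in_bottom = affected.cdf(bottom_cut)
    return affected_in_top / affected_in_bottom


def detection_rate(shift, false_positive_rate=0.05):
    """Share of affected people above the cutoff that flags the given
    share of unaffected people."""
    cutoff = STD_NORMAL.inv_cdf(1.0 - false_positive_rate)
    return 1.0 - NormalDist(mu=shift).cdf(cutoff)


def shift_for_odds_ratio(target, low=0.0, high=5.0, steps=60):
    """Bisection for the shift that gives a target quintile odds ratio.

    The odds ratio rises with the shift; the search range of 0 to 5
    standard deviations is an arbitrary choice for this sketch.
    """
    for _ in range(steps):
        mid = (low + high) / 2.0
        if quintile_odds_ratio(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


if __name__ == "__main__":
    for odds_ratio in (2.0, 4.0, 10.0, 50.0):
        shift = shift_for_odds_ratio(odds_ratio)
        print(f"odds ratio {odds_ratio:>5}: shift {shift:.2f} SD, "
              f"detection rate {detection_rate(shift):.1%} at 5% false positives")
```

Under these assumptions, an odds ratio of about 4 between the extreme quintiles corresponds to a detection rate in the region of 12 to 13% at a 5% false-positive rate, which is the flavour of the paradox the authors describe: most people who go on to develop the disease sit in the middle of the distribution.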

These skeptical voices are not preventing a full-scale rush to applications. Color announced a 100,000 person initiative to use low-coverage whole genome data to provide individuals with polygenic scores.

Controversy

A new documentary about James Watson shows that his views on race have not changed. The Times reports that he had a chance to salvage his reputation on race but made things worse. The relationship between genetics and race is not straightforward, as this survey of over 500 genetics professionals on their views on race demonstrated. They found that for “many genetics professionals the questions of what race is and what race means remain both professionally and personally contentious.”

Science and Applications

Antonio Regalado has summed up the top advances in Genetics from 2018 — seeing it all in one place is definitely impressive.
The latest chapter on heritability, from a study of Aetna’s database of insurance claims covering about 45m individuals. Their dataset has over 56,000 pairs of twins born since 1985, and over 700,000 sibling pairs. They connect zipcodes to environmental factors of interest: SES, air pollution and weather/climate. They found that the variance explained by these measures was much lower than that from genetics and shared environment, with obesity being the phenotype with the strongest link to SES (var=0.027). Monthly healthcare cost was estimated at 29% heritable and 30% due to shared environment. The respective figures for co-morbidities were 43% and 24%.
There is often concern that receiving ambiguous results can lead to increased worry for individuals. But a new study based on a sample of over 5000 women receiving HBOC genetic risk testing found that receiving uncertain results did not increase worry among women compared to a negative result.
The BabySeq project reports on results of exome sequencing of 159 newborns (127 healthy and 32 in the NICU). Of these, 15 (9.4%) had genetic variants associated with a disease that could be managed in childhood. Genomic sequencing for newborns remains a contentious area.
An AP poll found 70% of Americans supportive of genetic editing “to prevent an incurable or fatal disease a child otherwise would inherit, such as cystic fibrosis or Huntington’s disease”, about two thirds to “prevent a child from inheriting a non-fatal condition such as blindness, and even to reduce the risk of diseases that might develop later in life, such as cancers”, and about 70% oppose “using gene editing to alter capabilities such as intelligence or athletic talent, and to alter physical features such as eye color or height.” I can’t find any original data, just reports e.g. here.
I thought this was an interesting story about how much of an impact the classification of a disease can make — in this case, the efforts to have schizophrenia classified as a brain disease so that it was covered by a new CDC program. Why does this matter? Mental conditions receive less funding and health insurance is often less generous. Strong echoes of dualism here.

Regulation

STAT reports on Science with borders: A debate over genetic sequences and national rights threatens to inhibit research, reflecting the ongoing debate about the status of genetic sequences of pathogens collected in different countries.
Australian researchers have called for a Genetic Data Protection Act in reaction to police use of genetics for a) familial searching, and b) prediction of phenotypes.

Round-up Jan 16th- 31st

Round-up Dec 22 2018 – Jan 15 2019