Focus on polygenic risk scores
First, let’s look at a paradigmatic example of a polygenic score publication. Inouye et al. constructed a polygenic risk score for coronary artery disease (CAD) and reported a hazard ratio of 4.17 for those in the top 20% of their score compared with the bottom 20%. As a single predictor (judged by area under the ROC curve, also known as the C-statistic), it outperforms each of smoking, diabetes, hypertension, body mass index, self-reported high cholesterol, and family history taken individually, though it does not do as well as all of them put together. They conclude that their score “strengthens the concept of using genomic information to stratify individuals for CAD risk in general populations and demonstrates the potential for genomic screening in early life to complement conventional risk prediction.”
In reaction to articles such as this, several clear lines of criticism have emerged:
- The scores are only applicable in the ancestral population that they were developed in. Combine this with the well-publicised fact that almost all studies are on Caucasian populations (reviewed here), and that the assays used are SNP chips whose genetic variants were chosen based on variant frequencies within European populations, and several issues are immediately apparent. As an alternative to producing scores separately for each ancestral population, one suggestion is that studies based on African populations would be less biased and more generalizable to other populations. This rests on the simple fact that non-African populations have been subject to more genetic drift, i.e. change in genetic variant frequency because of small population sizes. It is also the case that there are hazards aplenty in using differences between populations to infer anything about genetics, and particularly about natural selection (see e.g. this article).
- Hazard ratios have to be very high to be useful as screening tools. In an article that has been well circulated on Twitter, “The illusion of polygenic disease risk prediction”, the authors point out that “the paradox is largely explained by the fact that odds ratios or hazard ratios typically compare risks in the tails of a single risk distribution, but these ratios ignore the proportions of individuals who will or will not develop the disease that fall in the region between the tails of the distribution”. The first author, Nicholas Wald, has been pointing this out for a long time. Note that this does not just apply to genomics: his 1999 paper takes cholesterol levels for heart disease as its case study and shows how poor a screening test this is. (The authors state that no future polygenic risk score will produce a high enough relative risk; it would be good to check this against a score that captured the full heritability estimate for a given trait.) This argument ought not to be news. I enjoyed this slide deck by epidemiologist Cecile Janssens that traces the history of the prospect of predictive genomic tests, and some of the known pitfalls.
- The role of the environment means that genetics is often not as useful as these scores would suggest. If the environment changes, e.g. all people stop smoking, then the polygenic scores change too. If the genetics is mediated by an independently measurable and modifiable intermediate phenotype, e.g. cholesterol levels, then it is much less useful to know the genetics. Though see the Inouye paper showing that their score is relatively independent of other known risk factors.
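To make the screening argument above concrete, here is a minimal sketch, not the calculation from the Wald paper itself. It assumes the score is normally distributed with the same spread in people who do and do not go on to develop disease, treats the reported hazard ratio as if it were an odds ratio, and takes the top and bottom 20% cut-offs from the unaffected distribution (a rare-disease approximation). The function names are illustrative only. Under those assumptions, a top-versus-bottom-fifth ratio of about 4 corresponds to picking up only a small minority of future cases (on the order of one in eight) if 5% of unaffected people are flagged.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def tail_odds_ratio(d, tail=0.2):
    """Odds ratio comparing the top and bottom `tail` fraction of the score.

    Assumes unaffected scores ~ N(0, 1) and affected scores ~ N(d, 1).
    Cut-offs are taken on the unaffected distribution, so the unaffected
    fraction in each tail is `tail` and cancels out of the ratio.
    """
    cut = STD_NORMAL.inv_cdf(1 - tail)
    affected_top = 1 - STD_NORMAL.cdf(cut - d)
    affected_bottom = STD_NORMAL.cdf(-cut - d)
    return affected_top / affected_bottom


def separation_for_odds_ratio(target_or, tail=0.2):
    """Find the separation d (in sd units) giving the target tail odds ratio.

    Uses bisection; tail_odds_ratio increases monotonically with d.
    """
    lo, hi = 0.0, 6.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if tail_odds_ratio(mid, tail) < target_or:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def detection_rate(d, false_positive_rate=0.05):
    """Fraction of affected people above the cut-off that flags
    `false_positive_rate` of unaffected people."""
    cut = STD_NORMAL.inv_cdf(1 - false_positive_rate)
    return 1 - STD_NORMAL.cdf(cut - d)


def auc(d):
    """Area under the ROC curve for two equal-variance normal distributions."""
    return STD_NORMAL.cdf(d / 2 ** 0.5)


if __name__ == "__main__":
    # 4.17 is the ratio quoted above; the larger values are hypothetical.
    for ratio in (4.17, 10, 50, 200):
        d = separation_for_odds_ratio(ratio)
        print(
            f"tail ratio {ratio:>6}: separation {d:.2f} sd, "
            f"AUC {auc(d):.2f}, "
            f"detection rate at 5% false positives {detection_rate(d):.0%}"
        )
```

Running the script prints one line per ratio, which shows how large the tail ratio has to become before most cases are detected at a fixed false-positive rate.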
These skeptical voices are not preventing a full-scale rush to applications. Color announced a 100,000-person initiative to use low-coverage whole genome data to provide individuals with polygenic scores.
- A new documentary about James Watson shows that his views on race have not changed. The Times reports that he had a chance to salvage his reputation on race but made things worse. The relationship between genetics and race is not straightforward, as this survey of over 500 genetics professionals on their views on race demonstrated. They found that for “many genetics professionals the questions of what race is and what race means remain both professionally and personally contentious.”
Science and Applications
- Antonio Regalado has summed up the top advances in Genetics from 2018 — seeing it all in one place is definitely impressive.
- The latest chapter on heritability comes from a study of Aetna’s database of insurance claims, covering about 45m individuals. Their dataset has over 56,000 pairs of twins born since 1985, and over 700,000 sibling pairs. They connect ZIP codes to environmental factors of interest: SES, air pollution and weather/climate. They found that the variance explained by these measures was much lower than that from genetics and shared environment, with obesity being the phenotype with the strongest link to SES (var=0.027). Monthly cost of data was estimated at 29% heritable and 30% due to shared environment. The respective figures for co-morbidities were 43% and 24%.
- There is often concern that receiving ambiguous results can lead to increased worry for individuals. But a new study based on a sample of over 5000 women receiving HBOC genetic risk testing found that receiving uncertain results did not increase worry among women compared to a negative result.
- The BabySeq project reports on results of exome sequencing of 159 newborns (127 healthy and 32 in the NICU). Of these, 15 (9.4%) had genetic variants associated with a disease that could be managed in childhood. Genomic sequencing for newborns remains a contentious area.
- An AP poll found 70% of Americans supportive of genetic editing “to prevent an incurable or fatal disease a child otherwise would inherit, such as cystic fibrosis or Huntington’s disease”, about two thirds to “prevent a child from inheriting a non-fatal condition such as blindness, and even to reduce the risk of diseases that might develop later in life, such as cancers”, and about 70% oppose “using gene editing to alter capabilities such as intelligence or athletic talent, and to alter physical features such as eye color or height.” I can’t find any original data, just reports e.g. here.
- I thought this was an interesting story about how much of an impact the classification of a disease can make — in this case, the efforts to have schizophrenia classified as a brain disease so that it was covered by a new CDC program. Why does this matter? Mental conditions receive less funding and health insurance is often less generous. Strong echoes of dualism here.
- Stat reports on Science with borders: A debate over genetic sequences and national rights threatens to inhibit research, reflecting the ongoing debate about the status of genetic sequences of pathogens collected in different countries.
- Australian researchers have called for a Genetic Data Protection Act in reaction to police use of genetic data for (a) familial searching and (b) prediction of phenotypes.