There’s a general view that most folk don’t know much about science and, because science plays a more prominent role in our lives almost day by day, that’s considered to be a Bad Thing. Us scientists are therefore always being told to get off our backsides and spread the word – and I try to do my bit in Betrayed by Nature, in Secret of Life (a new book shortly to be published) and in these follow-up blogs.
We may be making some progress – and, I have to admit, television has probably done more than I have – though I am available (t.v. & movie head honchos please note). As one piece of evidence you could cite the way ‘DNA’ has become part of the universal lexicon, albeit often nonsensically. As a witness I call Sony Corp. Chief Executive Kazuo Hirai, as reported in The Wall Street Journal: “I’ve said this from day one. Some things at Sony are literally written into our DNA …”
Well, of course, that’s gibberish, Kazuo old bean – but we know what you mean. Or do we? Most probably couldn’t tell you what the abbreviation stands for – but that doesn’t matter if they can explain that it’s the stuff (a ‘molecule’ would be better still!) that carries the information of inheritance and, as such, is responsible for all life. Go to the top of the class those who add that the code is in the form of chemicals called bases and there are just four of them (A, C, G & T). Something that simple doesn’t seem enough for all life but the secret lies in the vast lengths of DNA involved. The human genome, for example, is made up of three billion letters.
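For those who like to see the sums, here is a minimal Python sketch of why four letters go a long way once the strings get long. It simply counts possibilities; the names are mine, and treating each base as two bits of storage is a simplifying assumption for illustration, not a statement about how cells use the information.

```python
import math

BASES = "ACGT"                  # the four letters of the DNA code
GENOME_LENGTH = 3_000_000_000   # letters in the human genome, as quoted in the text

def possible_sequences(length: int) -> int:
    """Number of distinct DNA strings of the given length."""
    return len(BASES) ** length

# Assumption for illustration: one of four letters needs log2(4) = 2 bits.
bits_per_base = math.log2(len(BASES))
genome_megabytes = GENOME_LENGTH * bits_per_base / 8 / 1_000_000

print(possible_sequences(10))   # 1048576 different ten-letter strings
print(genome_megabytes)         # 750.0
```

So even ten letters give over a million possible strings, and on this crude count three billion letters amount to something like 750 megabytes of raw text.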
A little bit of what is now history …
In the mid-1980s a number of scientists from around the world began to talk about the possibility of working out the sequence of letters that make up human DNA and thus identifying and mapping all the genes encoded by the human genome. From this emerged The Human Genome Project, a massive international collaboration, conceived in 1984 and completed in 2003. I quite often refer to this achievement as the ‘Greatest Revolution’ – meaning the biggest technical advance in the history of biology.
As that fantastic enterprise steadily advanced to its triumphant conclusion, it was accompanied by a series of mini-revolutions in technology that sky-rocketed the speed of sequencing and slashed the cost – the combined effect being an increase in the efficiency of the whole process of more than 100 million-fold.
Brings us to the present …
These quite astonishing developments have continued since 2003 such that by 2009 it was possible to sequence 12 individuals in one study. By August 2016 groups from all over the world, coming together under the banner of The Exome Aggregation Consortium (ExAC), had raised the stakes about 5,000-fold by sequencing no fewer than 60,706 individuals.
The name of the outfit tells you that there’s what you might think of as a very small swizz here: they didn’t sequence all the DNA, just the regions that code for proteins (the exons, which together make up the exome) – only about 1% of the three billion letters. But what highlights the power of current methods is not only the huge number of individuals sequenced but the depth of coverage – that is, the number of times each base (letter) in each individual exome was sequenced. In effect, it’s doing the same experiment so many times that random errors are all but eliminated. Thus even genetic variants in just one person can be picked out.
Figure: Sequence variants between individuals. For most proteins the stretches of genomic DNA that encode their sequence are split into regions called exons. All the exons in a genome make up the exome. By repeated sequencing The Exome Aggregation Consortium have shown that genetic variants in even one person can be reliably identified. Variants from the normal sequence found in four people are shown in red, bold letters.
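The idea that depth of coverage irons out errors can be shown with a toy example in Python. This is only a sketch under loud assumptions: the reads are invented, already lined up and all the same length, and a simple majority vote stands in for the statistical methods real sequencing pipelines use. It also ignores the fact that we each carry two copies of most genes. The function names are hypothetical.

```python
from collections import Counter

def consensus(reads: list[str]) -> str:
    """Majority-vote base at each position across equal-length, pre-aligned reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

def variants(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Positions (counted from 0) where the sample differs from the reference."""
    return [(i, r, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]

reference = "ACGTACGT"   # made-up 'normal' sequence
reads = [                # five made-up reads of the same stretch from one person
    "ACGTTCGT",
    "ACGTTCGT",
    "ACCTTCGT",          # one-off sequencing error at position 2
    "ACGTTCGT",
    "ACGTTCGA",          # one-off sequencing error at position 7
]

called = consensus(reads)
print(called)                       # ACGTTCGT
print(variants(reference, called))  # [(4, 'A', 'T')]
```

The two one-off errors are outvoted, while the change at position 4, which turns up in every read, survives as a genuine variant.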
It turns out that there are about 7.5 million variants and they pop up remarkably often – at one in every eight sites (bases). About half occur only once (which illustrates why DNA fingerprinting, aka DNA profiling, is so sensitive). As Jay Shendure put it, this gives us a “glimpse of the bottom of the well of genetic variation in humans.”
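A quick back-of-the-envelope check of the numbers quoted above, as a short Python sketch. It takes the rounded figures in the text at face value, so the results are rough, and the variable names are mine.

```python
individuals_2009 = 12         # people sequenced in the 2009 study
individuals_2016 = 60_706     # people sequenced by ExAC
fold_increase = individuals_2016 / individuals_2009

variant_count = 7_500_000     # "about 7.5 million variants"
sites_per_variant = 8         # "one in every eight sites"
implied_sites = variant_count * sites_per_variant
seen_once = variant_count // 2   # "about half" occur only once

print(round(fold_increase))   # 5059
print(implied_sites)          # 60000000
print(seen_once)              # 3750000
```

In other words, the jump in the number of people is indeed about 5,000-fold, and the variant figures imply that something like 60 million sites were examined, with nearly four million variants seen in just one individual.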
One of the major results of this study is that, by filtering out common variants from those associated with specific diseases, it will help to pin down the causes of Mendelian diseases (i.e. genetic disorders caused by a change in a single gene, e.g., cystic fibrosis, haemophilia, sickle-cell anaemia, phenylketonuria). It’s clear that, over the next ten years, tens of millions of human genomes will be sequenced, which will reveal the underlying causes of the thousands of genetic disorders.
The prize … and the puzzle
The technology is breathtaking, the amount of information being accumulated beyond comprehension. Needless to say, private enterprise has leapt on the bandwagon and you can now get your DNA analysed by, for example, 23andMe who offer “a personalised DNA service providing information and tools for individuals to learn about and explore their DNA. Find out if you are at risk for passing on an inherited condition, who you’re related to etc.” All for a mere $199!!
But you could say that the endpoint – the reason for grappling with DNA in the first place – is easy to see: eventually we will be able to define the molecular drivers of all genetic diseases and from that will follow ever improving methods of treatment and prevention.
Nevertheless, in that wonderful world I suspect we will still find ourselves brought up short by the underlying question: how on earth does DNA manage to carry the information necessary for all life?
For those who like to ponder such things, in the next piece we’ll try to help by looking at DNA from a different angle.
Ng, S.B. et al. (2009). Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461, 272–276.
Lek, M. et al. (2016). Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291.