Junk Store Opened: Millions of Bargains

Many moons ago, when I was nobbut a lad and sequencing the human genome was 30 years away, we nevertheless knew that there was something very odd about our genetic code. We knew there were three thousand million base pairs but that only a tiny fraction of that (a few percent) was necessary to encode all the proteins found in our bodies. What was the rest doing? As a sort of explanation two terms came into vogue: ‘selfish DNA’ (meaning stuff that just reproduced itself because it was there) and ‘junk DNA’ (meaning everything that didn’t code for proteins).

One of the few predictions I’ve made that turned out to be right was embodied in a refusal to use either term – and if there’s anyone who can recall anything of my supervisions (that is, what the rest of the world calls tutorials) they might back me up on this. It’s true that, as time went by, we increasingly appreciated that non-coding DNA is important in controlling whether individual genes are switched on or off – that is, whether they make RNA and from that protein, according to sequences embedded in the DNA, or whether they make nothing.

Ewan's scheme

However, getting a real grip on what all that seemingly spare DNA is doing has turned out to be so challenging that it is only now, 10 years after the first human sequence was produced, that we have hard data to go on. That unveiling has come from a follow-up called the ENCODE (Encyclopedia Of DNA Elements) programme – an international cooperative of extraordinary scale, with its heart at The Sanger Centre just outside Cambridge and with its head one Ewan Birney. Birney is a computational biologist – a new breed of scientist whose strength lies in bringing to bear methods that make sense of the vast amounts of data generated by current DNA sequencing techniques.

A glance at the summary of what ENCODE involved suggests that, in the unlikely event of his getting bored with science, Birney would make a pretty good fist as Secretary-General of the United Nations. I’d like to try and persuade you that scientists are wonderful and lofty forms of our species but, alas, in fact they are generally ambitious, driven, self-centred, ruthless and intolerant. To make matters worse, quite a few are very smart. To get nearly 500 of the world’s best to sink self-interest and focus on one aim in a multi-national, multi-lingual, multi-racial collaboration that requires rigorous assessment of data and in which the scope for individual glory is almost negligible might well qualify as the greatest feat of man-management in the history of the human race.

So Birney’s a star but what did the world get for its money? The short answer is that we now know that, far from being ‘junk’, most of our DNA – over 80% – does something useful. Whilst only 1.6% carries protein-coding genes, much of the rest is important in regulating the activity of proteins generated from coding genes. The regulatory activity comes in the form of RNA: as we noted just now, DNA makes RNA makes protein – and the DNA sequences involved are called genes. But there’s a second class of genes, ones that transcribe DNA sequence into RNA – but then things stop. The RNA doesn’t go on to direct the making of proteins but rather goes off and regulates, well, almost everything. So this second group are non-coding genes – because they don’t ‘make’ proteins.

How does the RNA of non-coding genes work? Well, in essence by sticking to other RNAs and to proteins themselves. What ENCODE has revealed is a panoply of types of RNA that come in a wide range of sizes and have a finger in almost every bit of the cellular pie. So these varied RNAs act as cellular controllers at many levels and, because cancers result from the subversion of normal control, you would correctly guess that mutations in non-coding genes can be every bit as important as those that affect protein function directly.

Does this help in dealing with cancer and are there any bargains in the junk store? The short-term answers are ‘no’ and ‘lots – in theory’. As units of this army of RNAs help to control how we work normally, they also can go wrong – become mutated – so we have a new set of potential players in the cancer game. Detecting when individual RNAs join in won’t be so difficult: the real cancer challenge now is not target-spotting, it’s making the bullets to hit the targets.


Maher, B. (2012). ENCODE: The human encyclopaedia. Nature 489, 46-48.

Birney, E. (2012). The making of ENCODE: Lessons for big-data projects. Nature 489, 49-51.

Genetic Roulette in a New World

In 2003 it was a sensation. No really – it’s probably true that in medicine only the first human heart transplant operation back in 1967 has generated as much publicity. That was in the pre-web dark age but, nevertheless, the South African surgeon Christiaan Barnard was immortalized as a global hero: even the patient’s name was on everyone’s lips (Louis Washkansky if you’re struggling to recall) and you can re-live the whole event at the Groote Schuur Hospital museum in Cape Town. But, although 2003 was just a decade ago, in today’s world sensations fade almost with the following dawn, whether they are pop groups or life-changing scientific advances.

So if now you mention “The Human Genome Project” to a man on the Clapham omnibus you are likely to elicit only a puzzled look. What happened in 2003 was of course that the genetic code – that is the sequence of bases in DNA – was revealed for the entire human genome. And an astonishing triumph it was, not least because, in contrast to almost everything else in history with a major British component, it was completed within schedule and under cost.

The feat was deservedly greeted with a fanfare of public interest unprecedented for any scientific project short of the early space missions. President Clinton in the White House was hooked up live to whoever was living in No. 10 at the time, the leading British scientists in this amazing project dropped in for tea and Mike Dexter, then Director of The Wellcome Trust and a restrained and conservative fellow – being a scientist – described it somewhat inelegantly as “… the outstanding achievement not only of our lifetime, but in terms of human history.”

The Sanger Centre, Cambridge

The Genome Analysis Centre, Norwich

The Genome Institute at Washington University

However, even more remarkable is what happened next. The ensuing decade has brought technical advances so breathtaking as to almost overshadow the original human genome project itself. This quite staggering revolution has seen the introduction of fully automated, high throughput flow cells that simultaneously carry out hundreds of millions of separate sequencing reactions – just say that slowly. In the jargon it’s called ‘massively parallel sequencing’. The upshot of this stunning technology is that sequencing speed has gone up by 100 million times whilst, almost unbelievably, the cost has dropped by a factor of 10,000. Even computing science can’t match that progress!

One consequence of this incredible, though relatively unpublicised, revolution is that genomes can now be sequenced on an industrial scale and in the years to come that is going to impact on every facet of mankind’s existence. Thus far the field of cancer has been the foremost recipient of this technological broadside with thousands of tumour genomes now sequenced. This has unveiled the almost incomprehensible panoply of genetic changes that cells can sustain and yet emerge still capable of proliferating. One of the first cancer genomes to be sequenced was that of a female who had died from leukemia. The work was carried out by The Genome Institute at Washington University in St. Louis, Missouri and since then, under its Director Richard Wilson, this group has continued to be a world leader in genomics and in particular in unravelling the extraordinary complexity of the group of cancers collectively called leukemias.

Wilson and his colleagues know, of course, that they are at the forefront of the most extraordinary transformation in medicine – because eventually it will affect everyone – though Rick Wilson himself is as improbable a revolutionary as you could imagine: a gentle, soft-spoken American, he’s what on this side of the pond would be called a thoroughly nice chap.

However, if they had any doubts about the direction in which their science was leading the world, these would have been dispelled when one of their own community, Lukas Wartman, was diagnosed with a very rare form of leukemia. This had first appeared ten years ago when Lukas was a student completing his medical degree at Washington University, and at that time it had been treated with chemotherapy and a bone-marrow transplant.

In the following years, Dr. Wartman had pursued his career goal of becoming a practicing oncologist specializing in leukemia until, in July 2011, the disease returned and he went into relapse. As his condition deteriorated rapidly and only one outcome seemed possible, those treating him turned in desperation from conventional approaches to local expertise. They applied genomic analysis to his cancer cells. From the vast number of disruptions identified, one in particular stood out: an abnormally expressed gene that had previously been associated with other types of leukemia but is very rare in the form Wartman had developed.

By an unlikely chance there is a drug available that can knock out the activity of the protein made by that gene. Its effect was phenomenal, restoring the normal blood count and achieving complete remission. This wonderful outcome does not mean that Dr. Wartman is cured for life – but for now he is alive and well – and a co-author of the group’s latest paper – on leukemia.

He had been desperately unlucky in that the genetic roulette that is life generated in him a hand of mutations that drove the development of a rare and almost invariably lethal form of leukemia. But life also smiled on Lukas Wartman in that circumstances found him at the heart of the genomics revolution that is ushering in a new world of medicine. His isn’t the first life to be saved through the use of this fabulous technology but he is one of the first few who will, in years to come, be followed by many as these marvellous methods for diagnosis and the design of treatment come into widespread use.