No It Isn’t!

 

It’s great that newspapers carry the number of science items they do but, as regular readers will know, there’s nothing like the typical cancer headline to get me squawking ‘No it isn’t!’ Step forward The Independent with the latest: “Major breakthrough in cancer care … groundbreaking international collaboration …”

Let’s be clear: the subject usually is interesting. In this case it certainly is and it deserves better headlines.

So what has happened?

A big flurry of research papers has just emerged from a joint project of the National Cancer Institute and the National Human Genome Research Institute to make something called The Cancer Genome Atlas (TCGA). This massive initiative is, of course, an offspring of the Human Genome Project, the first full sequencing of the 3,000 million base-pairs of human DNA, completed in 2003. The intervening 15 years have seen a technical revolution, perhaps unparalleled in the history of science, such that now genomes can be sequenced in an hour or two for a few hundred dollars. TCGA began in 2006 with the aim of providing a genetic data-base for three cancer types: lung, ovarian, and glioblastoma. Such was its success that it soon expanded to a vast, comprehensive dataset of more than 11,000 cases across 33 tumor types, describing the variety of molecular changes that drive the cancers. The upshot is now being called the Pan-Cancer Atlas — PanCan Atlas, for short.

What do we need to know?

Fortunately not much of the humungous amounts of detail but the scheme below gives an inkling of the scale of this wonderful endeavour — it’s from a short, very readable summary by Carolyn Hutter and Jean Claude Zenklusen.

TCGA by numbers. The scale of the effort and output from The Cancer Genome Atlas. From Hutter and Zenklusen, 2018.

The first point is obvious: sequencing 11,000 paired tumour and normal tissue samples produced mind-boggling masses of data. 2.5 petabytes, in fact. If you have to think twice about your gigas and teras, 1 PB = 1,000,000,000,000,000 B, i.e. 10^15 B or 1,000 terabytes. A PB is sometimes called, apparently, a quadrillion — and, as the scheme helpfully notes, you’d need over 200,000 DVDs to store it.
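For anyone who likes to check their prefixes, here is a minimal Python sketch of the conversion. It assumes decimal (SI) units, in which each prefix is 1,000 times the one before, and the function and variable names are purely illustrative.

```python
# Decimal (SI) byte units: each prefix is 1,000 times the previous one.
BYTES_PER_UNIT = {
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
}


def convert(value, from_unit, to_unit):
    """Convert a data size between decimal byte units."""
    return value * BYTES_PER_UNIT[from_unit] / BYTES_PER_UNIT[to_unit]


tcga_pb = 2.5  # size of the TCGA dataset quoted in the text, in petabytes
tcga_tb = convert(tcga_pb, "PB", "TB")
tcga_bytes = tcga_pb * BYTES_PER_UNIT["PB"]

print(f"{tcga_pb} PB = {tcga_tb:,.0f} TB = {tcga_bytes:.1e} bytes")
```

So 2.5 PB comes to 2,500 terabytes, or 2.5 followed by fifteen zeros' worth of bytes.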

The 33 different tumour types included all the common cancers (breast, bowel, lung, prostate, etc.) and 10 rare types.

The figure of seven data types refers to the variety of information accumulated in these studies (e.g., mutations that affect genes, epigenetic changes (DNA methylation), RNA and protein expression, duplication or deletion of stretches of DNA (copy number variation), etc.).

After which it’s worth pausing for a moment to contemplate the effort and organization involved in collecting 11,000 paired samples, sequencing them and analyzing the output. It’s true that sequencing itself is now fairly routine, but that’s still an awful lot of experiments. But think for even longer about what’s gone into making some kind of sense of the monstrous amount of data generated.

And it’s important because?

The findings confirm a trend that has begun to emerge over the last few years, namely that the classification of cancers is being redefined. Traditionally they have been grouped on the basis of the tissue of origin (breast, bowel, etc.) but this will gradually be replaced by genetic grouping, reflecting the fact that seemingly unrelated cancers can be driven by common pathways.

The most encouraging thing to come out of the genetic changes driving these tumours is that for about half of them potential treatments are already available. That’s quite a surprise but it doesn’t mean that hitting those targets will actually work as anti-cancer strategies. Nevertheless, it’s a cheering point that the output of this phenomenal project may, as one of the papers noted, serve as a launching pad for real benefit in the not too distant future.

What should science journalists do to stop upsetting me?

Read the papers they comment on rather than simply relying on press releases, never use the words ‘breakthrough’ or ‘groundbreaking’ and grasp the point that science proceeds in very small steps, not always forward, governed by available methods. This work is quite staggering for it is on a scale that is close to unimaginable and, in the end, it will lead to treatments that will affect the lives of almost everyone — but it is just another example of science doing what science does.

References

Hutter, C. and Zenklusen, J.C. (2018). The Cancer Genome Atlas: Creating Lasting Value beyond Its Data. Cell 173, 283–285.

Hoadley, K.A. et al. (2018). Cell-of-Origin Patterns Dominate the Molecular Classification of 10,000 Tumors from 33 Types of Cancer. Cell 173, 291–304.

Hoadley, K.A. et al. (2014). Multiplatform Analysis of 12 Cancer Types Reveals Molecular Classification within and across Tissues of Origin. Cell 158, 929–944.


John Sulston: Biologist, Geneticist and Guardian of our Heritage

 

Sir John Sulston died on 6 March 2018, an event reported world-wide by the press, radio and television. Having studied in Cambridge and then worked at the Salk Institute in La Jolla, California, he joined the Laboratory of Molecular Biology in Cambridge to investigate how genes control development and behaviour, using as a ‘model organism’ the roundworm Caenorhabditis elegans. This tiny creature, 1 mm long, was appealing because it is transparent and most adult worms are made up of precisely 959 cells. Simple it may be but this worm has all the bits required to live, feed and reproduce (i.e. a gut, a nervous system, gonads, intestine, etc.). For his incredibly painstaking efforts in mapping from fertilized egg to mature animal how one cell becomes two, two becomes four and so on to complete the first ‘cell-lineage tree’ of a multicellular organism, Sulston shared the 2002 Nobel Prize in Physiology or Medicine with Bob Horvitz and Sydney Brenner.

Sir John Sulston

It became clear to Sulston that the picture of how genes control development could not be complete without the corresponding sequence of DNA, the genetic material. The worm genome is made up of 100 million base-pairs and in 1983 Sulston set out to sequence the whole thing, in collaboration with Robert Waterston, then at Washington University in St. Louis. This was a huge task with the technology available but their success indicated that the much greater prize of sequencing of the human genome — thirty times as much DNA as in the worm — might be attainable.

In 1992 Sulston became head of a new sequencing facility, the Sanger Centre (now the Sanger Institute), in Hinxton, Cambridgeshire that was the British component of the Human Genome Project, one of the largest international scientific operations ever undertaken. Astonishingly, the complete human genome sequence, finished to a standard of 99.99% accuracy, was published in Nature in October 2004.

As the Human Genome Project gained momentum it found itself in competition with a private venture aimed at securing the sequence of human DNA for commercial profit — i.e., the research community would be charged for access to the data. Sulston was adamant that our genome belonged to us all and with Francis Collins — then head of the US National Human Genome Research Institute — he played a key role in establishing the principle of open access to such data, preventing the patenting of genes and ensuring that the human genome was placed in the public domain.

One clear statement of this intent was that, on entering the Sanger Centre, you were met by a continuously scrolling read-out of human DNA sequence as it emerged from the sequencers.

In collaboration with Georgina Ferry, Sulston wrote The Common Thread, a compelling account of an extraordinary project that has, arguably, had a greater impact than any other scientific endeavour.

For me and my family John’s death was a heavy blow. My wife, Jane, had worked closely with him since the inception of the Sanger Centre and not only had his scientific influence been immense but he had also become a staunch friend and source of wisdom. At the invitation of John’s wife Daphne, a group of friends and relatives gathered at their house after the funeral. As darkness fell we went into the garden and once again it rang to the sound of chatter and laughter from young and old as we enjoyed one of John’s favourite party pastimes — making hot-air lanterns and launching them to drift, flickering to oblivion, across the Cambridgeshire countryside. John would have loved it and it was a perfect way to remember him.

Then …

When John Sulston set out to ‘map the worm’ the tools he used could not have been more basic: a microscope — with pencil and paper to sketch what he saw as the animal developed. His hundreds of drawings tracked the choreography of the worm to its final 959 cells and showed that, along the way, 131 cells die in a precisely orchestrated programme of cell death. The photomontage and sketch below are from his 1977 paper with Bob Horvitz and give some idea of the effort involved.

Photomontage of a microscope image (top) and (lower) sketch of the worm Caenorhabditis elegans showing cell nuclei. From Sulston and Horvitz, 1977.

 … and forty years on

It so happened that within a few days of John’s death Achim Trubiroha and colleagues at the Université Libre de Bruxelles published a remarkable piece of work that is really a descendant of his pioneering studies. They mapped the development of cells from egg fertilization to maturity in a much bigger animal than John’s worms — the zebrafish. They focused on one group of cells in the early embryo (the endoderm) that develop into various organs including the thyroid. Specifically they tracked the formation of the thyroid gland that sits at the front of the neck wrapped around part of the larynx and the windpipe (trachea). The thyroid can be affected by several diseases, e.g., hyperthyroidism, and in about 5% of people the thyroid enlarges to form a goitre — usually caused by iodine deficiency. It’s essential to determine the genes and signalling pathways that control thyroid development if we are to control these conditions.

For this mapping Trubiroha’s group used the CRISPR method of gene editing, which we described in Re-writing the Manual of Life, to mutate or knock out specific targets and to tag cells with fluorescent labels.

A flavor of their results is given by the two sets of fluorescent images below. These show in real time the formation of the thyroid after egg fertilization and the effect of a drug that causes thyroid enlargement.

Live imaging of transgenic zebrafish to follow thyroid development in real-time (left). Arrows mark chord-like cell clusters that form hormone-secreting follicles (arrowheads) during normal development. The right hand three images show normal development (-) and goiter formation (+) induced by a drug. From Trubiroha et al. 2018.

John would have been thrilled by this wonderful work and, with a chuckle, I suspect he’d have said something like “Gosh! If we’d had gene editing back in the 70s we’d have mapped the worm in a couple of weeks!”

References

International Human Genome Sequencing Consortium (2004). Finishing the euchromatic sequence of the human genome. Nature 431, 931–945.

Sulston, J. and Ferry, G. (2002). The Common Thread: A Story of Science, Politics, Ethics and the Human Genome. Bantam Press.

Sulston, J.E. and Horvitz, H.R. (1977). Post-embryonic Cell Lineages of the Nematode, Caenorhabditis elegans. Developmental Biology 56, 110–156.

Trubiroha, A. et al. (2018). A Rapid CRISPR/Cas-based Mutagenesis Assay in Zebrafish for Identification of Genes Involved in Thyroid Morphogenesis and Function. Scientific Reports 8, Article number: 5647.

Bonkers Really … but …

 

This is just in case you spotted the headline in January 2018: ‘Scientists Counted All The Protein Molecules in a Cell And The Answer Really Is 42. This is so perfect.’ 

Them scientists eh! The things they get up to!! The scallywags in this case were Brandon Ho & chums from the University of Toronto and Signe Dean, the journalist who came up with the headline, was referring, of course, to Douglas Adams’s “Answer to the Ultimate Question of Life …” in The Hitchhiker’s Guide to the Galaxy — though it may be noted that Ho’s paper includes neither the number 42 nor mention of Douglas Adams.

The cult that has evolved around this number is both amusing and bizarre, not least because Adams himself explained that he dreamed 42 up out of the blue. In a different context a while ago (talking about how the way you get to work might affect your life expectancy) I recounted happy evenings spent carousing in The Baron (well, having a quiet jar or two) with Douglas Adams and friends from which it was clear that he was not into abstruse mathematics, astrology or the occult. He just had a vivid imagination.

Anything for a catchy headline but

Aside from the whimsy, is there anything interesting in this paper? Well, yes. Ho & Co studied a type of yeast (Saccharomyces cerevisiae) that is mighty important because it’s been a foundation for brewing and baking since ancient times. So no merry sessions in The Baron of Beef without it! Its cells are about the same size as red blood cells (5–10 microns in diameter) but you can actually see them sometimes as films on the skin of fruit. It’s played a huge role in biology as a ‘model organism’ for studying how we work because the proteins it makes that are essential for life are pretty well identical to those in human cells — so much so that you can swap those that control cell growth and division between the two. Yeast proteins work just fine in human cells and vice versa.

 

Yeast on the skin of a grape. Photo: Barbara W. Beacham

 

The question Ho & Co asked was ‘how many protein molecules are there in one cell?’ In the age when you can sequence the DNA of practically anything at the drop of a hat, you might think we’d know the answer already but in fact it’s not been at all clear. Accordingly, what these authors did was to pull together all the relevant studies that have been done to come up with an absolute figure. The answer that emerged was that the number of protein molecules per yeast cell is 4.2 x 10^7 — which, of course, can also be written as 42 million. Eureka! We have our headline!! Albeit, as the authors noted, with a two-fold error range.
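If powers of ten make your eyes glaze over, the headline number can be unpacked in a couple of lines of Python. This is only a sketch: in particular, reading "two-fold error range" as a factor of two in either direction is my assumption, not a definition taken from the paper.

```python
proteins_per_cell = 4.2e7  # the figure quoted in the text

# The same number written out in full and in millions.
in_full = f"{proteins_per_cell:,.0f}"
in_millions = proteins_per_cell / 1e6
print(f"{in_full} molecules = {in_millions:g} million")

# ASSUMPTION: "two-fold" is taken to mean a factor of two either way.
fold = 2
low = proteins_per_cell / fold
high = proteins_per_cell * fold
print(f"plausible range: {low / 1e6:g} to {high / 1e6:g} million")
```

On that reading the true count could sit anywhere between roughly 21 and 84 million, which is worth bearing in mind before getting too excited about the 42.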

Does anyone care?

Now you’re just being awkward. You should be grateful to be made to picture for a moment tens of millions of proteins jiggling around in little sacs so small you could get tens of thousands of these cells on the head of a pin. And somehow, in that heaving molecular city, each protein manages to carry out its own task so that the cell works. It is quite staggering.

Mention of tasks leads to the other question Ho et al. looked at: how many copies are there of the different types of protein? We know from its DNA sequence that this yeast has about 6,000 genes (Saccharomyces Genome Database). So that’s at least 6,000 different proteins. Not surprisingly, it turns out that about two thirds of them are in the middle in terms of abundance — i.e. there are between 1,000 and 10,000 molecules of each sort per cell. The rest are either low abundance (up to about 800 molecules per cell) or at the high end — 140,000 to 750,000, i.e. somewhere in the region of half a million copies of each type of protein.
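A quick back-of-envelope check shows these numbers hang together. The sketch below simply divides the total number of protein molecules by the number of genes to get a crude average. It treats 6,000 as the number of distinct proteins, which the text gives only as a lower bound, so take the result as a rough guide and nothing more.

```python
total_molecules = 4.2e7   # protein molecules per yeast cell (from the text)
protein_types = 6_000     # ASSUMPTION: one protein type per gene

mean_copies = total_molecules / protein_types
print(f"average copies per protein type: {mean_copies:,.0f}")

# The 'middle' abundance band described in the text.
middle_band = (1_000, 10_000)
in_middle = middle_band[0] <= mean_copies <= middle_band[1]
print(f"falls within the middle band: {in_middle}")
```

The crude average comes out at 7,000 copies per protein, comfortably inside the 1,000 to 10,000 band where about two thirds of the proteins are found.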

Does this distribution make sense in terms of what these proteins do?

You know the answer because if it didn’t the Toronto team wouldn’t have got their work published but, indeed, proteins present in large numbers are, for example, part of the machinery that makes new proteins (so they’re slaving away all the time) whereas those present in small numbers do things like repair and replicate DNA and drive cells to divide — important jobs but ones that are only intermittently needed.

These results aren’t going to turn science on its head but it is awe-inspiring when a piece of work really brings us face-to-face with the stunning complexity of biology. And if it takes a bonkers headline to catch our eye, so be it!

Reference

Ho, B. et al. (2018). Unification of Protein Abundance Datasets Yields a Quantitative Saccharomyces cerevisiae Proteome. Cell Systems. Published online: January 23, 2018.

Desperately SEEKing …

These days few can be unaware that cancers kill one in three of us. That proportion has crept up over time as life expectancy has gone up — cancers are (mainly) diseases of old age. Even so, they plagued the ancients as Egyptian scrolls dating from 1600 BC record and as their mummified bodies bear witness. Understandably, progress in getting to grips with the problem was slow. It took until the nineteenth century before two great French physicians, Laënnec and Récamier, first noted that tumours could spread from their initial site to other locations where they could grow as ‘secondary tumours’. Munich-born Karl Thiersch showed that ‘metastasis’ occurs when cells leave the primary site and spread through the body. That was in 1865 and it gradually led to the realisation that metastasis was a key problem: many tumours could be dealt with by surgery, if carried out before secondary tumours had formed, but once metastasis had taken hold … With this in mind the gifted American surgeon William Halsted applied ever more radical surgery to breast cancers, removing tissues to which these tumors often spread, with the aim of preventing secondary tumour formation.

Early warning systems

Photos of Halsted’s handiwork are too grim to show here but his logic could not be faulted for metastasis remains the cause of over 90% of cancer deaths. Mercifully, rather than removing more and more tissue targets, the emphasis today has shifted to tumour detection. How can they be picked up before they have spread?

To this end several methods have become familiar — X-rays, PET (positron emission tomography), etc. — but, useful though these are in clinical practice, they suffer from being unable to ‘see’ small tumours (less than 1 cm diameter). For early detection something completely different was needed.

The New World

The first full sequence of human DNA (the genome), completed in 2003, opened a new era and, arguably, the burgeoning science of genomics has already made a greater impact on biology than any previous advance.

Tumour detection is a brilliant example for it is now possible to pull tumour cell DNA out of the gemisch that is circulating blood. All you need is a teaspoonful (of blood) and the right bit of kit (silicon chip technology and short bits of artificial DNA as bait) to get your hands on the DNA which can then be sequenced. We described how this ‘liquid biopsy’ can be used to track responses to cancer treatment in a quick and non-invasive way in Seeing the Invisible: A Cancer Early Warning System?

If it’s brilliant why the question mark?

Two problems really: (1) Some cancers have proved difficult to pick up in liquid biopsies and (2) the method didn’t tell you where the tumour was (i.e. in which tissue).

The next step, in 2017, added epigenetics to DNA sequencing. That is, a programme called CancerLocator profiled the chemical tags (methyl groups) attached to DNA in a set of lung, liver and breast tumours. In Cancer GPS? we described this as a big step forward, not least because it detected 80% of early stage cancers.

There’s still a pesky question mark?

Rather than shrugging their shoulders and saying “that’s science for you” Joshua Cohen and colleagues at Johns Hopkins University School of Medicine in Baltimore and a host of others rolled their sleeves up and made another step forward in the shape of CancerSEEK, described in the January 18 (2018) issue of Science.

This added two new tweaks: (1) for DNA sequencing they selected a panel of 16 known ‘cancer genes’ and screened just those for specific mutations and (2) they included proteins in their analysis by measuring the circulating levels of 10 established biomarkers. Of these perhaps the most familiar is cancer antigen 125 (CA-125) which has been used as an indicator of ovarian cancer.

Sensitivity of CancerSEEK by tumour type. Error bars represent 95% confidence intervals (from Cohen et al., 2018).

The figure shows a detection rate of about 70% for eight cancer types in 1005 patients whose tumours had not spread. CancerSEEK performed best for five types (ovary, liver, stomach, pancreas and esophagus) that are difficult to detect early.

Is there still a question mark?

Of course there is! It’s biology — and cancer biology at that. The sensitivity is quite low for some of the cancers and it remains to be seen how high the false positive rate goes in populations larger than the 1005 patients of this preliminary study.
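To see why the false positive rate matters so much in screening, here is a small worked example in Python. The 70% sensitivity is roughly the overall detection rate mentioned above. The specificity and the prevalence figures are hypothetical numbers chosen purely for illustration; they are not results from the CancerSEEK study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive test results that are genuine cancers."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


sensitivity = 0.70  # roughly the overall detection rate quoted in the text
specificity = 0.99  # HYPOTHETICAL: 1 healthy person in 100 tests positive

# HYPOTHETICAL prevalences: how common cancer is in the group being tested.
for prevalence in (0.10, 0.01, 0.001):
    ppv = positive_predictive_value(sensitivity, specificity, prevalence)
    print(f"prevalence {prevalence:.1%}: {ppv:.0%} of positives are real")
```

With these made-up figures, testing a population in which 1 person in 100 has cancer would mean that fewer than half of the positive results are genuine, and the proportion falls further as the disease becomes rarer. That is the nub of the problem for any test aimed at the general population.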

So let’s leave the last cautious word to my colleague Paul Pharoah: “I do not think that this new test has really moved the field of early detection very far forward … It remains a promising, but yet to be proven technology.”

Reference

Cohen, J.D. et al. (2018). Detection and localization of surgically resectable cancers with a multi-analyte blood test. Science 10.1126/science.aar3247.

Lorenzo’s Oil for Nervous Breakdowns

 

A Happy New Year to all our readers – and indeed to anyone who isn’t a member of that merry band!

What better way to start than with a salute to the miracles of modern science by talking about how the lives of a group of young boys have been saved by one such miracle.

However, as is almost always the way in science, this miraculous moment is merely the latest step in a long journey. In retracing those steps we first meet a wonderful Belgian – so, when ‘name a famous Belgian’ comes up in your next pub quiz, you can triumphantly produce him as a variant on dear old Eddy Merckx (of bicycle fame) and César Franck (albeit born before Belgium was invented). As it happened, our star was born in Thames Ditton (in 1917: his parents were among the one quarter of a million Belgians who fled to Britain at the beginning of the First World War) but he grew up in Antwerp and the start of World War II found him on the point of becoming qualified as a doctor at the Catholic University of Leuven. Nonetheless, he joined the Belgian Army, was captured by the Germans, escaped, helped by his language skills, and completed his medical degree.

Not entirely down to luck

This set him off on a long scientific career in which he worked in major institutes in both Europe and America. He began by studying insulin (he was the first to suggest that insulin lowered blood sugar levels by prompting the liver to take up glucose), which led him to the wider problems of how cells are organized to carry out the myriad tasks of molecular breaking and making that keep us alive.

The notion of the cell as a kind of sac with an outer membrane that protects the inside from the world dates from Robert Hooke’s efforts with a microscope in the 1660s. By the end of the nineteenth century it had become clear that there were cells-within-cells: sub-compartments, also enclosed by membranes, where special events took place. Notably these included the nucleus (containing DNA of course) and mitochondria (sites of cellular respiration where the final stages of nutrient breakdown occur and the energy released is transformed into adenosine triphosphate (ATP) with the consumption of oxygen).

In the light of that history it might seem a bit surprising that two more sub-compartments (‘organelles’) remained hidden until the 1950s. However, if you’re thinking that such a delay could only be down to boffins taking massive coffee breaks and long vacations, you’ve never tried purifying cell components and getting them to work in test-tubes. It’s a process called ‘cell fractionation’ and, even with today’s methods, it’s a nightmare (sub-text: if you have to do it, give it to a Ph.D. student!).

By this point our famous Belgian had gathered a research group around him and they were trying to dissect how insulin worked in liver cells. To this end they (the Ph.D. students?!) were using cell fractionation and measuring the activity of an enzyme called acid phosphatase. Finding a very low level of activity one Friday afternoon, they stuck the samples in the fridge and went home. A few days later some dedicated soul pulled them out and re-measured the activity discovering, doubtless to their amazement, that it was now much higher!

In science you get odd results all the time – the thing is: can you repeat them? In this case they found the effect to be absolutely reproducible. Leave the samples a few days and you get more activity. Explanation: most of the enzyme they were measuring was contained within a membrane-like barrier that prevented the substrate (the chemical that the enzyme reacts with) getting to the enzyme. Over a few days the enzyme leaked through the barrier and, lo and behold, now when you measured activity there was more of it!

Thus was discovered the ‘lysosome’ – a cell-within-a-cell that we now know is home to an array of some 40-odd enzymes that break down a range of biomolecules (proteins, nucleic acids, sugars and lipids). Our self-effacing hero said it was down to ‘chance’ but in science, as in other fields of life, you make your own luck – often, as in this case, by spotting something abnormal, nailing it down and then coming up with an explanation.

In the last few years lysosomes have emerged as major players in cancer because they help cells to escape death pathways. Furthermore, they can take up anti-cancer drugs, thereby reducing potency. For these reasons they are the focus of great interest as a therapeutic target.

Lysosomes in cells revealed by immunofluorescence.

Antibody molecules that stick to specific proteins are tagged with fluorescent labels. In these two cells protein filaments of F-actin that outline cell shape are labelled red. The green dots are lysosomes (picked out by an antibody that sticks to a lysosome protein, RAB9). Nuclei are blue (image: ThermoFisher Scientific).

Play it again Prof!

In something of a re-run of the lysosome story, the research team then found itself struggling with several other enzymes that also seemed to be shielded from the bulk of the cell – but the organelle these lived in wasn’t a lysosome – nor were they in mitochondria or anything else then known. Some 10 years after the lysosome the answer emerged as the ‘peroxisome’ – so called because some of their enzymes produce hydrogen peroxide. They’re also known as ‘microbodies’ – little sacs, present in virtually all cells, containing enzymatic goodies that break down molecules into smaller units. In short, they’re a variation on the lysosome theme and among their targets for catabolism are very long-chain fatty acids (for mitochondriacs the reaction is β-oxidation but by a different pathway to that in mitochondria).

Peroxisomes revealed by immunofluorescence.

As in the lysosome image, F-actin is red. The green spots here are from an antibody that binds to a peroxisome protein (PMP70). Nuclei are blue (image: Novus Biologicals)

Cell biology fans will by now have worked out that our first hero in this saga of heroes is Christian de Duve who shared the 1974 Nobel Prize in Physiology or Medicine with Albert Claude and George Palade.

A wonderful Belgian. Christian de Duve: physician and Nobel laureate.

Hooray!

Fascinating and important stuff – but nonetheless background to our main story which, as they used to say in The Goon Show, really starts here. It’s so exciting that, in 1992, they made a film about it! Who’d have believed it?! A movie about a fatty acid!! Cinema buffs may recall that in Lorenzo’s Oil Susan Sarandon and Nick Nolte played the parents of a little boy who’d been born with a desperate disease called adrenoleukodystrophy (ALD). There are several forms of ALD but in the childhood disease there is progression to a vegetative state and death occurs within 10 years. The severity of ALD arises from the destruction of myelin, the protective sheath that surrounds nerve fibres and is essential for transmission of messages between brain cells and the rest of the body. It occurs in about 1 in 20,000 people.

Electrical impulses (called action potentials) are transmitted along nerve and muscle fibres. Action potentials travel much faster (about 200 times) in myelinated nerve cells (right) than in unmyelinated neurons (left) because of saltatory conduction. Neurons (or nerve cells) transmit information using electrical and chemical signals.

The film traces the extraordinary effort and devotion of Lorenzo’s parents in seeking some form of treatment for their little boy and how, eventually, they lighted on a fatty acid found in lots of green plants – particularly in the oils from rapeseed and olives. It’s erucic acid, one of the dreaded omega mono-unsaturated fatty acids (if you’re interested, it can be denoted as 22:1ω9, meaning a chain of 22 carbon atoms with one double bond 9 carbons from the end – so it’s ‘unsaturated’). In a dietary combination with oleic acid (another unsaturated fatty acid: 18:1ω9) it normalizes the accumulation of very long chain fatty acids in the brain and slows the progression of ALD. It did not reverse the neurological damage that had already been done to Lorenzo’s brain but, even so, he lived to the age of 30, some 22 years longer than predicted when he was diagnosed.
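For the chemically curious, the shorthand is regular enough to be decoded by a few lines of Python. What follows is a minimal sketch that handles only the simple carbons:double-bonds-omega form used above; the class and function names are made up for the purpose and it makes no attempt to cover the fuller lipid naming schemes.

```python
import re
from typing import NamedTuple, Optional


class FattyAcid(NamedTuple):
    carbons: int            # length of the carbon chain
    double_bonds: int       # number of carbon-carbon double bonds
    omega: Optional[int]    # position of the double bond counted from the end

    @property
    def unsaturated(self) -> bool:
        """A fatty acid is unsaturated if it has at least one double bond."""
        return self.double_bonds > 0


# Matches e.g. "22:1ω9": carbons, a colon, double bonds, then optional ω position.
SHORTHAND = re.compile(r"^(\d+):(\d+)(?:ω(\d+))?$")


def parse_shorthand(text: str) -> FattyAcid:
    """Decode shorthand such as '22:1ω9' into its component numbers."""
    match = SHORTHAND.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised shorthand: {text!r}")
    carbons, bonds, omega = match.groups()
    return FattyAcid(int(carbons), int(bonds), int(omega) if omega else None)


for code in ("22:1ω9", "18:1ω9"):
    acid = parse_shorthand(code)
    print(f"{code}: {acid.carbons} carbons, {acid.double_bonds} double bond(s), "
          f"ω position {acid.omega}, unsaturated={acid.unsaturated}")
```

Both components of the oil turn out to share the same pattern, one double bond nine carbons from the end, and differ only in chain length: 22 carbons against 18.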

What’s going on?

It’s pretty obvious from the story of Lorenzo’s Oil that ALD is a genetic disease and you will have guessed that we wouldn’t have summarized the wonderful career of Christian de Duve had it not turned out that the fault lies in peroxisomes.

The culprit is a gene (called ABCD1) on the X chromosome (so ALD is an X-linked genetic disease). ABCD1 encodes part of the protein channel that carries very long chain fatty acids into peroxisomes. Mutations in ABCD1 (over 500 have been found) cause defective import of fatty acids, resulting in the accumulation of very long chain fatty acids in various tissues. This can lead to irreversible brain damage. In children the myelin sheath of neurons is damaged, causing neurological defects including impaired vision and speech disorders.

And the miracle?

It’s gene therapy of course and, helpfully, we’ve already seen it in action. Self Help – Part 2 described how novel genes can be inserted into the DNA of cells taken from a blood sample. The genetically modified cells (T lymphocytes) are grown in the laboratory and then infused into the patient – in that example the engineered cells carried an artificial T cell receptor that enabled them to target a leukemia.

In Gosh! Wonderful GOSH we saw how the folk at Great Ormond Street Hospital adapted that approach to treat a leukemia in a little girl.

Now David Williams, Florian Eichler, and colleagues from Harvard and many other centres around the world, including GOSH, have adapted these methods to tackle ALD. Again, from a blood sample they selected one type of cell (stem cells that give rise to all blood cell types) and then used genetic engineering to insert a complete, normal copy of the DNA that encodes ABCD1. These cells were then infused into patients. As in the earlier studies, they used a virus (or rather part of a viral genome) to get the new genetic material into cells. They chose a lentivirus for the job – these are a family of retroviruses (i.e. they have RNA genomes) that includes HIV. Specifically they used a commercial vector called Lenti-D. During the life cycle of retroviruses their RNA genomes are converted to DNA that becomes a permanent part of the host DNA. What’s more, lentiviruses can infect both non-dividing and actively dividing cells, so they’re ideal for the job.

In the first phase of this ongoing, multi-centre trial a total of 17 boys with ALD received Lenti-D gene therapy. After about 30 months, in results reported in October 2017, 15 of the 17 patients were alive and free of major functional disability, with minimal clinical symptoms. Two of the boys with advanced symptoms had died. The achievement of such high remission rates is a real triumph, albeit in a study that will continue for many years.

In tracing this extraordinary galaxy, one further hero merits special mention for he played a critical role in the story. In 1999 Jesse Gelsinger, a teenager, received viral gene therapy in a clinical trial. This was for a metabolic defect and modified adenovirus was used as the gene carrier. Despite this method having been extensively tested in a range of animals (and the fact that most humans, without knowing it, are infected with some form of adenovirus), Gelsinger died after his body mounted a massive immune response to the viral vector, leading to multiple organ failure and brain death.

This was, of course, a huge set-back for gene therapy. Despite this, the field has advanced significantly in the new century, both in methods of gene delivery (including over 400 adenovirus-based gene therapy trials) and in understanding how to deal with unexpected immune reactions. Even so, to this day the Jesse Gelsinger disaster weighs heavily with those involved in gene therapy for it reminds us all that the field is still in its infancy and that each new step is a venture into the unknown requiring skill, perseverance and bravery from all involved – scientists, doctors and patients. But what better encouragement could there be than the ALD story of young lives restored.

It’s taken us a while to piece together the main threads of this wonderful tale but it’s emerged as a brilliant example of how science proceeds: in tiny steps, usually with no sense of direction. And yet, despite setbacks, over much time, fragments of knowledge come together to find a place in the grand jigsaw of life.

In setting out to probe the recesses of metabolism, Christian de Duve cannot have had any inkling that he would build a foundation on which twenty-first century technology could devise a means of saving youngsters from a truly terrible fate but, my goodness, what a legacy!!!

References

Eichler, F. et al. (2017). Hematopoietic Stem-Cell Gene Therapy for Cerebral Adrenoleukodystrophy. The New England Journal of Medicine 377, 1630-1638.

A Musical Offering 

It’s generally accepted that Johann Sebastian Bach was one of the greatest, if not the greatest, musical composer of all time. In well over 1000 compositions he laid down the framework upon which rested virtually all Western music of the following 200 years. Of these works, The Musical Offering, written in 1747, is a collection of pieces based on a single theme that has been described as the most significant piano composition in history.

Along the way to becoming a unique composer, Bach married twice and sired twenty children, only ten of whom survived into adulthood. Those figures highlight another way in which JSB was something of a freak because, in 1750 when he died aged 65, the average life expectancy in Europe was under 40 years. For that reason cancers, being primarily diseases of old age, were much less prominent then than now when, on average, we live to be over 80 and cancers account for about one in three deaths.

It’s safe to say that in the 18th century neither Bach nor anyone else knew anything of cancer, let alone that our genetic material carries tens of thousands of genes – a kind of molecular keyboard upon which cellular machinery plays to produce an output of proteins that distinguishes one cell type from another but is also continuously varying, even within individual cells. Bach would have been fascinated by this fluctuating molecular mosaic that, through the wonders of modern sequencing methods, we can display as ‘heat maps’ showing which genes are turned on (being expressed) and to what level.

Musical genes. Left: a heat map showing the pattern of genes being expressed at a given time in several different types of cell. Red: high expression; green: low expression. Right: the same information transformed into musical notation using the Gene Expression Music Algorithm, GEMusicA (from Staege 2016).

With commendable vision a chap by the name of Martin Staege has come up with an alternative way of looking at the rather mind-blowing picture conveyed by heat maps. Staege is in the Martin Luther University of Halle-Wittenberg – appropriately as Bach’s eldest son studied at the University of Halle. His idea is that gene expression patterns can be transformed into sounds characterized by their frequency (pitch) and tone duration. In other words you can make genes play tunes – and what’s more compare the notes from different cell samples (e.g., normal and tumour cells) so that you can ‘hear’ the differences in gene expression.
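To make the idea a little more concrete, here is a minimal sketch of turning expression levels into pitches. It is not GEMusicA itself: the linear mapping, the note range, the function names and the expression values are all assumptions made up for illustration. The only standard ingredient is the equal-temperament formula relating a MIDI note number to a frequency.

```python
def expression_to_note(level, low=0.0, high=10.0, lowest_note=48, highest_note=84):
    """Linearly map an expression level onto a MIDI note number (clamped)."""
    clamped = max(low, min(high, level))
    fraction = (clamped - low) / (high - low)
    return round(lowest_note + fraction * (highest_note - lowest_note))


def note_to_frequency(note):
    """Equal-temperament frequency in Hz (MIDI note 69 is the A at 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


expression = [0.0, 5.0, 10.0]  # invented expression levels, arbitrary units
notes = [expression_to_note(x) for x in expression]
print(notes)  # [48, 66, 84]
```

Comparing two samples then amounts to comparing two lists of notes, which is roughly what ‘hearing’ the difference means.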

Remarkable or what?!

Unsurprisingly, gene tunes sound more Alban Berg than Magic Flute, prompting the redoubtable Dr. Staege to go one step further by producing an algorithm that fits gene themes as best it can to more singable pieces – so you get a kind of difference melody. I don’t think Beethoven or Wagner would see this biological music as a threat and they might, like me, ask ‘what’s the point?’

To which, I guess, the answers are ‘It’s clever and fun’. It’s also yet another way of showing the power of DNA as an information storage medium, and making the point that in this guise it may, in due course, make a massive impact on our lives – much more mundane than musical genes but hugely more useful.

References

Staege, M. S. (2016). Gene Expression Music Algorithm-Based Characterization of the Ewing Sarcoma Stem Cell Signature. Stem Cells International, Volume 2016, Article ID 7674824, 10 pages http://dx.

Staege, M. S. (2015). A short treatise concerning a musical approach for the interpretation of gene expression data. Sci. Rep. 5, 15281.


Making Movies in DNA

Last time we reminded ourselves of one of the ways in which cancer is odd but, of course, underpinning not just cancers but all the peculiarities of life is DNA. The enduring wonder is how something so basically simple – just four slightly different chemical groups (OK, they are bases!) – can form the genetic material (the instruction book, if you like) for all life on earth. The answer, as almost everyone knows these days, is that there’s an awful lot of it in every cell – meaning that the four bases (A, C, G & T) have an essentially infinite coding capacity.

That doesn’t make it any the less wonderful but it does carry a huge implication: if something you can squeeze into a single cell can carry limitless information it must be the most powerful of all storage systems.

A picture’s worth a thousand words

We looked at the storage power of DNA a few months ago (in “How Does DNA Do It?”) and noted that its storage density is 1000 times that of flash memories, that it’s fairly easy to scan text and transform the pixels into genetic code and that, as an example, someone has already put Shakespeare’s sonnets into DNA form.

Now Seth Shipman, George Church and colleagues at Harvard have taken the field several steps forward by capturing black and white images and a short movie in DNA. Moreover they’ve managed to get these ‘DNA recordings’ taken up by living cells from which they could subsequently recover the images.

Crumbs! How did they do it?

First they used essentially the text method to encode images of a human hand: assign the four bases (A, C, G & T) to four pixel colours (this gives a grayscale image: colours can be acquired by using groups of bases for each pixel). These DNA sequences were then introduced into bacteria (specifically E. coli) by electroporation (an electrical pulse briefly opens pores in the cell membrane).
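As a rough illustration of that encoding step, the sketch below maps four gray levels to the four bases and back again. The particular assignment of levels to bases and the function names are assumptions for illustration, not necessarily the scheme Shipman and colleagues used.

```python
# Hypothetical mapping: four gray levels (0-3) to the four bases.
LEVEL_TO_BASE = {0: "A", 1: "C", 2: "G", 3: "T"}
BASE_TO_LEVEL = {base: level for level, base in LEVEL_TO_BASE.items()}


def encode_image(pixels):
    """Flatten a grid of gray levels (row by row) into a DNA string."""
    return "".join(LEVEL_TO_BASE[p] for row in pixels for p in row)


def decode_image(dna, width):
    """Rebuild the grid from a DNA string, given the image width."""
    levels = [BASE_TO_LEVEL[b] for b in dna]
    return [levels[i:i + width] for i in range(0, len(levels), width)]


image = [
    [0, 1, 2, 3],
    [3, 2, 1, 0],
]
dna = encode_image(image)
print(dna)  # ACGTTGCA
```

Decoding with `decode_image(dna, width=4)` gives back the original grid, which is the whole point: nothing is lost in translation.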

The cells treat this foreign DNA as though it was from an invading virus and switch on their CRISPR system (summarized in “Re-writing the Manual of Life”). This takes short pieces of viral DNA and inserts them into the cell’s own genome in the form of ‘spacers’ (the point being that the stored sequences confer ‘adaptive immunity’: the cell has an immunological memory so it is primed to respond effectively if it’s infected again by that viral pathogen).

In this case, however, the cells have been fooled: the ‘spacers’ generated carry encoded pictures, rather than viral signatures.

Because spacers are short it’s obvious that you’ll need lots of them to carry the information in a photo. To keep track when it comes to reassembling the picture, each DNA fragment was tagged with a barcode (and fortunately we explained cellular barcoding in “A Word From The Nerds”).

Once incorporated in the bugs the information was maintained over many bacterial generations (48 in fact) and could be recovered by high-throughput sequencing and reconstruction of the patterns using the barcodes.
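The barcoding idea can be sketched in the same spirit: cut a sequence into short fragments, tag each with its position, shuffle them (sequencing returns reads in no particular order) and use the tags to put everything back together. Here the ‘barcode’ is simply an integer kept alongside each fragment and the fragment length is arbitrary. In the real experiment the barcode is itself written in bases, so treat this as a toy under those assumptions rather than the published protocol.

```python
import random


def fragment(dna, size):
    """Split a DNA string into (barcode, piece) pairs; barcode = piece index."""
    return [(i // size, dna[i:i + size]) for i in range(0, len(dna), size)]


def reassemble(fragments):
    """Sort fragments by barcode and join the pieces back into one string."""
    return "".join(piece for _, piece in sorted(fragments))


message = "ACGTTGCAGGATCCA"
pieces = fragment(message, size=4)
random.shuffle(pieces)        # reads come back in no particular order
recovered = reassemble(pieces)
print(recovered == message)   # True
```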

And the movie bit?

Simple. In principle they used the same methods to encode sequential frames.

Pictures in DNA.

Top: Using triplets of bases to encode 21 pixel colours. Images of a human hand (top) and a horse (bottom) were captured. For the movie they used freeze frames taken in 1872 by the English photographer Eadweard Muybridge. These showed that, for a fraction of a second, a galloping horse lifts all four hooves off the ground. Seemingly this won a return for the sometime California governor, Leland Stanford (he of university-founding fame) who had put a wager on geegees doing just that. From Shipman et al., 2017. You can see the movie here.

Getting the picture clear

To recap, in case you’re wondering if this is some scientific April Fools’ prank: what Church & Co. did was scan pictures and transform pixel density into the genetic code (i.e. sequences of the four bases A, C, G & T). They then made DNA carrying these sequences, persuaded bacteria to take up the DNA and incorporate it into their own genomes and, after growing many generations of the bugs, extracted their DNA, sequenced it and reconstructed the original images. By scanning sequential frames this can be extended to movies.

It’s not science fiction – but it is pretty amazing. With a droll turn of phrase Seth Shipman said “We want to turn cells into historians” and the work does have significant implications in showing something of the scope of biological memory systems.

Won’t be long before the trendy, instead of birthday presents of electronic family photo albums, are giving small tubes of bugs!

References

Shipman, S.L., Nivala, J., Macklis, J.D. & Church, G.M. (2017). CRISPR–Cas encoding of a digital movie into the genomes of a population of living bacteria. Nature 547, 345–349.

Cancer GPS?

The thing that pretty well everyone knows about cancers is that most are furtive little blighters. They kill one in three of us but usually we don’t know they’re there until they are big enough to make something go wrong in the body or to show up in our seriously inadequate screening methods. In that sense they resemble heart problems of one sort or another, where often the first indication of trouble is unexpectedly finding yourself lying on the floor.

Meanwhile, out on the highways and byways you are about 75 times less likely to be killed in an accident than you are to succumb to either cancers or circulation failure. Which is a way of saying that in the UK about 2000 of us perish on the roads each year. That it’s ‘only’ 2000 is presumably because here your assailant is anything but furtive. All you’ve got to do is side-step the juggernaut and you’ll probably live to be – well, old enough to get cancer.

Did you know, by the way, that ‘juggernaut’ is said to come from the chariots of the Jagannath Temple in Puri on the east coast of India? These are vast contraptions, used to carry representations of Hindu gods on annual festival days, that look as though walking pace would be too much for them. So, replace the monsters on our roads with real juggernauts! Problem largely solved!!

Flagging cancer

But to get back to cancer or, more precisely, the difficulty of seeing it. After centuries of failing to make any inroads, recent dramatic advances give hope that all is about to change. These rely on the fact that tissues shed cells – and with them DNA – into the circulation. Tumours do this too – so in effect they are scattering clues to their existence into blood. By using short stretches of artificial DNA as bait, it’s possible to fish out tumour cell DNA from a few drops of blood. That’s a pretty neat trick in itself, given we’re talking about fewer than 100 tumour cells in a sea of several billion other cells in every millilitre of blood.

There are two big attractions in this ‘microfluidics’ approach. First it’s almost ‘non-invasive’ in needing only a small blood sample and, second, it is possible that indicators may be picked up long before a tumour would otherwise show up. In effect it’s taking a biochemical magnifying glass to our body to ask if there’s anything there that wouldn’t normally be present. Detect a marker and you know there’s a tumour somewhere in the body, and if the marker changes in concentration in response to a treatment, you have a monitor for how well that treatment is doing. So far, so good.

And the problem?

These ‘liquid biopsy’ methods that use just a teaspoonful of blood have been under development for several years but there has been one big cloud hanging over them. They appear to be exquisitely sensitive in detecting the presence of a cancer – by sequencing the DNA picked up – but they have not been able to pinpoint the tissue of origin. Until now.

Step forward epigenetics

Shuli Kang and colleagues at the University of California at Los Angeles and the University of Southern California have broken this impasse by turning to epigenetics. We noted in Twenty More Winks that an epigenetic modification is any change in DNA, other than in the sequence of bases (i.e. mutation), that affects how an organism develops or functions. They’re brought about by tacking small chemical groups (commonly methyl (CH3) groups) either on to some of the bases in DNA itself or on to the proteins (histones) that act like cotton reels around which DNA wraps itself. The upshot is small changes in the structure of DNA that affect gene expression. You can think of DNA methylation as a series of flags dotted along the DNA strand, decorating it in a seemingly random pattern. It isn’t random, of course, and the target for methylation is a cytosine nucleotide (C) followed by a guanine (G) in the linear DNA sequence – called a CpG site because G and C are separated by one phosphate (p). Phosphate links nucleosides together in the backbone of DNA.
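For the curious, spotting CpG sites in a sequence is a simple string search. The short sketch below lists the positions at which a C is immediately followed by a G; the function name and the example sequence are made up for illustration.

```python
def find_cpg_sites(sequence):
    """Return 0-based positions of every C that is directly followed by a G."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


example = "ATCGGCGTACCGA"
print(find_cpg_sites(example))  # [2, 5, 10]
```

Finding the sites is the easy part, of course. Working out which of them actually carry a methyl flag is what bisulfite sequencing is for.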

Cancer cells often display abnormal DNA methylation patterns – excess methylation (hypermethylation) in some regions, reduced methylation in others – that contribute to their peculiar behavior. It’s possible to determine the methylation profile of a DNA sample (by a method called bisulfite sequencing).

Kang & Co. developed a computer program to analyse methylation profiles from solid tumours and healthy samples in public databases and compare them to patient DNA of unknown tissue origin.

The peaks represent CpG clusters that characterize normal cells (top) and a variety of cancers. The key point is that the different patterns identify the tissue of origin (from Kang, S. et al., 2017).
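The flavour of that comparison can be conveyed with a deliberately crude stand-in. The sketch below is not CancerLocator's method: it swaps in a simple nearest-profile rule, picking whichever reference profile has the smallest average gap from the sample. The tissue labels, the methylation values and the function names are all invented for illustration.

```python
def mean_abs_difference(profile_a, profile_b):
    """Average absolute gap between two equal-length methylation profiles."""
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b)) / len(profile_a)


def closest_tissue(sample, references):
    """Return the reference label whose profile is nearest to the sample."""
    return min(
        references,
        key=lambda name: mean_abs_difference(sample, references[name]),
    )


# Invented methylation levels (fraction methylated) at four CpG clusters.
references = {
    "normal": [0.10, 0.80, 0.20, 0.90],
    "lung":   [0.70, 0.30, 0.20, 0.90],
    "liver":  [0.10, 0.80, 0.90, 0.20],
}
sample = [0.60, 0.40, 0.25, 0.85]
print(closest_tissue(sample, references))  # lung
```

A real blood sample is mostly DNA from normal cells with only a trace from any tumour, which is why a rule as naive as this would not do in practice.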

The program’s called CancerLocator and in this initial study it was used to test samples from patients with lung, liver or breast cancer. In the modest words of the authors, CancerLocator ‘vastly outperforms’ previous methods – mind you, those struggle even to distinguish most cancer samples from non-cancer samples. Nevertheless, CancerLocator’s a big step forward, not least because it can detect early stage cancers with 80% accuracy.

It’s also reasonable to expect major improvements as methylation sequencing becomes more extensive and higher resolution reveals more subtle signatures. What’s more, in principle, it should be able to detect all types of cancers – meaning that, after so many centuries, we may at last have a way of side-stepping the juggernaut.

References

Kang, S. et al. (2017). CancerLocator: non-invasive cancer diagnosis and tissue-of-origin prediction using methylation profiles of cell-free DNA. Genome Biology DOI 10.1186/s13059-017-1191-5.

And Now There Are Six!!

Scientists eh! What a drag they can be! Forever coming up with new things that the rest of us have to wrap our minds around (or at least feel we should try).

Readers of these pages will know I’m periodically apt to wax rhapsodic about ‘the secret of life’ – the fact that all living things arise from just four different chemical units, A, C, G and T. Well, from now on it seems I’ll need to watch my words – or at least my letters – though maybe for a while I can leave it on the back burner in the “things that have been but not yet” category, to use the melodic prose of Christopher Fry.

Who dunnit?

The problem is down to Floyd Romesberg and his team at the Scripps Research Institute in California.

Building on a lot of earlier work, they’ve made synthetic units that stick together to form pairs – just like A-T and C-G do in double-stranded DNA. But, as these novel chemicals (X & Y) are made in the lab, the bond they form is an unnatural base pair.

Left: Two intertwined strands of DNA are held together in part by hydrogen bonds. Right top: Two such bonds (dotted lines) link adenine (A) to thymine (T); three form between guanine (G) and cytosine (C). These bases attach to sugar units (deoxyribose) and phosphate groups (P) to form DNA chains. Right bottom: Synthetic X and Y units can also stick together and, via deoxyribose and phosphate, become part of DNA.

After much fiddling Romesberg’s group derived E. coli microbes that would take up X and Y when they were fed to the cells as part of their normal growth medium. The cells treat X and Y like the units they make themselves (A, C, G & T) and insert them in new DNA – so a stretch of genetic code may then read: A-C-G-T-X-T-A-C-Y-A-T-… And, once part of DNA, the novel units are passed on to the next generation.
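One way to put a number on what two extra letters buy: each position in a four-letter code can carry log2(4) = 2 bits of information, whereas each position in a six-letter code carries log2(6), about 2.58 bits. The sketch below works this out and counts how many distinct three-letter ‘words’ each alphabet allows. It is plain arithmetic rather than anything taken from the paper, and the function names are invented.

```python
import math


def bits_per_base(alphabet_size):
    """Information carried by one position, in bits."""
    return math.log2(alphabet_size)


def possible_sequences(alphabet_size, length):
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


print(round(bits_per_base(4), 2))  # 2.0
print(round(bits_per_base(6), 2))  # 2.58
print(possible_sequences(4, 3))    # 64
print(possible_sequences(6, 3))    # 216
```

So triplets drawn from six letters give 216 possible combinations where four letters give the familiar 64.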

Science fiction?

If this has you thinking ‘creation and exploitation of entirely new life forms?!!’ you’re not alone. Seemingly Romesberg is frequently asked if he’s setting up Jurassic Park but, as he points out, the modified bugs he’s created survive only as long as they’re fed X and Y so if they ‘escape’ (being bugs this would probably be down the drain rather than over a fence), they die. Cunning eh?!!

Is this coming to a gene near you?

No. It is, however, clear that more synthetic bases will be made, expanding the power of the genetic code yet further. What isn’t yet known is what the cells will make of all this. In other words, the whole point of tinkering with DNA is to modify the code to make novel proteins. In the first instance the hope is that these might be useful in disease treatment. Rather longer-term is the notion that new organisms might emerge with specific functions – e.g., bugs that break down plastic waste materials.

At the moment all this is speculation. But what is now fact is amazing enough. In the 4,000 million years since the first life-forms emerged, more than five billion different species have appeared (and mostly disappeared) on earth – all based on a genetic code of just four letters.

Now, in a small lab in southern California, Mother Nature has been given an upgrade. It’s going to be fascinating to see what she does with it!

Reference

Zhang, Y. et al. (2017). Proceedings of the National Academy of Sciences 114, 1317-1322.

Through the Smokescreen

For many years I was lucky enough to teach in a cancer biology course for third year natural science and medical students. Quite a few of those guys would already be eyeing up research careers and, within just a few months, some might be working on the very topics that came up in lectures. Nothing went down better, therefore, than talking about a nifty new method that had given easy-to-grasp results clearly of direct relevance to cancer.

Three cheers then for Mikhail Denissenko and friends who in 1996 published the first absolutely unequivocal evidence that a chemical in cigarette smoke could directly damage a bit of DNA that provides a major protection against cancer. The compound bound directly to several guanines in the DNA sequence that encodes P53 – the protein often called ‘the guardian of the genome’ – causing mutations. A pity poor old Fritz Lickint wasn’t around for a celebratory drink – it was he, back in the 1930s, who first spotted the link between smoking and lung cancer.

This was absolutely brilliant for showing how proteins switched on genes – and how that switch could be perturbed by mutations – because, just a couple of years earlier, Yunje Cho’s group at the Memorial Sloan-Kettering Cancer Center in New York had made crystals of P53 stuck to DNA and used X-rays to reveal the structure. This showed that six sites (amino acids) in the centre of the P53 protein poked like fingers into the groove of double-stranded DNA.

Central core of P53 (grey ribbon) binding to the groove in double-stranded DNA (blue). The six amino acids (residues) most commonly mutated in p53 are shown in yellow (from Cho et al., 1994).

So that was how P53 ‘talked’ to DNA to control the expression of specific genes. What could be better then, in a talk on how DNA damage can lead to cancer, than the story of a specific chemical doing nasty things to a gene that encodes perhaps the most revered of anti-cancer proteins?

The only thing baffling the students must have been the tobacco companies insisting, as they continued to do for years, that smoking was good for you.

And twenty-something years on …?

Well, it’s taken a couple of revolutions (scientific, of course!) but in that time we’ve advanced to being able to sequence genomes at a fantastic speed for next to nothing in terms of cost. In that period too more and more data have accumulated showing the pervasive influence of the weed. In particular that not only does it cause cancer in tissues directly exposed to cigarette smoke (lung, oesophagus, larynx, mouth and throat) but it also promotes cancers in places that never see inhaled smoke: kidney, bladder, liver, pancreas, stomach, cervix, colon, rectum and white blood cells (acute myeloid leukemia). However, up until now we’ve had very little idea of what, if anything, these effects have in common in terms of molecular damage.

Applying the power of modern sequencing, Ludmil Alexandrov of the Los Alamos National Lab, along with the Wellcome Trust Sanger Institute’s Michael Stratton and their colleagues, have pieced together whole-genome sequences and exome sequences (that’s just the DNA that encodes proteins – about 1% of the total) of over 5,000 tumours. These covered 17 smoking-associated forms of cancer and permitted comparison of tobacco smokers with never-smokers.

Let’s hear it for consistent science!

The most obvious question then is do the latest results confirm the efforts of Denissenko & Co., now some 20 years old? The latest work found that smoking could increase the mutation load in the form of multiple, distinct ‘mutational signatures’, each contributing to different extents in different cancers. And indeed in lung and larynx tumours they found the guanine-to-thymine base-pair change that Denissenko et al had observed as the result of a specific chemical attaching to DNA.
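At its very simplest, hunting for a change such as guanine-to-thymine means lining up a tumour sequence against a reference and tallying which base has changed into which. The sketch below does just that for two short, invented sequences. Real mutational-signature analysis is far more elaborate (it considers neighbouring bases and thousands of genomes), so this is an illustrative toy with made-up names and data.

```python
from collections import Counter


def count_substitutions(reference, tumour):
    """Tally single-base changes between two aligned, equal-length sequences."""
    if len(reference) != len(tumour):
        raise ValueError("sequences must be the same length")
    return Counter(
        f"{ref}>{alt}" for ref, alt in zip(reference, tumour) if ref != alt
    )


reference = "GGCATGGACG"  # invented 'normal' sequence
tumour = "TGCATTGACA"     # invented 'tumour' sequence
counts = count_substitutions(reference, tumour)
print(counts["G>T"])  # 2
print(counts["G>A"])  # 1
```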

For lung cancer they concluded that, all told, about 150 mutations accumulate in a given lung cell as a result of smoking a pack of cigarettes a day for a year.
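To get a feel for how that adds up, the sketch below simply multiplies the figure of about 150 mutations per pack-year by the amount smoked. It assumes the burden scales linearly with exposure, which is a simplification for illustration rather than a claim made in the paper.

```python
MUTATIONS_PER_PACK_YEAR = 150  # approximate lung-cell figure quoted above


def estimated_mutations(packs_per_day, years):
    """Rough mutation count per lung cell, assuming linear accumulation."""
    return MUTATIONS_PER_PACK_YEAR * packs_per_day * years


print(estimated_mutations(1, 20))    # 3000
print(estimated_mutations(0.5, 10))  # 750.0
```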

Turning to tissues that are not directly exposed to smoke, things are a bit less clear. In liver and kidney cancers smokers have a bigger load of mutations than non-smokers (as in the lung). However, and somewhat surprisingly, in other smoking-associated cancer types there were no clear differences. And even odder, there was no difference in the methylation of DNA between smokers and non-smokers – that’s the chemical tags that can be added to DNA to tune the process of transforming the genetic code into proteins. Which was strange because we know that such ‘epigenetic’ changes can occur in response to external factors, e.g., diet.

What’s going on?

Not clear beyond the clear fact that tissues directly exposed to smoke accumulate cancer-driving mutations – and the longer the exposure the bigger the burden. For tissues that don’t see smoke its effect must be indirect. A possible way for this to happen would be for smoke to cause mild inflammation that in turn causes chemical signals to be released into the circulation that in turn affect how efficiently cells repair damage to their DNA.

Sir Walt showing off on his return to England

Whose fault is it anyway?

So tobacco-promoted cancers still retain some of their molecular mystery as well as presenting an appalling and globally growing problem. These days a popular pastime is to find someone else to blame for anything and everything – and in the case of smoking we all know who the front-runner is. But although Sir Walter Raleigh brought tobacco to Europe (in 1578), it had clearly been in use by American natives long before he turned up and, going in the opposite direction (à la Marco Polo), the Chinese had been at it since at least the early 1500s. To its credit, China had an anti-smoking movement by 1639, during the Ming Dynasty. One of their Emperors decreed that tobacco addicts be executed and the Qing Emperor Kangxi went a step further by beheading anyone who even possessed tobacco.

And paying the price

If you’re thinking maybe we should get a touch more Draconian in our anti-smoking measures, it’s worth pointing out that the Chinese model hasn’t worked out too well so far. China’s currently heading for three million cancer deaths annually. About 400,000 of these are from lung cancer and the smoking trends mean this figure will be 700,000 annual deaths by 2020. The global cancer map is a great way to keep up with the stats of both lung cancer and the rest – though it’s not for those of a nervous disposition!

References

Denissenko, M.F. et al. (1996). Preferential Formation of Benzo[a]pyrene Adducts at Lung Cancer Mutational Hotspots in P53. Science 274, 430–432.

Cho, Y. et al. (1994). Crystal Structure of a p53 Tumor Suppressor-DNA Complex: Understanding Tumorigenic Mutations. Science 265, 346–355.

Alexandrov, L.B. et al. (2016). Mutational signatures associated with tobacco smoking in human cancer. Science 354, 618–622.