
Beyond the genome: YOU'VE BEEN DECODED, again

Welcome to the world of the proteome

The current science is a long way from that seen in Sergei Lukyanenko’s The Genome, but still quite amazing

Spinning with the spectrometer

This is a large blue and grey box, about £250,000 worth of mass spectrometer. And look, there are lots of them, humming away, all weighing very small things. So, whilst the HGP sequenced one set of 25,000 genes, documenting the proteome takes much more work. You can start with a known cell line (which is a group of identical cells, all derived from a single, original cell).

For that cell line you can measure the presence (and quantity) of about 17,000 proteins. Then you can look at how these vary over the life cycle of the cell; if you split the life cycle up into 10 stages, then you have to do the entire set of measurements 10 times. Then you can start varying parameters such as acidity/alkalinity, glucose level and so on, and see how these affect the mix.
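To get a feel for how quickly the numbers grow, here is a back-of-envelope sketch in Python. The 17,000 proteins and 10 stages come from the text above; the condition counts (three acidity levels, four glucose levels) are invented purely for illustration.

```python
# Rough count of protein measurements for a single cell line.
PROTEINS_PER_CELL_LINE = 17_000  # figure quoted in the article
LIFE_CYCLE_STAGES = 10           # the article's example split


def measurements(proteins, stages, conditions=1):
    """Total protein measurements for one cell line."""
    return proteins * stages * conditions


# One pass through the life cycle under a single set of conditions.
per_cycle = measurements(PROTEINS_PER_CELL_LINE, LIFE_CYCLE_STAGES)

# Hypothetical grid: 3 acidity levels x 4 glucose levels = 12 conditions.
with_conditions = measurements(PROTEINS_PER_CELL_LINE, LIFE_CYCLE_STAGES, 3 * 4)

print(per_cycle)        # 170000
print(with_conditions)  # 2040000
```

Every extra parameter multiplies the total, and that is still only one cell line.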

Then you move on to the next cell line and start again. So the HPP is like the HGP on steroids (which are technically lipids but you get the idea; the HPP is much, much more complex).

The Wet process - measuring the proteins in cells

You take some cells and mash them up and extract the proteins. Then you use an enzyme to chop those long protein molecules into much shorter segments called peptides.
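The chopping step can be mimicked in software. The sketch below is an illustration only: it assumes a trypsin-like rule that cuts after lysine (K) or arginine (R), which is an assumption here because the text does not name the enzyme, and the function name `digest` and the example sequence are made up.

```python
def digest(protein, cut_after="KR"):
    """Split a protein sequence into peptides, cutting after each
    residue listed in cut_after (an assumed, trypsin-like rule)."""
    peptides = []
    current = ""
    for residue in protein:
        current += residue
        if residue in cut_after:
            # A cut site: close off the current peptide.
            peptides.append(current)
            current = ""
    if current:
        # Whatever is left after the last cut site.
        peptides.append(current)
    return peptides


# Made-up sequence, one letter per amino acid.
print(digest("MKTAYRAGLK"))  # ['MK', 'TAYR', 'AGLK']
```

Real enzymes have exceptions and miss some cut sites, so treat this as the idealised version.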

Then you put these peptides into a mass spectrometer and record the mass and electrical charge of all the peptides. What comes out is a set of XY coordinates.

It is the position on the X axis that identifies the peptide and the area under the curve that gives the amount. In practice the curves are essentially three-dimensional, so we calculate the volume, not the area.
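As a rough sketch of what "area" and "volume" mean here, the Python below uses the trapezoid rule. This is one common way to integrate sampled data, not necessarily the method used in the lab, and the function names, the second axis `zs` and the sample numbers are all hypothetical.

```python
def trapezoid_area(xs, ys):
    """Area under a sampled curve using the trapezoid rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )


def peak_volume(xs, slices, zs):
    """Volume under a sampled surface.

    slices[i] is the intensity curve over xs recorded at position zs[i]
    on a second axis. Each slice is integrated to an area, then the
    areas are integrated along the second axis.
    """
    areas = [trapezoid_area(xs, ys) for ys in slices]
    return trapezoid_area(zs, areas)


# Toy peak: a single triangular bump in the middle slice.
xs = [0, 1, 2]
zs = [0, 1, 2]
slices = [[0, 0, 0], [0, 2, 0], [0, 0, 0]]

print(trapezoid_area(xs, slices[1]))  # 2.0
print(peak_volume(xs, slices, zs))    # 2.0
```

Doing this for every peak in every run is where the big-data flavour comes from.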

For the data freaks amongst us, calculating the volume under the curve looks just like a big-data problem, which it is, and that is how I got involved. All of the difficult, wet stuff is done by life scientists, such as Professor Angus Lamond and his team. Lamond is head of the Laboratory for Quantitative Proteomics at Dundee University.

Lamond and I both work at Dundee University; we were introduced in 2009 because he was generating lots of data and was interested in processing it better and faster. Many groups around the world are working on the HPP.


Once you have identified all the peaks (and measured the volumes) you can back extrapolate and say: “If I see these peptides, then I must have had this protein in the cell.”

The problem here is that peptides are tiny fragments of the original proteins: think of the proteins as books and the peptides as sentences. So the human proteome is a library of 25,000 books. Someone has selected 17,000 of them and then chopped them all up into sentences. Our job is to identify which 17,000 books were selected, but all we get is a huge collection of sentences. If you find the sentence “It was a lovely day”, that sentence could quite possibly have come from a number of different books.
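The books-and-sentences problem can be sketched in a few lines of Python. This is a deliberately naive illustration and not the inference method any real pipeline uses: a protein counts as confirmed only if some observed peptide occurs in that protein alone. The toy library, the protein names and both function names are invented.

```python
from collections import defaultdict


def index_peptides(library):
    """Map each peptide to the set of proteins that contain it."""
    index = defaultdict(set)
    for protein, peptides in library.items():
        for peptide in peptides:
            index[peptide].add(protein)
    return index


def infer_proteins(observed, library):
    """Return (confirmed proteins, ambiguous peptides with candidates)."""
    index = index_peptides(library)
    confirmed = set()
    ambiguous = {}
    for peptide in observed:
        candidates = index.get(peptide, set())
        if len(candidates) == 1:
            # A "sentence" found in only one "book".
            confirmed |= candidates
        elif candidates:
            # The "It was a lovely day" case: several possible books.
            ambiguous[peptide] = candidates
    return confirmed, ambiguous


# Toy library: three made-up proteins and their peptides.
library = {
    "P1": ["AAK", "GGR"],
    "P2": ["AAK", "TTK"],
    "P3": ["LLR"],
}

confirmed, ambiguous = infer_proteins(["AAK", "GGR"], library)
print(confirmed)  # {'P1'}
print(ambiguous)  # AAK maps to both P1 and P2
```

Here "GGR" pins down P1, while "AAK" on its own cannot tell P1 from P2.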
