11th International Conference on Systems Biology, Edinburgh 2010

I've just returned from the 2010 International Systems Biology Conference in Edinburgh. It was an academic meeting, consisting of 4 full days of scientific presentations, many in parallel. It was attended by about 1000 scientists from around the world. Some of the talks were live-blogged here.

The definition of systems biology is a bit nebulous and controversial, but I have some definite ideas on this subject. Here is the definitive definition:

Systems biology is scientific research whose goal is a holistic, mechanistic understanding of biological systems. This goal should be achieved through multi-disciplinary research using an iterative cycle of experimental and theoretical research. The most natural framework for such research is the development of mechanistic, mathematical, dynamic simulation models and the testing of scientific hypotheses embedded in those models by rigorous comparison with (preferably a lot of) data.

My view of systems biology is strongly influenced by my background in systems engineering, which is a more developed and clearly defined area of scientific endeavour. Systems engineering shares many features with systems biology. It is typically interdisciplinary (largely because engineers are mathematicians, chemists, biologists, accountants, physicists and computer scientists all in one person) and it involves simulation modelling. It also involves iteration between predictive modelling (initially for design), experimentation (the building, operation and testing of machines or chemical plants), model refinement based on the capture of data specifically relevant to the model, and use of the model in predictive mode (process optimisation). There are two major qualitative differences between systems engineering and systems biology. 1) The physicochemical observations required for parameterising and validating engineering models are much easier to make or estimate than the equivalent biological observations. 2) Engineering models are built with very specific goals in mind (e.g. to maximise the speed, production rate, performance or safety of a machine or plant), which can all be reduced to attempts to make as much money as possible.

The first difference makes systems biology more challenging and interesting than systems engineering. However, the second difference is the more striking and informative, particularly in light of the current state of systems biology as presented at the ICSB meeting. This current state is a bit different to my ideal of how systems biology should work, as outlined above. During the meeting I was struck by several examples of people gathering data and performing analysis with no underlying mechanistic modelling framework. Eric Werner seemed to want complete virtual organisms, capable of exact reproduction of living systems, but I think that this approach would be very inefficient, and that useful models are not complete, but rather focussed on a particular biological problem or question. There's nothing wrong with high-throughput, hypothesis-free research (I do some of this kind of work myself), but it's not systems biology. An interesting example of this confusion was Mike Tyers' complaint that the Tyson-Novak model of the cell cycle cannot explain the variation observed by Costanzo et al. in their 2010 Science paper. The comparison is simply unfair. The Tyson-Novak model is elegant, relatively simple and focussed on a specific aspect of yeast biology: the cell cycle. It increases our understanding by highlighting the important aspects of a complex system that make it work the way it does. The Costanzo data is a high-throughput dataset, captured in a completely hypothesis-free fashion, and is affected by all aspects of yeast biology.

I have been interested in systems biology and attending systems biology conferences for about 10 years now, and I think that as time goes on the systems biology community is straying from mechanism- and model-based research towards high-throughput data capture and analysis. This kind of work can be exciting too, but I think that it's unhelpful to classify it as systems biology. High-throughput data analysis can provide an excellent starting point for the systems biology cycle but I would prefer it was classified as genomics and bioinformatics. I think that the meeting would have been better had the organisers selected more ruthlessly for mechanistic modelling talks. It would have been a shorter and smaller meeting and perhaps even not as high profile, but I think it could have been more useful and exciting.

Despite my negative rantings, there were many excellent talks that I really enjoyed, only a few of which I had seen before. In no particular order, these included:

Michael Stumpf's talk on using Approximate Bayesian Computation to infer parameter distributions (and model robustness) for a model describing the movement of macrophages towards wound sites in zebrafish. Tested whether macrophage walks are random. Observed a constrained "drunken" walk: constrained to a line, but moving randomly forwards or backwards. The data carry different amounts of information about different model parameters; some parameters are "sloppy". Signal transduction detail is largely irrelevant.
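To give a flavour of the general idea, here is a minimal sketch of rejection-based Approximate Bayesian Computation. It is not the model or algorithm from the talk. It assumes a toy one-dimensional walk in which each step goes forwards with an unknown probability, and the function names, the observed count of 65 forward steps out of 100, the uniform prior and the tolerance are all made up for illustration.

```python
import random


def simulate_forward_steps(p_forward, n_steps, rng):
    """Simulate a walk along a line and count how many steps went forwards."""
    return sum(1 for _ in range(n_steps) if rng.random() < p_forward)


def abc_rejection(observed, n_steps, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lands within tolerance of the data."""
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw from a uniform prior on [0, 1) (an assumption)
        simulated = simulate_forward_steps(p, n_steps, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted


rng = random.Random(0)
# Hypothetical data: 65 forward steps observed out of 100.
posterior = abc_rejection(observed=65, n_steps=100, n_draws=20000, tolerance=2, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
print(len(posterior), round(posterior_mean, 2))
```

The accepted values approximate the posterior distribution of the forward-step probability. A wide spread of accepted values for a parameter would be one sign that the data say little about it, which is roughly what "sloppy" means here.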

Thomas Pollard's very pro-dynamic-modelling, pro-mechanism talk about actin motility. "You can't understand complicated biology without mathematical models". "Passionate about using time as a variable in experiments". "Without quantitative rate parameter measurements you are lost". Described careful calibration of GFP measurements to protein number. Estimated reaction and diffusion rates. Demonstrated that assuming mass-action kinetics in a well-stirred vessel is sometimes inappropriate, and that a spatial dimension is needed. Described careful selection of the GFP tag to ensure no effect on growth rates.

Luis Serrano's talk presenting the huge amount of diverse data captured describing one small bacterium. Spent a lot of time demonstrating that mRNA and protein levels are not correlated, but this is well known at this stage.

Michaela de Clare and Annette Alcasabas presented a focussed combination of modelling and experimental work on a switchable tetraploid yeast system, where each of the 4 copies of the relevant genes can be switched on and off at will. This gives the equivalent of haploid, diploid, triploid and tetraploid yeast, a sort of more controlled method of overexpression. The modelling work predicted mutant growth rates, which were then compared with the observed rates.

Douglas Murray presented a continuous yeast culture system, which keeps the culture out of stationary phase while holding the cell concentration constant. These cells begin to oscillate in oxygen uptake, growth and transcription with a period of about 40 minutes. He uses his observations to validate the Manchester yeast metabolism model.

Franziska Matthaeus described work on the tumble-run decision in bacteria. Exponentially distributed run lengths give a random walk, but power-law distributed run lengths can give a Lévy walk, which is qualitatively different and explores space more rapidly. A power-law distribution can be constructed from the sum of exponential distributions with different parameters. Showed that a Gaussian parameter distribution gives a power-law run-length distribution and therefore better space coverage, making this an example of noise giving a selective advantage. Showed great 2D simulations of bacterial motion in response to a nutrient gradient (the random walk is slower but more adaptive and precise than the Lévy walk) and discussed the mechanisms underlying run-length distributions.
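The point about building a power law from exponentials can be checked numerically. The sketch below is my own toy example, not the model from the talk. Instead of a Gaussian parameter distribution it assumes that the tumbling rate of each run is drawn from a gamma distribution, chosen only because the resulting mixture then has an exact power-law tail, P(T > t) = (beta / (beta + t))^alpha. The values of alpha and beta are arbitrary.

```python
import math
import random


def mixed_run_length(alpha, beta, rng):
    """Draw a rate from Gamma(shape=alpha, rate=beta), then an exponential run length."""
    rate = rng.gammavariate(alpha, 1.0 / beta)  # second argument is a scale, hence 1/beta
    return rng.expovariate(rate)


def survival(samples, t):
    """Empirical fraction of run lengths longer than t."""
    return sum(1 for x in samples if x > t) / len(samples)


rng = random.Random(1)
alpha, beta = 1.5, 1.0  # arbitrary illustrative values
runs = [mixed_run_length(alpha, beta, rng) for _ in range(100000)]

for t in (1.0, 10.0, 40.0):
    power_law = (beta / (beta + t)) ** alpha          # exact tail of the mixture
    single_exp = math.exp(-t * (alpha - 1.0) / beta)  # one exponential with the same mean
    print(t, round(survival(runs, t), 4), round(power_law, 4), single_exp)
```

The empirical tail should track the power-law column closely, while a single exponential with the same mean run length predicts far fewer long runs. Those rare long runs are what turn a random walk into something closer to a Lévy walk.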

Darren Wilkinson presented work demonstrating the importance of modelling the measurement process when using GFP tags to infer stochastic model parameters. Measuring protein levels directly is best, but not practical. Explicitly modelling the connection between GFP observations and protein levels is the way to do things. The alternative (common) way to proceed is to use GFP as a surrogate for protein, with perhaps a fixed proportionality constant connecting the two. This gives poor results, since you are lumping measurement error and intrinsic model stochasticity together when they are in fact independent. The relation between GFP and protein levels was a pretty major theme throughout the meeting.
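A toy calculation shows why lumping the two noise sources together is a problem. This is only a sketch under assumptions of my own, not the inference method from the talk: protein counts are taken to be Poisson distributed, the GFP signal is a fixed multiple of the count plus independent Gaussian measurement noise, and all of the numbers are hypothetical.

```python
import math
import random


def poisson(mean, rng):
    """Knuth's multiplication method for a Poisson draw (fine for moderate means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


rng = random.Random(2)
mean_protein = 50.0    # hypothetical mean protein count
gfp_per_protein = 2.0  # hypothetical fixed proportionality constant
noise_sd = 20.0        # hypothetical measurement noise, in GFP units

protein = [poisson(mean_protein, rng) for _ in range(20000)]
gfp = [gfp_per_protein * n + rng.gauss(0.0, noise_sd) for n in protein]

intrinsic = variance(protein)                         # the variability we want
naive = variance([g / gfp_per_protein for g in gfp])  # GFP used as a surrogate
corrected = naive - (noise_sd / gfp_per_protein) ** 2  # measurement noise removed
print(round(intrinsic, 1), round(naive, 1), round(corrected, 1))
```

Treating rescaled GFP as if it were protein inflates the apparent intrinsic variability by the measurement variance. Because the two sources are independent their variances add, so they can only be separated if the measurement process is modelled explicitly.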