Toward a comprehensive language for biological systems
Rule-based modeling has become a powerful approach for modeling intracellular networks, which are characterized by rich molecular diversity. Truly comprehensive models of cell behavior, however, must address spatial complexity both at the intracellular level and at the level of interacting populations of cells, and will require richer modeling languages and tools. A recent paper in BMC Systems Biology represents a significant step toward the development of a unified modeling language and software platform for the development of multi-level, multiscale biological models.
See research article: http://www.biomedcentral.com/1752-0509/5/166
Keywords: Unify Modeling Language, System Biology Markup Language, Stochastic Simulation Algorithm, Reaction Rule, Maturation Promote Factor
Modeling for biologists?
In his essay 'Can a biologist fix a radio?' molecular biologist Yuri Lazebnik highlighted the absurdity of some kinds of informal reasoning that pervade biology, and called for the development of biologist-accessible (if not exactly friendly) languages to promote more formal approaches to reasoning about and prediction of the behavior of molecular networks inside cells. Although he suggested that the rise of systems biology might force biologists to change quickly, it is still a safe bet nearly a decade later that most experimental biologists are unlikely to be familiar with modeling and related software tools, let alone using them. This is despite the rapid rise of genomics and bioinformatics that has made the use of bioinformatics tools, such as BLAST, an essential part of training and practice.
Because of the development of general-purpose rule-based languages and simulators, it is now possible to construct biochemical models of an arbitrary number of network components at a high level of resolution and to simulate the model in a reasonable amount of time on a desktop computer [3, 7, 8]. Many challenges remain, not the least of which are making the tools more accessible to bench biologists and, perhaps more important, fostering a culture in which modeling is more commonly used as a reasoning aid. In the near future I envision that biologists will be able to construct models using tools very similar to those that are used to search the literature and online knowledge bases, and they will be able to use these models to predict the outcome of possible experiments and to gain insight into the possible mechanism through which the predicted effect may arise. Even researchers with limited mathematical or computational experience should be able to engage fully in the productive cycle of experimentation followed by modeling followed by further experimentation.
To summarize up to this point, rule-based modeling now provides a scalable way to model the complex molecular biochemistry that is employed by cells to process information. Incorporating such models into everyday study of signaling systems could have a profound impact on molecular biology. So far, however, I have considered only what goes on inside cells when they are treated as well-mixed chemical bags, and not their internal organization or how they interact with each other, which of course is fundamental to biology. Furthermore, a fundamental challenge in biology, to understand the genetic basis of phenotype, requires coupling predictive models of intracellular biochemistry with models of higher levels of organization - cells, tissues, organs, and so on - in a bi-directional way. Since its inception, however, systems biology has been more oriented toward the molecular, intracellular level, which is reflected in the fact that most of the modeling tools that have been developed are aimed at the development of chemical network models and do not provide capabilities for constructing models that span multiple levels of resolution. For example, standardized exchange formats for systems biology models, such as Systems Biology Markup Language (SBML) [9], do not readily support such embedding.
Both forms of causation may be concisely represented in ML-Rules (Figure 2b, c), allowing for the specification of multi-level models. For example, a population of interacting cells may be modeled as a collection of cells, each of which contains a collection of molecules that interact via globally defined rules. The movement, growth, and division of cells may be defined by rules that act at the cell level, whereas binding, uptake, and secretion of molecules may be defined by rules that span the cell and molecular levels. The description of the intracellular level could be further refined by inclusion of such processes as endocytosis and nuclear import/export, which would also require additional levels of representation for endosomes, nucleus, and so on.
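The nesting described above can be illustrated with a minimal Python sketch. This is not ML-Rules syntax and not the implementation by Maus et al.; the class names, the species names `L`, `R` and `LR`, and the division threshold are all hypothetical, chosen only to show one rule at the molecular level, one that spans the cell and molecular levels, and one at the cell level.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cell:
    """A cell-level entity that contains a multiset of molecules."""
    volume: float
    molecules: Dict[str, int] = field(default_factory=dict)


@dataclass
class Population:
    """Top level: molecules in the medium plus a collection of cells."""
    medium: Dict[str, int] = field(default_factory=dict)
    cells: List[Cell] = field(default_factory=list)


def bind(cell: Cell) -> bool:
    """Molecular-level rule (hypothetical): L + R -> LR inside one cell."""
    m = cell.molecules
    if m.get("L", 0) > 0 and m.get("R", 0) > 0:
        m["L"] -= 1
        m["R"] -= 1
        m["LR"] = m.get("LR", 0) + 1
        return True
    return False


def uptake(pop: Population, cell: Cell) -> bool:
    """Rule spanning two levels: move one L from the medium into a cell."""
    if pop.medium.get("L", 0) > 0:
        pop.medium["L"] -= 1
        cell.molecules["L"] = cell.molecules.get("L", 0) + 1
        return True
    return False


def divide(pop: Population, cell: Cell, threshold: int = 2) -> bool:
    """Cell-level rule (hypothetical): a cell holding at least `threshold`
    LR complexes splits, sharing volume and molecules with a daughter."""
    if cell.molecules.get("LR", 0) < threshold:
        return False
    daughter = Cell(volume=cell.volume / 2)
    for name, count in list(cell.molecules.items()):
        half = count // 2
        daughter.molecules[name] = half
        cell.molecules[name] = count - half
    cell.volume /= 2
    pop.cells.append(daughter)
    return True
```

Because the same `bind` rule applies to every cell, adding a further level (for example, a nucleus inside each cell) would mean nesting another container rather than rewriting the molecular rules, which is the kind of refinement the paragraph above describes.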
ML-Rules is the first fully implemented rule-based modeling language that has been described in the literature and is capable of integrating detailed molecular biochemistry into multi-level models. The hierarchical representation used in ML-Rules is related to a more general formulation called reactive bigraphs, which also uses a nested object hierarchy and reaction rules to represent the interactions that can take place in a complex network. Several biological languages based on reactive bigraphs have been proposed (for example, [12]), but software implementations have so far not been presented.
There are, however, other general-purpose tools available for the integration of rule-based biochemistry, as described above, into multi-level models (for example, [13, 14]). These tools use different mathematical and computational models to describe the dynamics at each level of the system, and can in this sense be termed 'heterogeneous'. Of these, the most accessible for a general audience is probably the Simmune platform, which has a graphical interface that integrates all stages of modeling from model construction to data analysis and allows embedding of rule-based biochemical descriptions into cellular agents. There are also other general-purpose tools for multi-level modeling; an example is CompuCell3D, which allows reaction networks described in SBML to be embedded in cellular simulations of varying sophistication but cannot yet handle rule-based specifications of the biochemistry.
The cell cycle example presented by Maus et al. could probably be implemented in each of the heterogeneous tools mentioned, as well as others. Each of these implementations, however, would likely be more difficult to understand and less flexible than the corresponding ML-Rules implementation because of the lack of a unifying language and adherence to a pre-defined level hierarchy. In most of the current frameworks, models are specified in the form of plain-text files and/or high-level program code in languages such as Python and C++. The embedding of levels is either fixed or achieved through calls to specific functions in a programming library. ML-Rules, on the other hand, provides a formal biological language for expressing all parts of the model. The number of levels and the physical model for simulating each level can be changed by refactoring the rules.
The flexibility of ML-Rules does come with a cost, however. Describing higher-level processes such as cell division with rules requires some sophistication on the part of the modeler; it is not simply a matter of translating knowledge about a specific molecular interaction into a rule. Such barriers could be overcome by defining rule templates that a modeler can use for specific types of behavior and by creating libraries, but it remains to be seen whether the heterogeneous approaches mentioned above or the unified approach taken by ML-Rules provides a better basis for the development of intuitive modeling tools for the biologist. Simulation efficiency is also an issue that needs to be addressed before more realistic applications are possible. The stochastic simulation algorithm implemented in the current version of ML-Rules is limited to relatively small populations of cells. Although no direct performance comparisons have been carried out, heterogeneous tools, which usually have highly optimized simulators, are probably capable of performing much larger-scale simulations of the same system.
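For readers unfamiliar with stochastic simulation, the following Python sketch shows a generic direct-method (Gillespie-style) simulation loop for a flat, well-mixed reaction system. It is only an illustration of the general technique, not the algorithm as implemented in ML-Rules; the function name `gillespie`, the reversible binding reactions, and the rate constants are assumptions made for the example.

```python
import random
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
# A reaction pairs a propensity function with the state change it causes.
Reaction = Tuple[Callable[[State], float], Dict[str, int]]


def gillespie(state: State, reactions: List[Reaction], t_end: float,
              rng: random.Random) -> State:
    """Simulate until t_end and return the final copy-number state."""
    state = dict(state)  # work on a copy so the caller's state is untouched
    t = 0.0
    while True:
        # Recompute every propensity after every event (naive but simple).
        props = [rate(state) for rate, _ in reactions]
        total = sum(props)
        if total <= 0.0:
            break  # no reaction can fire any more
        # The waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose the firing reaction with probability proportional
        # to its propensity.
        index = rng.choices(range(len(reactions)), weights=props)[0]
        for name, delta in reactions[index][1].items():
            state[name] += delta
    return state


# Hypothetical example: reversible binding L + R <-> LR with made-up rates.
reactions: List[Reaction] = [
    (lambda s: 0.01 * s["L"] * s["R"], {"L": -1, "R": -1, "LR": +1}),
    (lambda s: 0.1 * s["LR"], {"L": +1, "R": +1, "LR": -1}),
]

final = gillespie({"L": 100, "R": 80, "LR": 0}, reactions, 5.0,
                  random.Random(0))
```

In this naive form every event triggers a full recomputation of all propensities, so the cost grows with both the number of reactions and the number of events. That gives some intuition for why simulating many cells, each with its own copy of the biochemistry, becomes expensive without further optimization.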
In search of the Killer App
What is needed for dynamical systems modeling of the type enabled by tools discussed here to take off among experimental biologists? Lowering the barrier to using tools and to using existing knowledge to create models is clearly a key requirement. At the level of molecular biochemistry, rule-based modeling represents a key conceptual advance, although much work needs to be done to make it more broadly accessible. Languages for describing multi-level models are going to take more work and time because of the inherent complexity of the challenge, in terms of both representation and simulation. Finding the right balance of flexibility and simplicity is difficult.
What is probably more critical for wider adoption, however, is the demonstration that these types of models can lead to new discoveries that could not otherwise be made - a 'Killer App'. It could take the form of a model that a large community of biologists adopts for the study of a specific system - for example, yeast pheromone signaling, the cell cycle, or bacterial chemotaxis. Such an example could be instrumental in convincing biologists to make rule-based modeling part of their standard toolkit for fixing radios.
I acknowledge support from NIH grant 5UL1RR024153-05 and NSF grant CCF 0829788 and an NSF Expeditions in Computing grant (award ID 0926181). Thanks to Bill Hlavacek, Leonard Harris, Justin Hogg, John Sekar, Michael Sneddon and Carsten Maus for constructive comments on the manuscript.
9. Hucka M, Finney A, Sauro HM, Bolouri H, Doyle JC, Kitano H, Arkin AP, Bornstein BJ, Bray D, Cornish-Bowden A, Cuellar AA, Dronov S, Gilles ED, Ginkel M, Gor V, Goryanin II, Hedley WJ, Hodgman TC, Hofmeyr JH, Hunter PJ, Juty NS, Kasberger JL, Kremling A, Kummer U, Le Novère N, Loew LM, Lucio D, Mendes P, Minch E, Mjolsness ED, et al: The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003, 19: 524-531. 10.1093/bioinformatics/btg015.
12. Damgaard TC, Danos V, Krivine J: A language for the cell. Technical Report TR-2008-116. 2008, IT University of Copenhagen.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.