We already have several hundred models of metabolic networks in various species, and dozens of models of gene regulatory networks in different organisms and cell types. However, the existing metabolic models focus mainly on stoichiometry, while the gene regulatory models are often static and lack quantitative features. Although only two whole-cell models have been published so far, labs worldwide are actively developing more.

Here’s the catch: the existing whole-cell models have been constructed by piecing together different signalling and gene regulatory models without a coherent, integrated framework. Neither traditional modelling approaches nor manual model development can deliver the kinds of models we need. We’re talking about models with thousands of equations and extremely high-dimensional parameter spaces. Models of this scale are essential for applying engineering design principles in biotechnology and synthetic biology.

Our goal? To automate the generation of large-scale computer models of biological cells by building a pipeline that combines information and text mining, systematic knowledge representation, automated exploration of model spaces, statistical model calibration, and model visualization, curation, and dissemination.