European initiatives improve the dissemination of bioinformatics

BioModels, the world's first database of annotated biological models, is the result of a collaborative project led by the European Bioinformatics Institute and the SBML Team, an international group that develops open-source standards to describe biological systems.

Even the simplest living organisms perform a mind-boggling array of different processes that are interconnected in complex ways to ensure that the organism responds appropriately to its environment.

One of the best ways of ensuring that we really understand how these processes fit together is to build computer models of them. If a computer model behaves differently from the real organism, we know that an important component of the system has been neglected.

Quantitative models can also reveal previously unappreciated properties of complex systems, paving the way towards new drug treatments. This approach, known as computational systems biology, is becoming increasingly popular now that scientists are accumulating detailed parts lists for many organisms, thanks to genome sequencing projects and other efforts to comprehensively document the components of living entities.

“Until now, computer modellers had no defined way of exchanging descriptions of biological systems, and there was no accepted place to deposit and share new models when they were developed,” explained the EBI's Nicolas Le Novère. “The BioModels database aims to address these issues.”

The first step was to develop a standard way of describing such models. The Systems Biology Markup Language (SBML), an open-source computer language developed by the SBML Team, is now widely accepted and is supported by over 75 different software systems worldwide.

This allows computational systems biologists to write models using the tool of their choice, and then to share them so that others can build on their work.

Michael Hucka of the California Institute of Technology continues: “The next logical step was to build a community resource that would allow anyone to submit, download and reuse the models. That's the purpose of the BioModels database. BioModels provides access to published, peer-reviewed, quantitative models of biochemical and cell-biological systems.”

Some of these systems are very simple, containing just a few processes or reactions; others contain hundreds. The models are checked to verify that they correspond to the reference publication. Human curators annotate and cross-link components of the models to other relevant data resources. This allows users to identify precisely the components of models, and helps them to retrieve appropriate models, which they can then visualise and build upon using any SBML-compatible software.

“Ultimately,” said Le Novère, “we hope that publishers will encourage any author who plans to publish a new model to submit it to the BioModels database; this will ensure that all the models in the public domain are freely available for everyone to make the most of them.”

The BioModels database is freely available at www.ebi.ac.uk/biomodels.

The EBI is part of the European Molecular Biology Laboratory (EMBL) and is located on the Wellcome Trust Genome Campus in Hinxton near Cambridge in England. It hosts some of the world's most important collections of biological data, including DNA sequences (EMBL-Bank), protein sequences (UniProt), animal genomes (Ensembl), three-dimensional structures (the Macromolecular Structure Database) and data from microarray experiments.

Other contributors to this project include the Keck Graduate Institute (USA), the Systems Biology Institute (Japan) and Stellenbosch University (South Africa).

€8m for informatics

Meanwhile, the Commission of the European Union has awarded €8.3 million to a pan-European task force that will improve access to biological information for scientists throughout and beyond Europe.

The EMBRACE Network of Excellence, which encompasses computational biologists from 17 institutes in 11 countries and is coordinated by the European Bioinformatics Institute's associate director Graham Cameron, will use these funds to simplify and standardise the way in which biological information is served to the researchers who use it.

Scientists now depend on databases to access the avalanche of information that they produce. For example, geneticists are trawling through the human genome for genes that are involved in diseases.

Data providers put a huge amount of effort into providing data resources that are comprehensive, user-friendly and cross-linked to other databases; but different data providers use different methods. This means that a researcher might have to search 10 or more different databases to find all the information pertaining to a particular set of candidate genes. If they are doing these kinds of searches on a regular basis, they will want their own local copies of the databases. Maintaining up-to-date and fully functioning versions of all those databases and the tools to search them is a huge and complex task.

Vincent Breton of the CNRS in Clermont-Ferrand, France, is a member of EMBRACE's executive board. He describes the problem as analogous to the use of electrical items before the electrical grid. “You didn't know whether your gadget's plug would fit the socket,” he said.

EMBRACE intends to turn the relationship between user and provider on its head by enabling data providers to provide well-defined interfaces to their databases that will conform to the same standards, essentially creating a data grid (the EMBRACEgrid) that will allow users to make the most of dispersed data resources.

To ensure that EMBRACE's efforts are immediately useful to biologists, Europe's most heavily used biomolecular databases and tools will be integrated into the EMBRACEgrid.

A technology watch will ensure that the EMBRACEgrid does not become locked into technology that is quickly superseded. The grid will also receive regular workouts using test problems, such as identifying candidate genes for a disease or linking viral mutations to their ability to cause disease.

Disseminating information about the EMBRACEgrid will be vital to ensure that scientists throughout Europe not only use the new technology, but also help to expand its capabilities by grid-enabling their own data resources.

“Many elegant and powerful computational biology tools are under-utilised,” said EMBRACE executive board member Erik Bongcam-Rudloff from the University of Uppsala in Sweden. “EMBRACE will allow us to unlock their potential by standardising access to them.”
