BioMed Central Blog

Tuesday May 11, 2010

On its 10th anniversary Ensembl publishes a thematic series with the BMC-series

The last 15 years have seen an explosion in genomic research and in the number of sequenced genomes. With the build-up to sequencing larger chordate genomes, it became clear that manually annotating the billions of base pairs of sequence produced was not practical and that automated annotation systems were required. Several large organisations have helped address this issue, but the Ensembl project, a joint venture between the European Bioinformatics Institute and the Wellcome Trust Sanger Institute, has in particular provided high-quality integrated annotation on vertebrate genomes within a consistent and open-source infrastructure. This year marks the 10th anniversary of the Ensembl project’s launch, and BioMed Central is today publishing a thematic series of articles describing the construction, content and current use of Ensembl's resources.

The first six articles, published today in BMC Bioinformatics and BMC Genomics and co-ordinated by Paul Flicek at Ensembl and the European Bioinformatics Institute, reveal in detail how many of the comparative genomics, variation and regulatory data resources have been constructed. The first article describes the comprehensive web-based functions available for tabulating and visualizing genome variants. A second, related article discusses the database and software library supporting the integration of variation data into the existing Ensembl resources.

To keep up with the ever-increasing number of genomes reported (51 in the last release), Ensembl has had to use automated workflow systems. Jessica Severin and colleagues present an artificial intelligence pipeline, ‘eHIVE’, based on a self-organizing workflow system akin to the behavior of honey bees, which is used to update Ensembl's comparative genomics resources. Benoît Ballester and colleagues also demonstrate how the Ensembl microarray annotation protocol handles the release of the latest commercial arrays. To keep pace with both the increasing demands of users and the terabytes of data now available from the website, Anne Parker and colleagues show how they use caching and optimization techniques alongside Web 2.0 technologies to improve the performance of the Ensembl website.

A final article by Giulietta Spudich and Xosé Fernández-Suárez uses several examples to offer a practical guide for using Ensembl to learn about genomic annotations in regions of interest.


While most of this detailed “behind the scenes” information will not significantly alter the way users access genomic data, it guides molecular biologists to the full range of tools available to them. It will also be of great value to researchers building other bioinformatics applications, and it demonstrates how Ensembl is constantly adapting and updating its tools to prepare for its next decade.

Scott Edmunds

Senior Scientific Editor, BMC Series journals


 
