GigaBlog

GigaScience, Giga-database and now GigaBlog: new resources for the big-data community
As biological data is
now produced faster than it can easily be handled and stored, the
dissemination of this data has become a major bottleneck. GigaScience, a new type of journal from BioMed Central and BGI
(no stranger to these issues as the world’s largest genomics center),
starts taking submissions today with the goal of addressing many of
the issues surrounding “big-data”. Much of the rationale for, and many of the features of,
the GigaScience journal and its associated database are presented on our website.
But with a scope that covers any biological and biomedical
“large-scale” data (and the “(Giga)n” refers to gigantic rather than a
specific number), one important question is how exactly we are defining
“large-scale”. The answer, unfortunately, is: it depends.
What
makes something big-data varies greatly from field to field, and also
changes rapidly with technological developments, so this is a question
we will be regularly asking our editorial board and scientists in
different research communities. But, to keep our readers and authors
updated, rather than constantly changing this information in our instructions for authors,
we feel a blog makes a better forum for this type of open-ended
discussion. We also hope to hear your thoughts on what
constitutes “big” data, especially in areas that are not
generally thought of as having large-scale data resources, such as
cellular development with its myriad imaging data types, neuroscience
and electrophysiology, and cohort studies whose metadata raises many
permissions issues that still need to be discussed and resolved.
Launching our first post here, and as a guest on the BMC blog,
we’d like to welcome you and hope our future blog discussions will
supplement and enhance the content of the journal. Upcoming postings
will provide updates on the progress of the journal up to its formal
launch in November, introduce the editors and editorial board, report on
conferences, and provide news on the many current issues surrounding
the handling and use of large-scale data and high-throughput biology.
The blog will also highlight interesting datasets deposited in our
database and new types of large-scale data from different, potentially
unexpected, biological fields.
As part of our prelaunch activities, GigaScience has just released its first datasets
that are marked with a citable DOI and have no restrictions on use.
These datasets include the sequence and assembly data from the recent
deadly outbreak strain E. coli O104
from BGI and the University Medical Centre Hamburg-Eppendorf, as well
as data from 7 large vertebrates sequenced for the Genome10K project, a worldwide
collaborative effort to sequence 10,000 vertebrate genomes. These data
include the Giant Panda, the Chinese Rhesus and Crab-Eating Cynomolgus Macaques,
the Polar Bear, the Emperor and Adelie Penguins, and the Domestic Pigeon.
The usefulness of this novel method of rapid data release, prior to
manuscript publication, is exemplified by the recent release of the
E. coli O104 data as it was being created; this resulted in immediate
“crowd-sourced” analysis of the data by the research community and has
already aided the fight against this deadly outbreak.
We
want to give special thanks to the international group of researchers
who took this important step toward finding the best means to balance
the larger community’s need for access to the data with the researchers’
need to obtain credit for their work. Additionally, we would like to thank
BGI and BMC for their support and help in setting up this venture. We’d
also like to thank DataCite and the British Library for working to provide DOIs for our associated datasets, and ISA-Tab
for helping to standardize our data-submission system, making
it more adaptable and ISA-Tab compliant. We’d also like
to thank our growing editorial board for their (present and future)
support.
We are excited about this new endeavor and are looking
forward to working with the entire community to speed research, push
open access, and aid in making these important resources permanently
available for use and reuse.
Laurie Goodman, Editor-in-Chief
Scott Edmunds, Editor
Alexandra Basford, Assistant Editor
Follow @GigaScience on Twitter
Posted at 05:19PM Jul 06, 2011 by Gabriella Anderson in General