When will ‘open science’ become simply ‘science’?

A great commentary piece by Mick Watson (@BioMickWatson) in Genome Biology where he discusses the six O's (Open data, Open access, Open methodology, Open source, Open peer review, and Open education). On the first of these, open data:

…it is no longer acceptable for scientists to hold on to data until they have extracted every last possible publication from it. The data do not belong to the scientist, they belong to the funder (quite often the taxpayer). Datasets should be freely available to those who funded them. Scientists who hoard data, far from pushing back the boundaries of human knowledge, instead act as barriers to discovery.

Amen.

101 questions with a bioinformatician #26: Kerstin Howe

Kerstin Howe is a Senior Scientific Manager, leading the Genome Reference Informatics group at the Wellcome Trust Sanger Institute. As part of the Genome Reference Consortium (GRC), Kerstin’s group is helping ensure that “the human, mouse and zebrafish reference assemblies are biologically relevant by closing gaps, fixing errors and representing complex variation”.

This important work entails the generation of ‘long range’ information (sequencing and optical mapping) for a variety of genomes and using that information to provide genome analyses, visualise assembly evaluations, and curate assemblies. You may also wish to check out 101 questions interviewee #3 (Deanna Church), another key player in the GRC.

Kerstin is not the first ‘101 Questions’ interviewee I know from my time working on the WormBase project. Unlike interviewee #23 though, I did have the pleasure of sharing an office with Kerstin (or WBPerson3103, as she will forever be known in WormBase circles) during my time at the Sanger Institute. It was after leaving WormBase that she became a big fish (of a little fish) in the vertebrate genomics community. And now, on to the 101 questions…

How do we assess the value of biological data repositories?

For the most part, model organism databases such as WormBase, FlyBase, SGD etc. offer an invaluable and indispensable resource to their respective communities. These sites exist because funding agencies recognize their essential nature, and they are typically funded on five-year grants.

There are many other bioinformatics resources out there, most of which have probably been useful to some people at some time or other. But how should we decide what is useful enough to merit continued funding? It's a very tricky question, and one that Todd Harris is starting to explore on his blog:

In an era of constrained funding and shifting focus, how do we effectively measure the value of online biological data repositories? Which should receive sustained funding? Which should be folded into other resources? What efficiencies can be gained over how these repositories are currently operated?

Worth a read. I've written before about the issue of whether there are too many biological databases, especially when multiple databases emerge with heavily overlapping areas of interest. I think funding agencies and journals should think carefully before supporting/publishing new resources without first really establishing:

  1. Is this really needed?
  2. Does it overlap with other existing resources?
  3. What will happen to the resource should funding and/or personnel not be available to keep it going?