101 questions with a bioinformatician #29: Jane Loveland
This post is part of a series that interviews some notable bioinformaticians to get their views on various aspects of bioinformatics research. Hopefully these answers will prove useful to others in the field, especially to those who are just starting their bioinformatics careers.
Jane Loveland is a Senior Computer Biologist at The Wellcome Trust Sanger Institute where she is involved in a number of key projects relating to genome annotation and training.
As a manager in the HAVANA group (Human and Vertebrate Analysis and Annotation), she helps oversee the valuable work in using manual annotation to provide a reference gene set for the human, mouse, and zebrafish genomes. HAVANA's annotation is made publicly available via the Vega genome browser, and is in turn merged with the annotation in Ensembl to produce the reference GENCODE gene set.
Jane also leads a team of instructors for Wellcome Trust Advanced Courses, who teach workshops all over the world, in particular the Open Door Workshops:
The Open Door Workshop provides an introduction to bioinformatics tools freely available on the internet, focussing primarily on the Human Genome data. The workshops provide hands-on training in the use of public databases and web-based sequence analysis tools, and are taught by experienced instructors.
And now, on to the 101 questions...
001. What's something that you enjoy about current bioinformatics research?
The speed of change. From an annotation viewpoint, we constantly have to find ways to use new data sources, which in turn add value to the annotation that we produce.
When I’m putting together a manual for a workshop I have to update everything, every time. I have come into bioinformatics from wet lab biochemistry/molecular biology and I once spent an entire week hand-crafting a multiple alignment figure for my thesis. I can do this in a few minutes now.
010. What's something that you don't enjoy about current bioinformatics research?
Everyone assumes that all genome sequences are 'finished' (KRB: I don't!). They may be sequenced but the quality is often pretty poor compared to the sequence that we were producing at the Sanger Institute about a decade ago.
You can’t interpret what’s going on in a genome if the underlying reference sequence is of poor quality. I do a lot of teaching and spend a lot of time explaining this to researchers.
011. If you could go back in time and visit yourself as an 18-year-old, what single piece of advice would you give yourself to help your future bioinformatics career?
Just go for it. Bit of a cliché I know. I had a crippling lack of confidence when I was younger which I think really held me back.
100. What's your all-time favorite piece of bioinformatics software, and why?
For annotation: Blixem. This is an interactive graphical BLAST viewer — old but essential for gene annotation. Means that I can view alignments to the genome at base pair level really quickly and simply.
For workshops: Ensembl. You have to be able to browse a genome.
101. IUPAC describes a set of 18 single-character nucleotide codes that can represent a DNA base: which one best reflects your personality, and why?
Can I have I for inosine? Reminds me of making degenerate primers for PCR. It's a multi-tasker, which is also how I see myself. It's not on the list though (KRB: everyone keeps breaking the rules!).