Is there an important difference between bioinformatics and computational biology?
Richard Edwards gives us a thoughtful, expressive, and rather eloquent take on this very important topic.
This post is part of a series that interviews some notable bioinformaticians to get their views on various aspects of bioinformatics research. Hopefully these answers will prove useful to others in the field, especially to those who are just starting their bioinformatics careers.
Sarah Teichmann is a Group Leader at the European Bioinformatics Institute and a Senior Group Leader at the Wellcome Trust Sanger Institute — the Genome Campus (at Hinxton, UK) is one of those strange places where you can walk 10 meters and become a different (and more senior) person!
Her research focuses on elucidating the principles of protein structure evolution, higher order protein structure and protein folding. She also has a longstanding interest in understanding gene expression regulation. As part of her work, she is involved with developing and maintaining a number of useful bioinformatics resources including the 3D Complex database.
Sarah was a recent recipient of the prestigious European Molecular Biology Organization (EMBO) Gold Award for her use of 'computational and experimental methods to better understand genomes, proteomes and evolution'. She was also recently interviewed by CrossTalk (the blog of Cell Press): The Unstoppable Sarah Teichmann on Programing, Motherhood, and Protein Complex Assembly. I particularly liked Sarah's general advice to junior scientists:
Follow your heart and work on things you are excited about and enjoy. Life is too short—and academic careers too unpredictable—to settle for anything less. Try to work with people who are reasonable and considerate of others, yet driven and focused, and generous in investing time and resource to projects and careers of lab members and colleagues.
You can find out more about Sarah by visiting her group's website. And now, on to the 101 questions...
001. What's something that you enjoy about current bioinformatics research?
The data deluge! So much and so many kinds of biological data — ranging from all the versions of next-generation sequencing data to protein structures — it is such a gift. As computational biologists, we are in an unprecedented position to make new discoveries by mining this data, and we’re all having a ball.
010. What's something that you don't enjoy about current bioinformatics research?
I’m thinking hard to come up with something. One issue that has always puzzled me is why mainstream journals don’t recognise the value of pure theoretical and computational biology. The prediction of the structure of the double helix was recognised with a Nobel Prize, and celebrated more than the Franklin/Wilkins crystal structure. Predictions are generally given scant notice, and the experimental validation (often years later) is considered the key achievement. This strikes me as incongruous.
011. If you could go back in time and visit yourself as an 18-year-old, what single piece of advice would you give yourself to help your future bioinformatics career?
Take programming and computer science seriously, and get some formal training in it.
100. What's your all-time favorite piece of bioinformatics software, and why?
R came after my time as a hands-on researcher (I’m more of a 90s Perl girl) but it seems to have revolutionised how quickly people can implement methods and visualise data. I also like the fact that there are now notebook-style ways of documenting whole workflows in R and Python. These notebooks can be included as supplementary material in publications and should help in making analyses easily reproducible by others.
101. IUPAC describes a set of 18 single-character nucleotide codes that can represent a DNA base: which one best reflects your personality, and why?
Please can I choose three? A then U then G codes for "go bioinformatics" ☺
I came across this disturbing image on the web today. Warning, may cause offense:
Even more disturbing was the text that accompanied the image, text that appears on Microsoft Research's flickr account (emphasis mine):
The Microsoft Biology Initiative includes several Microsoft biology tools that enable biology and bioinformatics researchers to be more productive in making scientific discoveries. One such tool, the Microsoft Research Biology Extension for Excel, displays the contents of a FASTA file containing an Influenza A virus sequence. By importing FASTA data into Excel, researchers are better able to visualize and analyze information.
The point at which you want to import FASTA files into Excel is the point at which you should probably think about quitting bioinformatics.
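For comparison, reading a FASTA file takes only a few lines in a scripting language. The following is a minimal sketch in Python, assuming a plain-text FASTA file in which each record starts with a `>` header line; the function name `read_fasta` and the two example records are made up for illustration and are not real influenza data.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are joined, so wrapped sequences are handled.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical two-record input; a real file would be opened with open(path).
example = StringIO(">seq1 made-up record\nACGT\nACGT\n>seq2\nGGCC\n")
records = list(read_fasta(example))
for name, seq in records:
    print(name, len(seq))
```

Established libraries such as Biopython provide more robust parsers; the point here is only that the format is simple enough not to need a spreadsheet.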
This post is part of a series that interviews some notable bioinformaticians to get their views on various aspects of bioinformatics research. Hopefully these answers will prove useful to others in the field, especially to those who are just starting their bioinformatics careers.
Aaron Quinlan is an Associate Professor of Human Genetics and Biomedical Informatics at the University of Utah and the Associate Director of the USTAR Center for Genetic Discovery.
His research focuses on "developing and applying computational methods towards the understanding of genetic variation in diverse contexts". This work has led to Aaron's involvement in the development of many popular bioinformatics tools, with Bedtools being one of the best known. I wish he had time to blog more, because then we could all enjoy more writing like this:
Have you ever been incensed by the ridiculous number of chromosome naming and ordering schemes that exist in genomics? If the answer is “no”, then either you are an incredibly patient person, you enjoy unnecessary chaos, or you just haven’t done any detailed analysis of genomics datasets.
You can find out more about Aaron by visiting his lab's website, or by following him on twitter (@aaronquinlan). And now, on to the 101 questions...
001. What's something that you enjoy about current bioinformatics research?
I come from a creative family and have always enjoyed building things. There is pure joy in having the power to conceive and apply an algorithmic idea that has the potential to improve our understanding of the biology of the genome and the genetic basis of disease.
010. What's something that you don't enjoy about current bioinformatics research?
011. If you could go back in time and visit yourself as an 18-year-old, what single piece of advice would you give yourself to help your future bioinformatics career?
Take every math and statistics course possible and read constantly while you still have the time.
100. What's your all-time favorite piece of bioinformatics software, and why?
Without question, PolyBayes (Marth et al, 1999). I came to computational biology as a former software engineer without substantial training in biology. PolyBayes was the first Bayesian method for polymorphism detection and was written by my Ph.D. mentor, Gabor Marth. I spent much of my first year in graduate school dissecting the PolyBayes code (and the ACE file format!!!) to understand the mathematical and data analysis strategies that were required at the time. That learning process has influenced much of the work I have done since.
101. IUPAC describes a set of 18 single-character nucleotide codes that can represent a DNA base: which one best reflects your personality, and why?
N, since I constantly feel as though I am doing everything while also doing nothing.
DVD bonus materials
KRB: Because of the relative brevity of this interview, I thought that I would also share a couple of answers that Aaron gave me to some of the questions I also include when asking people to do these interviews (this info sometimes helps me write my introductions):
0111. What is the correct way of describing your current position or title(s)?
1001. In 1–2 sentences, describe what your role entails.
Basically doing everything I can to not be a bottleneck for the people in my lab.