Ewan Birney's EBI press conference on being elected to the Royal Society

Speaker: And that concludes this EBI press conference to congratulate Ewan Birney on being elected to the Royal Society. We just have time for one or two questions. Ah okay...the first question goes to…Ewan Birney.

Ewan: Hi Ewan. Just wanted to say that this is all great and I've found your work to be really interesting. Can I just ask whether you've looked at the opportunity of widening this effort by joining other Royal Societies as well? This would allow for a much better comparative analysis of the scope and impact of Royal Society members. The Royal Statistical Society may be a good choice to begin with, or maybe the Royal Society of Marine Artists.

Ewan: Thanks Ewan, that's a really good question. It is something that I'm considering and I think there is a lot to gain from such a comparative approach. But to do this properly I think it needs to be part of a much larger effort. So I'm hopeful of trying to join every Royal Society and then see what can be learned from a cross-societal analysis of such memberships. Furthermore I'm hopeful that Her Majesty could be persuaded to start a new Royal Society for the Promotion of Questions by People Named Ewan at Academic Conferences…something that is very near and dear to my heart.

Speaker: Okay, I think we have time for just one more question. Oh, Ewan…again.

Ewan: Just to follow up Ewan, given the advanced age of many Royal Society members, have you thought about trying to assess what fraction of the Royal Society is functional?

Ewan: That's a fantastic question Ewan, very perceptive of you. This is something else that I have a strong interest in. I am currently involved in some preliminary discussions with various people to form a new pan-European working group that will investigate how much of the Royal Society is functional. This effort will hopefully be called ENCODEMBLIXIR…or something snappy like that. 


Jesting aside, congratulations Ewan, this is great news!

101 questions with a bioinformatician #5: Laura Clarke

This post is part of a series that interviews some notable bioinformaticians to get their views on various aspects of bioinformatics research. Hopefully these answers will prove useful to others in the field, especially to those who are just starting their bioinformatics careers.

Laura Clarke is the Project Coordinator for Resequencing Informatics, part of the Vertebrate Genomics team led by Paul Flicek at EMBL-EBI. Before joining the EBI, she was applying her considerable bioinformatics skills at the Wellcome Trust Sanger Institute (a move ranked #1 on the annual list of Easiest-employers-to-transition-between). 

Her role sees her help with the analysis and coordination of high throughput genomics efforts such as the 1000 Genomes project, BLUEPRINT (deciphering the epigenome of blood cells), and HipSci (the Human Induced Pluripotent Stem Cells Initiative). If you're wondering what this actually entails, I'll hand you over to Laura:

"This work boils down to making sure that data gets into and out of the sequence archives; running primary analysis and QC; and then making sure the resulting analysis makes it out to the community".

You can find out more about Laura by following her on Twitter (@laurastephen), and of course you can also follow @blueprint_eu and @hipsci. And now, on to the 101 questions...



001. What's something that you enjoy about current bioinformatics research?

The possibility. With modern sequencing technologies, computational techniques have the ability to draw together these new data types and massive volumes of data, allowing us to get much closer to a proper understanding of cellular biology, which of course brings us closer to understanding organismal biology.

Add to that the diverse range of species being sequenced and what that can teach us about evolution and the forces which drive evolution.

That is of course before you consider how it might impact medicine or food security or any real world applications.


010. What's something that you *don't* enjoy about current bioinformatics research?

Extracting data from people. My life would be easier if people weren't so begrudging about sharing data, and about properly describing the data they do share. I work with many people who do share data freely and easily, but there are still too many who are reluctant to make data publicly available from within a consortium.


011. If you could go back in time and visit yourself as an 18 year old, what single piece of advice would you give yourself to help your future bioinformatics career?

For data coordination purposes we produce a lot of tab-delimited text files. The Unix command cut is wonderful for making those easier to work with and manipulate; learning about cut sooner would at least have made mucking about with various types of GFF files easier, I suspect.
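For anyone who hasn't met cut: it pulls selected columns out of delimited text (tab is the default delimiter), so something like `cut -f1,4,5` on a GFF file prints just the sequence name, start, and end columns. As a rough illustration of the idea, here is a small Python sketch. The function name `cut_fields` and the sample GFF-style records are made up for this example, and it only imitates the basic column-selection behavior, not every option of the real command.

```python
def cut_fields(lines, fields, delimiter="\t"):
    """Keep the given 1-based columns from each delimited line."""
    out = []
    for line in lines:
        cols = line.rstrip("\n").split(delimiter)
        # Skip any requested column that this line does not have.
        out.append(delimiter.join(cols[i - 1] for i in fields if i <= len(cols)))
    return out


# Hypothetical GFF-style records (9 tab-separated columns each).
gff = [
    "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene1",
    "chr1\texample\texon\t1000\t1200\t.\t+\t.\tParent=gene1",
]

if __name__ == "__main__":
    # Print sequence name, start, and end for each record.
    for row in cut_fields(gff, [1, 4, 5]):
        print(row)
```

In day-to-day work the one-line shell command is the whole point; the Python version is only here to show what is going on underneath.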


100. What's your all-time favorite piece of bioinformatics software, and why?

I have to say I did enjoy pairedends.com, very funny.


101. IUPAC describes a set of 18 single-character nucleotide codes that can represent a DNA base: which one best reflects your personality?

R: this is because Adenine and Guanine are the same molecule type (purines) as both Theobromine and Caffeine, both of which are quite important to me and at least influence my personality.

An 18 Kbp read from a MinION sequencer!

The UC Davis Genome Center was fortunate to receive a few MinIONs from Oxford Nanopore the other week:

One of the things that we have been trying to do with these wondrous machines is to study variation in a mixed pan-European population. For this study, we simply combined saliva samples from individuals that represent 32 distinct European ethnicities (but no Belgiums, obviously), and the combined sample was applied directly to the MinION using the WF10 setting (WF = warp factor).

The preliminary results look very promising with an N50 read length of 12.2 Kbp (and this was before applying N50 Booster!!!). Here is the very first read from the device...18,731 bp of pan-European goodness (though note that there was a problem with base quality at the end of the read...contamination with Belgium DNA maybe?).



ACGT...TGCA — has every possible DNA-based initialism been used by the bioinformatics/genomics community?


Short answer


Long answer…

You might work in a field that's related to biology, genetics, genomics, or bioinformatics. You might be working on a new piece of software, or a research proposal, or you need to form a committee. Maybe you have even been given the power to name a new research facility.

Suddenly you have an inspiration...why don't we name our new software, proposal, committee, or facility after a DNA-based initialism! That would be clever and make us stand out from the crowd, right? Maybe...maybe not.

What follows is a fairly exhaustive list of — presumably intentional — DNA-based initialisms that are in use (or have been used). As of 2020-07-20 the current list contains 67 names in total with all 24 possible combinations of [ACGT] being used. The additions since I first created this page are included at the end.
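The figure of 24 is simply the number of ways to order four distinct letters (4! = 24). A minimal Python sketch that enumerates them with the standard library is shown here; the function name `dna_initialisms` is just an illustrative choice.

```python
from itertools import permutations

BASES = "ACGT"


def dna_initialisms(bases: str = BASES) -> list[str]:
    """Return every ordering of the given letters, sorted alphabetically."""
    return sorted("".join(p) for p in permutations(bases))


if __name__ == "__main__":
    names = dna_initialisms()
    print(len(names))       # number of possible four-letter initialisms
    print(" ".join(names))  # ACGT first, TGCA last
```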

See also this related blog post by David Lawrence from 2014, which I only discovered in mid-2020. His post — which beat me to the punch by just a couple of weeks! — has provided me with a few additional examples which I hadn’t heard about and which have now been included here.

Please let me know of any errors or omissions, though note that potential names have to be initialisms and have to be somewhat related to the fields of genetics, genomics, or bioinformatics.


ACGT

  1. Advisory Committee on Genetic Testing — Committee — 1996
  2. Alliance for Cancer Gene Therapy — Research Network — 2001
  3. A Comparative Genomics Tool — Software — 2003
  4. Advancing Clinico-genomic Trials on Cancer — Research Project — 2011
  5. Algorithms in Computational Genomics at Tau — Lab web page — ???
  6. Advanced Center for Genome Technology — Research Center? — ???
  7. African Centre for Gene Technologies — Research Network — ???
  8. Applied Computational Genomics Team — Research Group — ???
  9. Amino aCids To Genome — Software — 2017
  10. Analysis of Czech Genomes for Theranostics — Research Project? — 2020?


ACTG

  1. Automatic Correspondence of Tags and Genes — Software — 2007


AGCT

  1. Applied Genomics & Cancer Therapeutics — Research Program? — ???


AGTC

  1. Applied Genomics Technology Center — Core Facility? — 1998
  2. Advanced Genome Technologies Core — Core Facility — ???
  3. University of Kentucky Advanced Genetic Technologies Center — Core Facility (now defunct?) — ???


ATCG

  1. Applied Technology in Conservation Genetics — Research Lab — ???


ATGC

  1. Arabidopsis Thaliana Genome Center — Core Facility? — 2000?
  2. Another Tool for Genome Comparison — Software — 2001
  3. Advanced Thermal Gradient dna Chip — Patent — 2002
  4. Another Tool for Genomic Comprehension — Database & web tool — 2012
  5. Alignable Tight Genomic Clusters - Database - 2009


CAGT

  1. Center for Advanced Genomic Technology — Research Facility — 2000?
  2. Center for Applied Genetics and Technology — Research Facility — 2004
  3. Center for Applied Genetic Technologies — Research Facility — ???
  4. Clustering AGgregation Tool — Software — 2012?


CATG

  1. Cross-legume Advances Through Genomics — Conference — 2004?
  2. Center for Advanced Technologies in Genomics — Research Facility — 2008


CGAT

  1. Comparative Genome Analysis Tool — Software — 2006
  2. Computational Genomics Analysis and Training — Training program — 2010
  3. Computational Genomics Analysis Toolkit — Software — 2013
  4. Centre for Gene Analysis and Technology — Research Facility — ???
  5. Canadian Genome Analysis and Technology program — Research program (now defunct) — 1992


CGTA

  1. CNS Gene therapy Translation Acceleration - Research Group - ???


CTAG

  1. Corn Transcriptome Analysis Group — Working Group — 2014
  2. Canadian Triticum Advancement Through Genomics - Research project - 2011


CTGA

  1. the Catalogue for Transmission Genetics in Arabs — Database — 2006


GACT

  1. The Center for Genetic Architecture of Complex Traits - Research Center - 2013


GATC

  1. Genetic Analysis Technology Consortium — Biotech Consortium (now defunct?) — circa 1997?


GCAT

  1. Genome Comparison & Analytic Testing — Software? — ???
  2. Genome Consortium for Active Teaching — Teaching Consortium — 2007?
  3. Gene-set Cohesion Analysis Tool — Software — 2011 (or 2007)
  4. Genotype-Conditional Association Test — Statistical method — 2015
  5. Genomics, Computational biology And Technology - study section - ???


GCTA

  1. Genome-wide Complex Trait Analysis — Software — 2011


GTAC

  1. Gene Technology Access Center — Teaching Facility — 2000
  2. Genomics Technology Access Center — Core Facility — 2009?
  3. Genome Technology Access Center — Core Facility — 2010
  4. Genomics/Transcriptomics Analysis Core — Core Facility — ???
  5. Genomes and Transcriptomes of Arctic Chromists — Research Program — 2012
  6. Gene Technology Advisory Committee — Government Committee — ???


GTCA

  1. Genomic Tetranucleotide Composition Analysis — Database — 2006
  2. Genome Transcriptome Correlation Analysis — Software — 2007


TACG

  1. Talking About Computing and Genomics — Workshop — 2013


TAGC

  1. The Applied Genomics Core — Core Facility — 1998
  2. The Ashkenazi Genome Consortium — Consortium — 2012
  3. Technological Advances for Genomics and Clinics — Research Lab/Program? — ???
  4. The Arts & Genomics Centre — An Arts/Science Center — ???
  5. The Allied Genetics Conference — Conference — 2016?
  6. Taxon-Annotated GC plots — software visualisation method/tool — 2013


TCAG

  1. The Centre for Applied Genomics — Research Facility — 2007?
  2. The Center for the Advancement of Genomics — Research Facility (superseded by this) — ???


TCGA

  1. The Centre for Genetic Anthropology — Research Facility — 1996
  2. The Tayside Centre for Genomic Analysis — Core facility — 2001 (?)
  3. The Center for Genomic Application — Core Facility — 2004
  4. The Cancer Genome Atlas — Research Program — 2006


TGAC

  1. The Genome Access Course — Training Course — 2002
  2. The Genome Analysis Center — Research Facility — 2009


TGCA

  1. The Genome Counselling App — iOS Application — 2014


  • 2020-08-20 Added 5th example of ATGC, 3rd example of AGTC, 2nd example of CTAG, and 4th example of GCAT (all courtesy of David Lawrence)

  • 2020-07-18 Added 10th example of ACGT

  • 2019-07-23 Added 9th example of ACGT (thanks to Sam Lent @samanthalent)

  • 2016-09-03 Added 4th example of TCGA (thanks to @malcolmacaulay)

  • 2016-02-16 Added 6th example of TAGC

  • 2015-09-11 - Added 5th example of TAGC

  • 2015-07-06 - Added 8th example of ACGT

  • 2015-04-06 - Added 4th example of GCTA (thanks to John Didion)

  • 2014-12-12 - Added first usage of TACG (thanks to @NazeefaFatima)

  • 2014-04-25 - Added Jeff Ross-Ibarra's planned use of CTAG

  • 2014-04-25 - Included a second instance of AGTC

  • 2014-05-18 - Included a fourth example of TAGC

  • 2014-09-08 - Included first usage of CGTA, GACT, and TGCA