Friday, 28 November 2014

New paper: bacterial phylogenetic inference is robust to recombination but demographic inference is not

Published this week in mBio, Jessica Hedge's new paper "Bacterial phylogenetic inference is robust to recombination but demographic inference is not" looks at a long-standing problem: why are phylogenetic trees so popular in bacterial genomics when everyone knows recombination (which is detectable in most species studied) leads to seriously misleading inference? A burst of research activity in the early 2000s showed that homologous recombination - which can result from various forms of horizontal gene transfer in bacteria - can distort phylogenetic trees and lead to false inference of positive selection and demographic growth in methods that rely on them.

In the intervening years there has been intense research in the field of population genetics into approaches that account for recombination, although the practically useful methods rely on approximations because of the inherent difficulties of learning about complex reticulated evolutionary networks that recombination generates. This has led many of my population genetics colleagues to regard - at least privately - the use of phylogenetic trees in recombining species as "bust", and the conclusions drawn from such studies as questionable. In this paper we show that this view is too simple.

FIG 1 

Friday, 6 June 2014

Cheltenham Science Festival

Earlier this week members of the group represented the Nuffield Department of Medicine at the Cheltenham Science Festival with our Modernising Medical Microbiology stall, featuring the Antibiotic Resistance Coconut Shy and the Genome Evolution Dance Mat.

Antibiotic Resistance Coconut Shy: The children (and adults) visiting the stall were given five bean bags (antibiotics) to throw at the coconuts (bacterial pathogens) to try to knock them off. The front row of coconuts, representing bacteria more susceptible to antibiotics, was easier to knock off than the back row, which represented more resistant bacteria. The aim was to show the children that an unwanted side effect of using antibiotics is to increase the frequency of resistant bacteria, because the resistant ones were usually the ones left standing.

The game was more difficult than it looks, and just one visitor knocked off all five coconuts. We gave out NDM pens to the sixty visitors who managed to knock off three or more.

Digital Microscope: We brought along a light microscope to show the children what bacteria really look like, which helps emphasize how small they are since they are difficult to see even under the highest magnification. We prepared slides for several Gram positive and Gram negative species, and provided a key to help identify them. We also brought along a number of games that have been used in previous departmental outreach activities, including Pathogen Top Trumps and Fact or Fiction.

Genome Evolution Dance Mat: In this game, the children had to copy a bacterial DNA sequence by replicating a sequence of dance moves (up=A, left=C, right=G, down=T) without introducing new errors (mutations). Any mutations that were introduced were passed on to the next template sequence. In this way we aimed to show how mutations occur by errors in DNA replication, and that they are inherited. This generates unique DNA fingerprints for bacteria, which we can use to track the spread of outbreaks.

The game, which was kindly programmed by Gareth Jenkin-Jones, included a form of natural selection, so that if too many errors were introduced at once, the sequence was considered inviable and did not survive to be passed on. There was also a speed control, which was handy since some people appear to have spent a lot more of their youth playing dance mats than others.
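For readers who would like to play with the idea away from the dance mat, here is a minimal Python sketch of the copying process described above. It is not Gareth's program: the function names, the per-base error rate and the viability threshold (max_new_errors) are all assumptions made for illustration.

```python
import random

BASES = "ACGT"  # on the mat: up=A, left=C, right=G, down=T


def copy_with_errors(template, error_rate, rng):
    """Copy a sequence base by base, miscopying each base with probability error_rate."""
    copied = []
    for base in template:
        if rng.random() < error_rate:
            # A mutation: the dancer steps on one of the three wrong arrows.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def count_differences(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def play_rounds(ancestor, rounds, error_rate, max_new_errors, seed=0):
    """Return the template seen at the start and after each round of copying.

    Hypothetical selection rule: a copy carrying more than max_new_errors
    new mutations is inviable and discarded, so the old template is reused.
    """
    rng = random.Random(seed)
    template = ancestor
    lineage = [template]
    for _ in range(rounds):
        attempt = copy_with_errors(template, error_rate, rng)
        if count_differences(template, attempt) <= max_new_errors:
            template = attempt  # mutations are inherited by the next dancer
        lineage.append(template)
    return lineage


if __name__ == "__main__":
    for generation, seq in enumerate(play_rounds("ACGTACGTACGT", 10, 0.1, 2)):
        print(generation, seq)
```

Because each surviving copy becomes the next template, differences from the ancestor accumulate along the lineage, which is the "DNA fingerprint" idea in miniature.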

Outbreak Map: We made an Outbreak Map to show the reach of our stall over the day, with visitors that scored highly on the coconut shy pushing in pins to show where they had travelled from. Had we been handing out germs instead of pens, we could have started outbreaks as far afield as Edinburgh, France and Spain, as well as a large cluster in Cheltenham and the surrounding counties.

Other research groups are representing the department throughout the week.

NDM Microbiology Stall at the Cheltenham Science Festival (L-R): Sarah Earle, Louise Pankhurst, Danny Wilson, Liz Batty, Dilrini De Silva, Jess Hedge, Catrin Moore. Amy Mason, Gareth Jenkin-Jones and Jane Charlesworth also helped with the preparations, and Jen Bardsley co-ordinated all the NDM Stalls.

New paper: Mobile elements drive recombination hotspots in the core genome of Staphylococcus aureus

Published this week in Nature Communications, our new open access paper looks at what drives variability in rates of recombination (horizontal gene transfer, HGT) in the core genome of Staphylococcus aureus. HGT in the core genome is important for eliminating harmful mutations and promoting the spread of beneficial mutations, such as those that make the bacteria resistant to antibiotics.

Compared to recent work focusing on individual, highly-related strains of S. aureus, we found much higher rates of core HGT across the species as a whole. We saw that the frequency of HGT varies along the genome. At broad scales, core HGT is higher near the origin of replication, a pattern reminiscent of the one described by Eduardo Rocha and colleagues in E. coli, who hypothesized that the over-abundance of DNA near the origin during rapid growth could promote HGT.

At fine scales, we found more frequent HGT in regions of the core genome close to mobile elements. The hottest regions occurred near mobile regions called ICE6013, SCC and genomic island α. The insertion and excision of mobile elements from the genome represents a type of HGT, so our finding that nearby core regions also experience more HGT suggests there is some sort of "spill over". This idea is supported by work in Ashley Robinson's group that found similarities between ICE6013 and a class of mobile elements in Streptococcus agalactiae called TnGBS2. TnGBS2 was discovered by Philippe Glaser's lab, who showed it sometimes transfers large tracts of adjacent core material during conjugation.

Whether conjugation alone can explain the high levels of core HGT we saw in S. aureus is unclear - our results suggest there is detectable HGT even in core regions far from mobile elements. Transformation is another possible mechanism of core HGT, but S. aureus is generally thought to be naturally incapable of transformation. However, intriguing work published by Tarek Msadek and colleagues in 2012 indicates there may be cryptic mechanisms of transformation in S. aureus after all. It remains to be seen whether the relative contributions of transformation, transduction and conjugation to the long-term evolution of S. aureus can be disentangled.

Tuesday, 1 October 2013

The role of hospital transmission in Clostridium difficile infection

This week the Modernising Medical Microbiology consortium at Oxford published the findings of a six-year study into the transmission of the hospital "superbug" Clostridium difficile. The research, which appears in the New England Journal of Medicine, shows that the majority of new cases cannot be traced to other infections in hospital, and indicates instead that there must be a large, as yet unidentified, reservoir of C. difficile infectious to humans. This finding is important because it suggests that there is a limit to the effect that ever more intense hospital cleaning - important though it has been - can continue to have in reducing C. difficile infection.

The research, which is the result of a tireless effort by a large number of my colleagues - notably David Eyre, Tim Peto and Sarah Walker - used bacterial whole genome sequencing to detect within-hospital transmission by searching for extremely closely related bacterial strains among more than 1200 cases of C. difficile infection that occurred in Oxfordshire between September 2007 and March 2011. The consortium is currently developing the approach for routine microbiology diagnostics and infection control, with a view to eventual roll-out across the NHS.
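To give a flavour of what "searching for extremely closely related bacterial strains" means in practice, here is a minimal Python sketch that flags pairs of cases whose genomes differ at only a handful of positions. It is an illustration only, not the pipeline used in the study: it assumes the genomes are already aligned to a common reference, and the function names and the max_snps threshold are hypothetical.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has an uncalled base ('N') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for a, b in zip(seq_a, seq_b)
        if a != b and a != "N" and b != "N"
    )


def candidate_transmissions(genomes, max_snps):
    """Return (case_a, case_b, distance) for every pair within max_snps.

    genomes maps a case identifier to its aligned sequence. max_snps is a
    hypothetical cut-off chosen by the user, not a value from the study.
    """
    pairs = []
    for (id_a, seq_a), (id_b, seq_b) in combinations(sorted(genomes.items()), 2):
        distance = snp_distance(seq_a, seq_b)
        if distance <= max_snps:
            pairs.append((id_a, id_b, distance))
    return pairs


if __name__ == "__main__":
    toy_genomes = {
        "case1": "ACGTACGT",
        "case2": "ACGTACGA",
        "case3": "TTGTACCA",
    }
    print(candidate_transmissions(toy_genomes, max_snps=2))
```

A close genetic match on its own only makes a pair a candidate; deciding whether transmission is plausible also needs information about when and where the patients could have been in contact.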

Friday, 20 September 2013

Postdoctoral Position in Statistical Genomics

The position of Postdoctoral Scientist is available in my group to lead research on the Wellcome Trust and Royal Society funded project Statistical Methods for Whole Genome Phenotype Mapping in Bacterial Populations.

Bacteria cause disease throughout the world. Different strains vary in disease severity, but the genetic variants responsible remain largely undiscovered. Recent breakthroughs in whole genome sequencing provide new opportunities for discovery, but the lack of statistical analysis tools tailored to the special structure of bacterial populations presents a roadblock. The goal of the project is to develop an analysis framework for mapping genes underlying naturally variable traits in bacterial populations. Focusing on the hospital-associated pathogens Staphylococcus aureus and Clostridium difficile, we will investigate the role of bacterial variants on disease severity.

The role of the Postdoctoral Scientist is to develop novel statistical methods for analysing genotype-phenotype associations in bacteria at the whole genome level. The successful candidate will write software implementing the statistical methods and apply them to design and carry out investigations into the genetic basis of virulence in natural populations of bacterial pathogens. The ideal candidate would be a recently graduating PhD student with experience of statistical genetics and computer programming, with evidence of publicly released software. Experience of population genetics or microbiology would be advantageous but is not essential.

The post is available immediately, for up to 3 years in the first instance. For more details on this position, including salary, job description, selection criteria and how to apply, please see the University of Oxford recruitment page.

Applications for this vacancy are to be made online. The closing date is 12.00 noon on Monday 4 November 2013. Applicants will be asked to upload a CV and a supporting statement as part of the online application. For informal enquiries, please email me. More information about the group's research is available here.

Tuesday, 17 September 2013

Sir Henry Dale Fellowship

I am pleased to report that I have been awarded a Wellcome Trust and Royal Society funded Sir Henry Dale Fellowship. The subject of the fellowship, to be held in the Nuffield Department of Medicine at the University of Oxford, is Statistical Methods for Whole Genome Phenotype Mapping in Bacterial Populations.

The project addresses the question of how to detect genes or mutations in bacteria responsible for variability in important traits such as the tendency to cause human disease. Focusing on the hospital-associated pathogens Staphylococcus aureus and Clostridium difficile, the project has the potential to help identify genetic variants that explain why some bacteria cause more severe infections, knowledge that could help develop new drugs and tests that improve patient treatment.

The fellowship runs for five years, and includes support for a postdoctoral research assistant and laboratory costs. I will be advertising a position shortly. If you are interested, please get in touch.

I want to thank the funders and reviewers for supporting this project, and my colleagues who helped me write and re-write the research proposal.

Thursday, 6 June 2013

Detecting mixed strain infections with whole genome sequencing

Whole genome sequencing in near-to-real time is set to become a routine tool for outbreak detection by hospital and public health microbiology labs, following successful pilot studies in the UK last year. Typically, the bacteria are cultured from a clinical sample, and a single colony is picked for sequencing. Since a bacterial colony grows from a single cell, this procedure ensures that all the cells picked for sequencing are genetically identical, and this in turn helps piece the genome back together again following sequencing.

But it exposes the system to a flaw. What would happen if a patient sick with two strains transmitted one, but not the other to a second patient? Characterizing the genome of just one of the strains in the first patient risks missing the transmission event entirely, because the "wrong" strain might have been sequenced.

One safeguard would be to sequence multiple bacterial colonies per sample, three for example. But this would increase the cost of routine surveillance three-fold.

In a new paper published this month in PLoS Computational Biology, with David Eyre, Madeleine Cule, Sarah Walker and others, we have investigated an alternative solution, whereby a large number of colonies are sequenced together. The cost is the same as that of sequencing a single colony. But the downstream bioinformatics analysis is complicated considerably by the presence of multiple strains. To cope with this, we developed a new computational method that reconstructs the identities of the multiple strains, using a panel of reference genomes to help where possible.

By applying the approach to 26 clinical samples of Clostridium difficile hospital infections with known epidemiological relationships, we detected four mixed strain infections, one of which revealed a previously undetected transmission event within the hospital. For full details, read the open access paper.

Wednesday, 22 May 2013

Within-host evolution of Staphylococcus aureus during asymptomatic carriage

Given its notoriety as one of the world's major causes of infection-related deaths, it may come as a surprise that one in three healthy adults carry the human pathogen Staphylococcus aureus in their noses without adverse effects. Indeed, most people carry the bacteria at some point in their lives. So carriage must be seen as the normal state of affairs in the human-S. aureus interaction, and by understanding this state better we can improve our understanding of why, in some people, the bacteria go on to cause life-threatening invasive disease.

This month sees publication of an investigation by my colleagues and me into the evolution of S. aureus during this normal healthy carriage state. The carriers in our study harboured populations of the bacteria that were very closely related but typically not identical, implying that the bacteria had evolved within the human body. The nose appears to be a microcosm of evolution for S. aureus, showing all the different types of genetic variation known at the species level within the noses of these individual carriers. For the most part, within-host evolution of the bacteria was very conservative, but certain proteins expressed on the surface of the bacteria and toxins secreted by the bacteria showed evidence of involvement in a host-pathogen arms race.

The paper, whose lead authors include Tanya Golubchik, Liz Batty, Derrick Crook and Rory Bowden, has received coverage on the EveryONE blog and F1000. I liked Gerald Pier's conclusion, made on the post-publication peer review website: "Given that about 30% of the world's seven billion-plus humans, and an unknown number of animals, are chronically colonized with S. aureus, the tremendous opportunity provided to this organism for generating genetic variation to counteract human efforts to prevent S. aureus infections may be one of the most formidable barriers to overcome in order to develop vaccines and highly effective interventions to lessen the impact of this organism on human and animal health."

Monday, 4 February 2013

Coalescent inference for infectious disease

Today my student Bethany Dearlove has her first paper published, called Coalescent inference for infectious disease: meta-analysis of hepatitis C. In this paper, published in Philosophical Transactions of the Royal Society B, we have developed coalescent-based population genetics methods for popular, deterministic, epidemiological models known as SI (susceptible-infectious), SIS (susceptible-infectious-susceptible) and SIR (susceptible-infectious-recovered). By implementing these methods in BEAST, we were able to re-analyse previously published hepatitis C virus datasets and directly estimate epidemiological parameters. Our results show that, in the absence of co-infection, the widely-used exponential growth and logistic growth models of changing population size correspond directly to SI and SIS dynamics. We were also able to examine the limitations to genetic approaches to reconstructing epidemiological dynamics.
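As a rough illustration of why logistic growth and SIS dynamics go hand in hand, here is a short worked derivation. It uses a standard textbook parameterization that is my assumption here, not notation taken from the paper: a closed population of size N, transmission rate β, recovery rate γ, and I(t) infected individuals. It concerns the deterministic number infected only, not the coalescent model itself.

```latex
% SIS model in a closed population: S = N - I.
\frac{dI}{dt} = \beta \, I \, \frac{N - I}{N} - \gamma I
             = I \left[ (\beta - \gamma) - \frac{\beta I}{N} \right]
             = r \, I \left( 1 - \frac{I}{K} \right),
\qquad r = \beta - \gamma, \quad K = N \left( 1 - \frac{\gamma}{\beta} \right).

% Assumption: susceptibles are not appreciably depleted (S \approx N)
% and there is no recovery.
\frac{dI}{dt} \approx \beta I
\quad \Longrightarrow \quad
I(t) \approx I(0) \, e^{\beta t}.
```

The first line is the logistic equation with growth rate r and carrying capacity K (which requires β > γ); the second shows how exponential growth arises when the susceptible pool is effectively unlimited.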

This paper appears as part of an issue on Next-generation molecular and evolutionary epidemiology of infectious disease, which accompanies a Royal Society discussion meeting organized by Oli Pybus, Christophe Fraser and Andrew Rambaut. The Royal Society has made audio recordings of the talks at this meeting, and the accompanying satellite meeting, available online, including my talk on Bethany's paper.

Thursday, 15 November 2012

Postdoctoral Positions in Pathogen Genomics

These positions are now closed. There are currently seven posts advertised to join the Pathogen Genomics group at the Nuffield Department of Medicine in Oxford. Prof Derrick Crook and colleagues are seeking exceptional, creative, quantitatively minded scientists to join a multidisciplinary team of researchers using population genomics to understand the evolution and transmission of human pathogens. We are seeking to appoint a number of promising young researchers to extend our existing strengths in the areas of phylogenomics, statistical genetics and bioinformatics.

The group is studying a range of bacterial and viral pathogens including tuberculosis, Staphylococcus aureus, Clostridium difficile, HIV, norovirus and hepatitis C virus. Our research interests include within-host evolution, the genetic basis of virulence, transmission dynamics and outbreak investigation via real-time genomics.

A major translational goal of the project is to exploit the transformative effect of population genomics on bacteriology to improve routine clinical practice in public health and microbiology laboratories.

The research is supported by the UKCRC Modernising Medical Microbiology Consortium, the Health Innovation Challenge Fund, the NHS National Institute for Health Research, the Oxford Biomedical Research Centre, Institut Merieux and the Oxford Martin School, and pursued in collaboration with clinical colleagues in Leeds, Birmingham and Brighton, the Health Protection Agency and the WTSI.
The deadline for applications varies by position, between 26-28 November 2012.
For examples of recent papers see:

For more information visit:

Monday, 5 November 2012

James Martin Fellowship

This position is now closed. A prestigious James Martin Fellowship funded by the Oxford Martin School is available in my research group for a highly motivated and creative population geneticist interested in developing cutting edge methods for the analysis of high-throughput whole genome sequencing data to better understand the evolution and epidemiology of the major pathogens HIV and Hepatitis C Virus.

The position, which is part of the Curing Chronic Viral Infections project, is fully funded for three years and is affiliated with the Institute for Emerging Infections, the Modernising Medical Microbiology consortium, the Peter Medawar Building for Pathogen Research and the Nuffield Department of Medicine. The ideal candidate will have a track record in statistical or computational genetics and experience of programming in a language such as C++ or Java.

Full details can be found on the University of Oxford Recruitment website. Please send informal enquiries, with a CV, to me by email. The deadline for applications is 12 noon on 27th November 2012.

Friday, 7 September 2012

PLoS Pathogens Review Published!

Published today in PLoS Pathogens:

Thursday, 16 August 2012

gammaMap available for download

The software gammaMap - which implements the analyses developed in Wilson, Hernandez, Andolfatto and Przeworski (2011) PLoS Genetics 7: e1002395 - is available for download. It is provided as part of a flexible program called GCAT (general computational analysis tool) which is designed to rapidly facilitate novel variations on the standard analyses. GCAT has its own Google Code page. It resembles BEAST and BUGS in that a statistical model is specified (using XML) and parameters are then estimated using MCMC or maximum likelihood. Future extensions to GCAT are planned that implement new fast approximations to gammaMap and omegaMap, and parallel processing, allowing the analyses to be scaled more readily to whole genomes.

Monday, 13 August 2012

Nature Reviews Genetics: Transforming Clinical Microbiology

My colleagues Xavier Didelot, Rory Bowden, Tim Peto, Derrick Crook and I have just published a review online ahead of print in Nature Reviews Genetics called Transforming clinical microbiology with bacterial genome sequencing.

You might also be interested to read a similarly themed review recently published by our friends at the University of Cambridge and Wellcome Trust Sanger Institute in PLoS Pathogens titled Routine use of microbial whole genome sequencing in diagnostic and public health microbiology.

These review articles follow hot on the heels of a pair of research articles published by our two groups: A pilot study of rapid benchtop sequencing of Staphylococcus aureus and Clostridium difficile for outbreak detection and surveillance in BMJ Open and Rapid whole-genome sequencing for investigation of a neonatal MRSA outbreak in the New England Journal of Medicine. The common thread is the impact of near-to-real-time whole genome sequencing on outbreak detection and other translational activities in hospitals and public health laboratories.

Friday, 20 July 2012

Post-doc Positions in Pathogen Genomics

Post-doc positions in Pathogen Genomics are available in my group and Derrick Crook's lab. We will be hiring people to work on pathogen whole genome sequence analysis and bioinformatics. More details available soon. In the meantime, find out about our research:
If you are interested, please get in touch.

Monday, 5 March 2012

PNAS paper on staphylococcal evolution during infection

Today in PNAS Early Edition my colleagues and I have a paper published reporting the genome evolution of Staphylococcus aureus during the transition from prolonged nasal carriage to invasive disease. Since Staph. aureus, a major bacterial cause of life-threatening infections, is carried without symptoms by a quarter of healthy adults, a natural question is what genetic changes - if any - accompany the transition to invasive disease. The opportunity to pursue this question arose from a detailed epidemiological investigation of asymptomatic Staph. aureus nasal carriage set up by colleagues of mine including Derrick Crook and Kyle Knox. The study has recruited over 1,000 participants in Oxfordshire since it began running in October 2008. One participant developed a bloodstream infection that was indistinguishable from the strain of Staph. aureus persistently carried in the nose for the previous 13 months. Members of the Modernising Medical Microbiology consortium, led by Derrick and Rory Bowden, sequenced the genomes of 68 bacterial colonies isolated from the nasal and blood samples from this participant, and 101 colonies from nasal samples from two other participants who did not go on to develop disease. Bernadette Young and Tanya Golubchik analyzed the genome evolution of these bacterial populations, discovering an unusual pattern in the mutations that occurred between nasal carriage and invasive disease: mutations that led to prematurely truncated proteins were significantly over-represented, including one in a gene previously associated with virulence in bacteria. To know more, read the full open access article.

Wednesday, 11 January 2012

SMBE 2012: Microbial Genome Evolution Symposium

Along with my colleagues Xavier Didelot, Ed Feil, Eduardo Rocha and Howard Ochman, I will be organizing a symposium on Microbial Genome Evolution at the 2012 meeting of the Society for Molecular Biology and Evolution in Dublin, Ireland. The deadline for abstract submission is 27th January 2012. This is the synopsis for our symposium:
High-throughput sequencing makes it possible for the first time to sequence hundreds of microbial genomes rapidly at low cost. These methods have huge potential to significantly improve our understanding of microbial evolution, so that many research projects have recently been set up to generate and analyze such data. This symposium will provide an overview of the progress made by such projects, as well as the many challenges they pose. It is now possible to identify the vast majority of SNPs within large population samples of microbial isolates. These datasets are illuminating the molecular, ecological and population-level dynamic processes occurring over short time scales in natural populations inhabiting a range of habitats from the clinic to the environment. We aim to explore these recent advances and the development of new methods of analyses required to fully exploit these extremely large sequence datasets. Relevant topics include quantifying the variation in the rates of recombination and mutation between closely related lineages, the evolution of base composition, the relative power of drift and selection, examining the acquisition of adaptive traits (e.g. antibiotic resistance, host adaptation, metabolic flexibility, regulatory changes) within a phylogenetic framework, and the distribution of variation over time and space (phylogeography). The role of phage and conjugative elements in structuring populations as both vehicles for gene flow and parasitic elements will also be considered. The symposium will focus on variation within natural populations rather than experimental evolution.

Friday, 2 December 2011

New method inferring natural selection published today

I am pleased to report that my new paper "A population genetics-phylogenetics approach to inferring natural selection" is published today in PLoS Genetics. This is the culmination of two years' work at the University of Chicago with Molly Przeworski, plus a good deal of follow-up since I moved to Oxford. In the paper we introduce a new way of combining population genetics and phylogenetics models of natural selection, and a statistical method (gammaMap) for estimating parameters under the model. From a collection of sequences within one or more species - in the paper, we use 100 X-linked coding sequences that Peter Andolfatto produced in Drosophila melanogaster and D. simulans - the method allows you to estimate the distribution of fitness effects within each lineage, and localize the signal of selection using a Bayesian sliding window approach. Using Ryan Hernandez's simulator SFSCODE we tested the method for robustness to demographic change and linkage disequilibrium, and we investigated the effect that common assumptions concerning spatial variation in selection coefficients (sitewise, genewise and sliding window approaches) have on inference of selection. During the winter break I will work on compiling the program for different platforms and writing the documentation, with a view to releasing the software early in the New Year. Subscribe to this blog for updates or - if you are too impatient to wait - send me an email.

Saturday, 1 January 2011

Group Member Profiles Updated

Richard Everitt and Bethany Dearlove, respectively a postdoctoral scientist and a D.Phil. student in my lab, have posted their profiles and research interests to my website. Both joined in October, Richard from the University of Bristol where he was Brunel Fellow in Statistics and Bethany from the University of Reading where she read for a master's in Biometry.

Richard is investigating patterns of genetic diversity and linkage disequilibrium in Staphylococcus aureus, while Bethany is studying the transmission dynamics of norovirus using population genetics and epidemiological modelling. Both are funded jointly by the UKCRC project Modernising Medical Microbiology and the Nuffield Department of Clinical Medicine. For more information, see their individual profiles.

Friday, 1 October 2010

Geographical differences in transmission revealed by cryptic population structure

Two papers that I co-authored with colleagues at Lancaster and Massey Universities appear this month in the October 2010 issue of Epidemiology & Infection. The common theme is that cryptic differences in the population structure of the enteric pathogen Campylobacter jejuni, revealed by my method for attributing cases to source populations, suggest subtle differences in transmission between rural and urban districts.
The method, implemented in the software iSource (available on my website), allows strains of campylobacter to be characterized as poultry- or cattle-associated based on their genetic profiles. Interestingly, when the relative incidence of poultry- and cattle-associated strains is plotted on a map, there is a significantly higher occurrence of poultry-related disease in urban areas and cattle-related disease in rural areas. Both studies – one in Lancashire led by Edith Gabriel and one in New Zealand led by Petra Mullner – draw the same conclusion. These findings imply that there are subtle differences in transmission in rural and urban areas. Whether they represent geographical differences in the profile of food pathogens, environmental exposure, resistance to infection or other risk factors is not understood.
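To illustrate the general idea of attributing a case to a source from its genetic profile, here is a deliberately naive Python sketch. It is not the model implemented in iSource: it assumes loci are independent, gives every source equal prior weight, and smooths allele frequencies with a pseudocount. All names and the toy profiles are hypothetical.

```python
def attribute(case, source_profiles, pseudocount=1.0):
    """Return the posterior probability of each source for one case.

    case is a tuple of allele numbers, one per locus. source_profiles maps
    a source name to a list of such tuples sampled from that source.
    Assumptions (mine, for illustration): independent loci, equal priors.
    """
    n_loci = len(case)

    # Alleles observed at each locus, across all sources and the case itself.
    alleles = [{case[locus]} for locus in range(n_loci)]
    for profiles in source_profiles.values():
        for profile in profiles:
            for locus in range(n_loci):
                alleles[locus].add(profile[locus])

    weights = {}
    for source, profiles in source_profiles.items():
        likelihood = 1.0
        for locus in range(n_loci):
            matches = sum(1 for p in profiles if p[locus] == case[locus])
            # Smoothed frequency of the case's allele in this source.
            likelihood *= (matches + pseudocount) / (
                len(profiles) + pseudocount * len(alleles[locus])
            )
        weights[source] = likelihood

    total = sum(weights.values())
    return {source: weight / total for source, weight in weights.items()}


if __name__ == "__main__":
    toy_sources = {
        "poultry": [(1, 2), (1, 2), (1, 3)],
        "cattle": [(4, 5), (4, 5), (1, 5)],
    }
    print(attribute((1, 2), toy_sources))
```

Cases labelled in this way could then be plotted by home location to look for the kind of urban and rural contrast described above.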

Saturday, 18 September 2010

Evolutionary Genetics for Translational Research

This month saw the 2010 Infectious Disease Genomics & Global Health meeting at Hinxton, which attracted a good number of people involved in the Modernising Medical Microbiology consortium, of which I am a participant. Rory Bowden and Rosalind Harding presented our group's progress on piecing together intra-host evolution of Staphylococcus aureus and reconstructing transmission chains in Clostridium difficile. My role in the projects has so far been one of assisting in ongoing evolutionary analyses and collaborating in the design of bioinformatics pipelines to make sense of the raw Illumina short-read sequencing data. At the same time I have been devising research plans for my own group, and spending time in the lab preparing sequencing experiments with Bernadette Young. In the poster I presented at Hinxton (available here), and at an internal talk I gave earlier in the year (slides here) I set out what I see as the strengths of Evolutionary Genetics for addressing translational medical problems including
  • Tracking the transmission of hospital-acquired pathogens
  • Understanding transmission dynamics at the population level
  • Identifying the mechanistic and adaptive basis of disease
  • Explaining how pathogens emerge, persist and spread globally
Of the many stimulating talks at the Hinxton conference, those by Dominic Kwiatkowski on the population genomics of Plasmodium falciparum, Christophe Fraser on "hyper-recombination" in Streptococcus pneumoniae and Paul Keim on the challenges for understanding the population genetics of non-clonal bacterial pathogens particularly interested me. Prof Keim gave an equally captivating talk the following day at the Health Protection 2010 meeting in Warwick on his microbial forensics work tracing the origin of Bacillus anthracis spores used in bioterrorism attacks. What I especially admired about his presentations was the dogged pursuit of new methods and ways of thinking in order to better address the biological questions at hand.

Saturday, 10 July 2010

What are the conditions for multiple foci of adaptation?

Selection on standing variation, soft sweeps, parallel adaptation: these alternatives to the population genetics paradigm of the S-shaped selective sweep have in common the idea that the response of a species to a change in selection pressure may frequently involve multiple mutations, which may arise in multiple locales, and which may appear at different sites in the genome. Consequently, the footprint of selection in the genome is different to that expected under a single selective sweep and therefore likely to be missed by scans of the genome looking for selection.

Many examples of parallel adaptation have been put forward, for instance multiple drug resistance in the malaria parasite Plasmodium vivax. But how plausible is parallel adaptation as an evolutionary mechanism, and what are the conditions that make it likely? These questions were addressed by Graham Coop presenting joint work with his postdoc Peter Ralph in one of the stand-out talks of the SMBE conference in Lyon.

Their key finding is that the multifarious parameters that go into building a spatial model of adaptation (strength of selection, the mutation rate, population density, average dispersal distance of offspring) can be distilled down to a single key quantity: the characteristic length.
When the geographical extent of the species range exceeds this characteristic length, the conditions are right for parallel adaptation. Graham's talk made accessible the complex mathematics behind this result. He has kindly made the slides available (click here) and the paper is now available at the Genetics website (click here).
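
For a feel for where such a length scale comes from, here is a rough back-of-envelope sketch. It is a heuristic under stated assumptions, not the formula from the paper, and the symbols are illustrative labels rather than the paper's notation: an advantageous allele spreads as a travelling wave at the Fisher speed v, successful new mutations establish at a rate λ per unit area per generation (taken here as roughly 2sμρ, using the classical 2s establishment probability), the range is d-dimensional, and constant factors are ignored.

```latex
\begin{align*}
  % time for a wave of advance to cross a region of width \chi
  t_{\text{spread}} &\approx \frac{\chi}{v}, & v &= \sigma\sqrt{2s} \\
  % waiting time for a successful new mutation within a region of size \chi^d
  t_{\text{mut}} &\approx \frac{1}{\lambda\,\chi^{d}}, & \lambda &\approx 2 s \mu \rho \\
  % equating the two time scales defines the characteristic length
  \frac{\chi}{v} \approx \frac{1}{\lambda\,\chi^{d}}
  \quad&\Longrightarrow\quad
  \chi \approx \left(\frac{v}{\lambda}\right)^{1/(d+1)}
       = \left(\frac{\sigma\sqrt{2s}}{2 s \mu \rho}\right)^{1/(d+1)}
\end{align*}
```

On this reading, a range much wider than χ gives new mutations time to arise elsewhere before the first wave arrives, whereas a range much narrower than χ is swept by a single mutation.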

Thursday, 8 July 2010

Discovering the distribution of fitness effects

At this year's Society for Molecular Biology and Evolution meeting in Lyon I presented ongoing work estimating the distribution of fitness effects, which is a collaborative venture with Molly Przeworski and Peter Andolfatto. Earlier versions of this research appeared in talks I presented at Chicago in December (Ecology and Evolution Departmental seminar) and Liverpool in January (UK Population Genetics Group meeting), and it follows on from last year's SMBE presentation in which I discussed methods to tease out sub-genic variation in selection pressure.

There is intrinsic interest in the fitness effects of novel mutations in coding regions of the genome, especially the relative frequency of occurrence of neutral, beneficial and deleterious variants. Yet estimating the distribution of fitness effects (the DFE) is also of practical use when localizing the signal of adaptive evolution. The reason is that in Bayesian analyses, the assumed DFE can influence the strength of evidence for or against adaptation at a particular site. Consequently it is preferable to estimate the DFE at the same time as detecting adaptation at individual sites, to avoid prior assumptions unduly influencing the results.

Once estimated, the DFE is of use in quantifying the relative contribution of adaptation versus drift to genome evolution. The figure, taken from my talk in Lyon (slides here), illustrates the idea when a normal distribution is used to model the DFE; the relative areas of the green and yellow shaded regions represent the respective contributions of adaptation and drift to amino acid substitutions accrued along the Drosophila melanogaster lineage.
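
To make the "relative area" idea concrete, here is a minimal sketch of one way such a calculation could be set up. It is not the analysis from the talk. It assumes that the DFE is a normal distribution over a scaled selection coefficient gamma, that each mutation contributes to substitutions in proportion to the standard relative fixation rate gamma / (1 - exp(-gamma)), and that substitutions with gamma above a threshold (set here, arbitrarily, to 1) count as adaptive while the rest count as drift. The function names, the threshold and the example parameters are all hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def relative_fixation_rate(gamma):
    """Fixation rate relative to a neutral mutation: gamma / (1 - exp(-gamma))."""
    if abs(gamma) < 1e-9:
        return 1.0   # neutral limit
    if gamma < -500.0:
        return 0.0   # strongly deleterious; also avoids overflow in exp
    return gamma / -math.expm1(-gamma)

def adaptive_share(mean, sd, threshold=1.0, n_points=20001):
    """Share of substitutions with gamma > threshold under a normal DFE.

    The DFE density is weighted by the relative fixation rate and summed on
    an evenly spaced grid spanning mean +/- 10 sd (the grid step cancels in
    the ratio). The threshold is an illustrative assumption.
    """
    lo, hi = mean - 10.0 * sd, mean + 10.0 * sd
    step = (hi - lo) / (n_points - 1)
    adaptive = total = 0.0
    for i in range(n_points):
        gamma = lo + i * step
        weight = normal_pdf(gamma, mean, sd) * relative_fixation_rate(gamma)
        total += weight
        if gamma > threshold:
            adaptive += weight
    return adaptive / total

# Hypothetical DFE centred on neutrality with standard deviation 2.
share = adaptive_share(mean=0.0, sd=2.0)
print(f"adaptive share of substitutions: {share:.3f}")
```

Because beneficial mutations fix far more readily than deleterious ones, even a DFE centred on zero can yield a sizeable adaptive share of substitutions, which is why the DFE needs to be estimated rather than assumed.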

Friday, 30 April 2010

Election to St. John's College

I'm pleased to have been elected to membership of the SCR at St. John's College, my alma mater. The college is an important aspect of life at Oxford as it gives an alternative centre of gravity outside the department for participation in social and academic activities. For a member of University staff with solely research responsibilities, it is a welcome opportunity to interact with the community of teaching fellows, students and junior researchers who belong to the college.

The picture is of the Spring crocus lawn taken in the college gardens.

Saturday, 27 February 2010

Postdoc and PhD position available

These positions are now closed.

Advertised today in Nature and on Thursday in New Scientist are two positions in my lab. I am looking for a postdoc and a PhD student to work on the genome evolution and epidemiology of four human pathogens as part of the Modernising Medical Microbiology project. Three of the pathogens share the theme of hospital-acquired infections: they are Staphylococcus aureus (of MRSA infamy), Clostridium difficile and norovirus (aka winter vomiting disease). The fourth is Mycobacterium tuberculosis (TB) which is a re-emerging problem in developed countries.

The aim of the project is to use whole genome sequencing of many isolates (100s to 1000s) in order to reconstruct evolutionary relationships and deconstruct transmission routes. We hope to develop the technology to the stage that we can trace the spread of pathogens in real time, and uncover the epidemiological triggers for the spread of disease.

As of January I have relocated to the Nuffield Department of Clinical Medicine at the University of Oxford, and the project is a collaborative affair between people at Oxford (including Rory Bowden, Derrick Crook, Peter Donnelly and Rosalind Harding), the Wellcome Trust Sanger Institute, the NHS and the Health Protection Agency. The project is funded by the UKCRC and further details of the positions are available online for the postdoc and PhD studentship. The closing date for applications is Friday, 2 April 2010.

Sunday, 7 February 2010

Holding early human stone tools

Today I had an extraordinary experience, precipitated by my visit to the British Museum on something of a whim. Listening to the Radio 4 series A History of the World in 100 Objects, my imagination had been captured by the descriptions of early stone tools - a chopper and a hand axe - featured in the first couple of programmes in the series. These tools, which were found in the Olduvai Gorge, in modern-day Tanzania, are examples of the oldest known objects made by humans. What is fascinating is that their simple design belies a capacity for mental forethought. They are tangible evidence that the humans living 2 million years ago had the intelligence to conceive of and the dexterity to manufacture tools.

I had been visiting friends in London, and before leaving I decided to pass by the museum to see these relics for myself. I found the stone tools in a dim room in the near corner of the museum, shielded by glass cases. After reading the descriptions and wandering round I noticed a lady showing some children a bunch of similar-looking objects she had in a wooden box. I asked if they were casts and could hardly believe it when she told me it was the real thing. Two stone hand axes, 1 million years old, made from basalt and quartz, and a basalt chopper, 2 million years old - the oldest items in the museum. Holding in the palm of my hand a tool fashioned 2 million years ago by a cognizant proto-human, I could imagine the heavy object fitting just as neatly into the hand of its designer, and in trying to understand the way it might have been used to butcher carcasses, pound meat and scrape flesh off bones I felt I got a brief glimpse into the intentions of its designer. The study of evolution rarely affords such vivid connections with its subject matter, and I felt privileged to stumble across such an encounter today.

Monday, 16 November 2009

Campylobacter source attribution in New Zealand

The source of the common food poisoning pathogen Campylobacter jejuni was the subject of a paper published in September last year in PLoS Genetics by my colleagues and me, in which we traced the origin of bacterial isolates collected from patients in Lancashire, England. In that study, and a subsequent investigation into campylobacteriosis across Scotland, we found that the majority of cases could be attributed to populations of C. jejuni typically found in poultry.

Now Petra Mullner, Nigel French and colleagues have genetically characterized the C. jejuni populations found in human patients, cattle, sheep, poultry and environmental samples from New Zealand covering the period March 2005 - February 2008. What is special about their study is that the New Zealand poultry industry is a closed system, with no foreign imports, making it possible to directly sample the putative source populations and disease-causing isolates concurrently.

As in the studies in England and Scotland, poultry was the inferred source of the majority of disease in New Zealand. Uniquely, however, it was possible to attribute cases separately to the three major poultry suppliers on the islands. One supplier in particular was attributed a disproportionate number of cases using three assignment methods, including my method (iSource, soon to be available on this website). Supported in part by this evidence, the New Zealand Food Safety Authority introduced mandatory targets for limiting Campylobacter contamination of poultry products in 2007. Remarkably, the number of cases fell from 15,873 in 2006, before the control measures were introduced, to 6,689 in 2008, a reduction of about 58%. The next chapter of this intriguing story will be a follow-up study to establish whether the fall in the number of cases corresponded to a reduction in the proportion of campylobacteriosis attributable to poultry sources.

Selection in a putative meningitis vaccine target

In Variation of the factor H-binding protein in Neisseria meningitidis, Carina Brehony in Martin Maiden's lab at Oxford investigated a group of outer membrane proteins in the bacterium responsible for meningococcal meningitis. To date, attempts to raise a vaccine against the common serogroup B meningococci have been frustrated by the low immunogenicity of the serogroup B capsular polysaccharide, despite success with serogroups A and C. Outer membrane proteins, such as factor H-binding protein (fHbp), may provide alternative targets for vaccine development.

However, fHbp is genetically diverse, and our investigation showed evidence of structuring into three groups. OmegaMap analyses of the three groups revealed a signature consistent with strong selection pressure for antigenic variability at the gene. Notably, there was clear evidence of diversifying selection at several previously discovered epitopes - positions in the protein targeted by antibodies during the bacteria-killing immune response. (Analysis of one group is shown in the figure, with known epitopes marked.)

While these observations are encouraging in terms of understanding the biology of pathogen antigens, a pressing question is how we translate that understanding into practical vaccine design. Studies such as ours suggest a multi-component vaccine may be necessary to achieve broad coverage against serogroup B meningococci.

Recombination and proper segregation in human meiosis

My blog entries have lapsed since the summer while I have attempted to press on with various projects to tie up as much as possible by the end of the year. Meanwhile, my collaborators and I have had three papers published.

In Broad-scale recombination patterns underlying proper disjunction in humans, Adi Alon and colleagues have used a large Hutterite pedigree to test two molecular hypotheses in a statistical genetics fashion. Crossing-over is important for proper segregation of chromosomes during meiosis. When chromosomes fail to segregate properly, the result is aneuploidy, a genetic pathology underlying many inherited diseases; for example, aneuploidy at chromosome 21 is often the basis of Down's syndrome.

It has been suggested that a hard limit of at least one crossover per chromosome is necessary for correct disjunction; others have suggested the requirement is for one crossover per chromosome arm. By reconstructing the probable distribution of the number of crossovers during meiosis, we were able to show that proper disjunction frequently occurs in humans in the absence of a crossover on every chromosome arm. Further, the evidence suggested that successful segregation of some chromosomes can occur without a crossover at all - interestingly, chromosome 21 was flagged up among others. This leads to the question: is there a back-up cellular mechanism to rescue meiotic division when crossovers fail to form?

Thursday, 18 June 2009

SMBE Iowa City

I spent the beginning of the month at the SMBE (Society for Molecular Biology and Evolution) conference in Iowa City. It was a good chance to catch up with people and find out what research is going on in the field, as well as to speak with collaborators about on-going projects. One of those is Peter Andolfatto, who works on genome evolution in Drosophila species. Molly and I are collaborating with Peter on a project to detect natural selection within and between Drosophila species. The main idea is to improve inference by taking into account variation in selection pressure throughout the gene. Our method draws on the advantages of a number of current approaches, such as Rasmus Nielsen and Ziheng Yang's codeml package (part of PAML), Carlos Bustamante's MKPRF (McDonald-Kreitman Poisson Random Field) model and Gil McVean's and my program omegaMap, in that it exploits patterns of polymorphism within and between species, while allowing for conservation and adaptation within the same gene. You can view the slides of my SMBE talk, titled "Adaptive events in hominid (and Drosophila) evolution", here.

Monday, 25 May 2009

Science Bomb!

Figure 1 of Venkataraman et al. (2009)

On Friday Chris Spencer gave the PPS (Pritchard/Przeworski/Stephens) lab meeting as part of a trip to Chicago. Chris talked about his work in Oxford on association studies in a number of common genetic diseases being studied by the Wellcome Trust Case Control Consortium.

Beforehand I dropped the Science Bomb, an innovation this year (for which I think Barbara Engelhardt is responsible) where someone talks about a particularly interesting or timely article. Dan Gaffney pointed me in the direction of a PLoS Biology paper titled Reawakening Retrocyclins: Ancestral Human Defensins Active Against HIV-1.

The subject of the study is a human pseudogene known as retrocyclin, which has been shown to confer resistance to HIV-1 infection in human cell lines. The pseudogene is expressed naturally in several human tissues, but not translated into protein owing to a premature stop codon. The paper's authors reawakened retrocyclin using aminoglycosides, a class of antibiotics that cause (as a side effect) a degree of mis-translation and hence allow "read-through" of the stop codon. You can see the slides from my Science Bomb here.

Monday, 11 May 2009

Neolithic origin of Campylobacter jejuni

As part of a recent trip to the University of Edinburgh to visit Andrew Rambaut, I gave a talk on some work of mine published in the February edition of Molecular Biology and Evolution and subsequently recommended on the Faculty of 1000 website about the evolution of the gut pathogen Campylobacter jejuni.

Part of the paper is concerned with the issue of the timescale of Campylobacter evolution, and using longitudinal samples of C. jejuni DNA sequences we attempted to calibrate the molecular clock in a similar way to that which is standard practice for viruses.

We detected surprisingly rapid evolution - 1,000 times faster than traditional estimates - which would place the split of C. jejuni from its closest relative C. coli during the Neolithic revolution. Interestingly, the point estimate of 6,500 years ago for the split from C. coli - which preferentially infects swine - coincides with the spread of pig domestication in the Near East and Europe in the 4th millennium BC.

The date is controversial because the traditional dating method, which is based on bounding deep phylogenetic splits such as the common ancestor of mitochondria and bacteria, would place the divergence of C. jejuni and C. coli closer to 10 million years ago.

After the seminar I had an interesting discussion with Paul Sharp, who was in the audience. Prof Sharp is actively researching the causes of conflict between long-term and short-term estimates of the rate of evolution in viruses. As he points out, short-term rate estimates (usually based on longitudinally-sampled viral sequences) frequently suggest that evolution is occurring much more rapidly than long-term estimates (based on deeper calibration points, such as co-phylogeny of host and pathogen). This phenomenon, observed in HIV and hepatitis C among others, may be caused by overly simplistic models of sequence evolution.

So how plausible is it that a ubiquitous bacterial pathogen such as C. jejuni evolved as recently as the Neolithic, possibly in response to changes brought about by agriculture or animal husbandry? Longitudinal studies of Helicobacter pylori and Neisseria gonorrhoeae have obtained similarly rapid rates of bacterial evolution, and evidence is mounting that the Neolithic revolution played an important role in creating new niches for human, plant and animal pathogens. Perhaps the best prospect for resolving these questions will be studies of ancient DNA preserved from the period in question.

Monday, 27 April 2009

omegaMap at BioHPC

All evolutionary biologists wishing to make use of omegaMap now have access to a high performance parallel computing cluster via the internet, courtesy of Cornell's CBSU and Microsoft. The software, which allows the detection of selection and recombination in DNA or RNA sequences, can be run via the web interface or downloaded as part of the BioHPC suite.

The web interface consists of a simple form where users can upload their configuration file and sequences in FASTA format. Users are notified by e-mail when their jobs complete. To learn more about the project visit the CBSU home page.

Meanwhile, I am working on several major updates to omegaMap, the most interesting of which will probably be the development of a new model that allows for the joint analysis of natural selection acting on sequences from different populations or species. The aim is to integrate population genetic and phylogenetic models of selection in order to exploit the signal of selection contained both in polymorphism within populations (or species) and divergence between them. I will be presenting progress on this work, in the context of hominid evolution, at the 2009 SMBE meeting in Iowa City this June.

Saturday, 3 January 2009

Human Evolution in New York City

Rounding off a hectic end to 2008 was a trip to visit Molly, currently on sabbatical in New York City. Joanna and I flew out to spend the final weekend before Christmas discussing projects and frequenting the local coffee shops, restaurants and bars. I took the opportunity to visit the American Museum of Natural History adjacent to Central Park after reading about its dinosaur collections in The Catcher in the Rye; pictured is an Allosaurus skeleton, which stands in the main entrance hall. Of particular interest was the Spitzer Hall of Human Origins, which features a wealth of fossil remains and artefacts including a cast of the Laetoli footprints and a diorama of an Australopithecus afarensis nuclear family. Fittingly, the very focus of the New York trip was to discuss the on-going project to characterize natural selection between hominid species.

Thursday, 30 October 2008

Inferring niche membership from genetic diversity

Each Wednesday the Ecology and Evolution department run a journal club called Noon Illumination, and this week I volunteered to lead discussion on a recent article titled Resource Partitioning and Sympatric Differentiation Among Closely Related Bacterioplankton (Science 320: 1081-5), by Dana Hunt and colleagues based at MIT and Ghent. I originally prepared the presentation for a Bacterial Metagenomics workshop in Berlin this July, organized by Daniel Falush.

Of central interest in the paper is a novel methodology that infers habitat/niche based on ecological variables and DNA sequencing in the family of marine bacteria Vibrionaceae. That places it in the wider context of methods that attempt to predict phenotype (in this case niche) from genotype. Their approach is an elegant extension of familiar phylogenetic methods to model habitat switching over evolutionary time. Based on arguments put forward by Christophe Fraser and colleagues, the paper reasons that the ancestral habitat switches they detect are likely to be adaptive because the rate of recombination eclipses the mutation rate sufficiently to preclude the possibility of neutral genetic clustering.

However, the high rate of recombination raises some difficulties of interpretation. The principal phylogenetic reconstruction was based on the hsp60 gene, but by sequencing other housekeeping genes, Hunt and colleagues found that in some cases, recombination between genes caused an artefactual habitat switch in the hsp60 ancestry that was not evident in the other genes. Using a permutation test, I found evidence for recombination within the vibrio hsp60 genes, which may confound the phylogenetic reconstruction of evolutionary relationships (Schierup and Hein 2000). On a more philosophical note, suppose you could directly observe ancestral habitat switches. Would that be strong evidence for adaptation? An association between habitat and genetic lineage is probably not sufficient to demonstrate the action of natural selection. On the other hand, frequent recombination could empower genome-wide scans for extreme association between genes and habitats, which would provide stronger support for adaptation.
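
For readers unfamiliar with the idea, here is a minimal sketch of one common form of permutation test for recombination. It is not necessarily the test used for the hsp60 data. It assumes the statistic is the correlation between linkage disequilibrium (r squared) and the physical distance between pairs of polymorphic biallelic sites: recombination makes that correlation negative, and shuffling the site positions gives the null distribution. The function names, the toy data and the positions are all hypothetical.

```python
import random
from itertools import combinations

def r_squared(a, b):
    """r^2 between two polymorphic biallelic sites coded 0/1 across sequences."""
    n = len(a)
    pa, pb = sum(a) / n, sum(b) / n
    pab = sum(x * y for x, y in zip(a, b)) / n
    d = pab - pa * pb
    return d * d / (pa * (1 - pa) * pb * (1 - pb))

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def ld_decay_test(sites, positions, n_perm=999, seed=1):
    """One-sided permutation test for a decline in r^2 with distance.

    Returns the observed correlation and the proportion of permutations
    (counting the observed data) with a correlation at least as negative.
    """
    pairs = list(combinations(range(len(sites)), 2))
    ld = [r_squared(sites[i], sites[j]) for i, j in pairs]

    def corr(pos):
        return pearson([abs(pos[i] - pos[j]) for i, j in pairs], ld)

    observed = corr(positions)
    rng = random.Random(seed)
    shuffled = list(positions)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign positions to sites at random
        if corr(shuffled) <= observed + 1e-12:  # tolerance for float ties
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy data: each inner list is one site across eight sequences.
sites = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
positions = [10, 20, 500, 510]
observed, p_value = ld_decay_test(sites, positions)
print(observed, p_value)
```

With only four sites the test has almost no power, so the toy example only illustrates the mechanics; real alignments contribute many more pairs of sites.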

You can view a PDF of the presentation of this stimulating article in our journal club here.

Friday, 26 September 2008

Tracing the source of campylobacteriosis

Finally, it's out! The main piece of work to come out of my two-year period as Research Associate at Lancaster University is published today in PLoS Genetics.

The article reports a study in Lancashire, England, of the bacterium Campylobacter jejuni, the primary cause of bacterial gastro-enteritis in developed countries. We inferred the source of infection in 1,200 patients by comparing the DNA sequences of C. jejuni taken from those patients to 1,100 taken from different animal species and the environment. The result: livestock are the source of infection in 97% of cases.
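
To give a flavour of what "inferring the source" can mean computationally, here is a deliberately simple sketch. It is a toy assignment scheme for illustration, not the model used in the paper. It assumes each isolate is summarised as a tuple of allele numbers at a few loci, estimates smoothed allele frequencies in each candidate source, and assigns each human case a probability of coming from each source, assuming independent loci and equal prior weight on every source. The profiles, source names and pseudocount are all hypothetical.

```python
from collections import Counter
from math import prod

def locus_counts(profiles):
    """Allele counts at each locus for the isolates from one source."""
    n_loci = len(profiles[0])
    return [Counter(p[i] for p in profiles) for i in range(n_loci)]

def profile_likelihood(profile, counts, n_source, n_alleles, pseudocount=1.0):
    """Probability of a profile given one source, with additive smoothing."""
    return prod(
        (counts[i][allele] + pseudocount) / (n_source + pseudocount * n_alleles[i])
        for i, allele in enumerate(profile)
    )

def attribute(case_profiles, sources, pseudocount=1.0):
    """Average, over cases, of the posterior probability of each source."""
    everything = list(case_profiles) + [p for ps in sources.values() for p in ps]
    n_loci = len(everything[0])
    # Number of distinct alleles seen at each locus, used for smoothing.
    n_alleles = [len({p[i] for p in everything}) for i in range(n_loci)]
    counts = {name: locus_counts(ps) for name, ps in sources.items()}
    totals = dict.fromkeys(sources, 0.0)
    for profile in case_profiles:
        lik = {
            name: profile_likelihood(
                profile, counts[name], len(sources[name]), n_alleles, pseudocount
            )
            for name in sources
        }
        norm = sum(lik.values())
        for name in sources:
            totals[name] += lik[name] / norm
    return {name: total / len(case_profiles) for name, total in totals.items()}

# Hypothetical three-locus profiles.
sources = {
    "poultry": [(1, 1, 2), (1, 1, 2), (1, 2, 2)],
    "cattle": [(3, 4, 5), (3, 4, 5), (3, 4, 6)],
}
cases = [(1, 1, 2), (1, 2, 2), (3, 4, 5)]
shares = attribute(cases, sources)
print(shares)
```

A scheme this naive ignores mutation, recombination and migration between source populations, which is part of what a fuller model has to account for.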

In addition to preparing the figures, approving final drafts, and producing a press release in conjunction with the PLoS Genetics and university press offices, I have spent much of my time over the last three weeks revising a companion paper on the evolution of C. jejuni. On Friday that was resubmitted to Molecular Biology and Evolution, and should it be accepted, will draw a line under my Lancaster projects.

Monday, 4 August 2008

Visit to Kilifi

The final days in Paris were taken up by revisions to a paper submitted in April to PLoS Genetics. With luck the revisions will be accepted and that paper will be coming out soon. From Paris I flew to Mombasa, Kenya to begin a 3-week collaboration with Caroline Buckee at the KEMRI-Wellcome Trust research unit in Kilifi. Caroline, Pete Bull and others at Kilifi work on the evolution of var genes in Plasmodium falciparum, the most common and lethal agent of malaria. The var genes encode a family of proteins expressed by the pathogen on the membrane of infected red blood cells. Implicated in pathogenesis, these genes are highly diverse, which helps the parasite evade the host immune system. The first step in piecing together their evolutionary history is to align the sequences - a task made difficult by the abundance of insertions and deletions. From there we hope to characterize the relative importance of gene duplication and homologous and non-homologous recombination in driving the evolution of these genes.

Tuesday, 15 July 2008

Exchange with the Institut Pasteur

My position at the University of Chicago is funded by a grant awarded to Molly by the National Institutes of Health (NIH) to detect the signature that natural selection has left on the human genome. One of our collaborations is with the human genetics lab of Lluis Quintana-Murci at the Institut Pasteur in Paris, and I'm making the first of a number of exchange visits to strengthen the ties between the two labs.

Lluis' group are interested in the selection pressure that pathogens have exerted on the human genome, and in particular a family of genes involved in the immune system known as toll-like receptors. The idea is that together we develop methods to detect and quantify selection in these genes by comparison to neutral regions in individuals from around the world. Among the people involved in the project is Luis Barreiro, a post-doc who has just arrived in the Human Genetics department at Chicago from the Institut Pasteur.

In Paris I've been participating in group meetings, offering my opinion on manuscripts coming out of Lluis' lab, and discussing with lab members how we might analyze the DNA sequence data they're producing. I've also been acquainting myself with the produce of Château des Ravatys (the Institut Pasteur vineyard) and celebrating the 14 Juillet (photo).

What do researchers do?

Most of my friends (and maybe even my colleagues) don't have much idea of what I do day-to-day. That's the subject of this blog. Modern research is quantified in terms of published articles, but that forms just a small part of daily working life, and the process from conceiving an interesting question or idea to published article is a long and tortuous one. Often articles aren't published until months or years after the research was conducted, and so there is a time-lag between what appears in print and what the current focus of research is. By that time the initial excitement of pursuing novel work is a distant memory, having been supplanted by several rounds of peer review, revision and resubmission.

So this is an attempt to communicate what I am doing now, to talk about what I think's exciting this week (or this afternoon) and to show what being a researcher means on a daily basis.