Monday, 7 January 2019

New paper in PNAS: harmonic mean p-value

Published on Friday in Proceedings of the National Academy of Sciences USA, "The harmonic mean p-value for combining dependent tests" reports a new method for performing combined tests. A revised R package with detailed examples is now available online as the harmonicmeanp package on CRAN.

The method has two stages:
  • Compute a test statistic: the harmonic mean of the p-values (HMP) of the tests to be combined. Remarkably, this HMP is itself a valid p-value for small values (e.g. below 0.05).
  • Calculate an asymptotically exact p-value from the test statistic using the generalized central limit theorem. The distribution is a type of Stable distribution first described by Lev Landau.
The method, which controls the strong-sense family-wise error rate (ssFWER), has several advantages over existing alternatives for combining p-values:
  • Combining p-values allows information to be aggregated over multiple tests and requires less stringent significance thresholds.
  • The HMP procedure is robust to positive dependence between the p-values, making it more widely applicable than Fisher's method which assumes independence.
  • The HMP procedure is more powerful than the Bonferroni and Simes procedures.
  • The HMP procedure is more powerful than the Benjamini-Hochberg (BH) procedure, even though BH only controls the weaker false discovery rate (FDR) and weak-sense family-wise error rate (wsFWER). It is more powerful in the sense that whenever the BH procedure detects one or more significant p-values, the HMP procedure will detect one or more significant p-values or groups of significant p-values.
The ssFWER can be considered gold-standard control of false positives because it aims to control the probability of one or more false positives even in the presence of true positives. The HMP is inspired by Bayesian model averaging and approximates a model-averaged Bayes factor under certain conditions.
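
To make the first stage concrete, here is a minimal Python sketch of the test statistic only. The function name, the optional weights argument and the example p-values are illustrative assumptions, not taken from the paper or from the harmonicmeanp package. The second stage (the asymptotically exact p-value based on the Landau distribution) is deliberately left out.

```python
from typing import Optional, Sequence


def harmonic_mean_p(pvalues: Sequence[float],
                    weights: Optional[Sequence[float]] = None) -> float:
    """Weighted harmonic mean of p-values (stage 1 of the method only).

    Hypothetical helper for illustration; equal weights are assumed
    when none are supplied.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("weights and pvalues must have the same length")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    if any(w < 0.0 for w in weights) or sum(weights) <= 0.0:
        raise ValueError("weights must be non-negative with a positive sum")
    # HMP = sum(w) / sum(w / p): small p-values dominate the average.
    return sum(weights) / sum(w / p for w, p in zip(weights, pvalues))


# Made-up p-values, purely for illustration.
example = [0.01, 0.04, 0.20, 0.50]
print(harmonic_mean_p(example))  # equals 4 / 132, roughly 0.03
```

The harmonic mean always lies between the smallest p-value and the arithmetic mean, which is why a single strong signal is not diluted as much as it would be by ordinary averaging.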

In researching and revising the paper, I looked high and low for previous uses of the harmonic-mean p-value because most ideas have usually been had already. Although there is a class of methods that use different types of average p-value (without compelling motivation), I did not find a precedent. Until today, a few days too late, so I may as well get in there and declare it before anyone else. In 1958, I. J. Good published a paper on what he called the "harmonic mean rule-of-thumb", effectively for model-averaging; it mysteriously appeared when I googled the new publication. Undeniably, I did not do my homework thoroughly enough. Still, I would be interested if others know more about the history of this rule-of-thumb.

Good's paper, available on Jstor, proposes that the HMP "should be regarded as an approximate tail-area probability" [i.e. p-value], although he did not propose the asymptotically exact test (Eq. 4) or the multilevel test procedure (Eq. 6) that are important to my approach. His presentation is amusingly apologetic, e.g. "an approximate rule of thumb is tentatively proposed in the hope of provoking discussion", "this rule of thumb should not be used if the statistician can think of anything better to do" and "The 'harmonic-mean rule of thumb' is presented with some misgivings, because, like many other statistical techniques, it is liable to be used thoughtlessly". Perhaps this is why the method (as far as I could tell) had disappeared from the literature. Hopefully the aspects new to my paper will shake off these misgivings and provide users with confidence that the procedure is interpretable and well-motivated on theoretical as well as empirical grounds. Please give it a read!

Work cited
  • R. A. Fisher (1934) Statistical Methods for Research Workers (Oliver and Boyd, Edinburgh), 5th Ed.
  • L. D. Landau (1944) On the energy loss of fast particles by ionization. Journal of Physics U.S.S.R. 8: 201-205.
  • I. J. Good (1958) Significance tests in parallel and in series. Journal of the American Statistical Association 53: 799-813. (Jstor)
  • R. J. Simes (1986) An improved Bonferroni procedure for multiple tests of significance. Biometrika 73: 751-754.
  • Y. Benjamini and Y. Hochberg (1995) Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B 57: 289-300.
  • D. J. Wilson (2019) The harmonic mean p-value for combining dependent tests. Proceedings of the National Academy of Sciences U.S.A. published ahead of print January 4, 2019. (PNAS)

Wednesday, 18 July 2018

Bacterial Doubling Times in the Wild

How fast do bacteria grow outside the laboratory? This simple question is very difficult to address directly, because it is near-impossible to track a lineage of bacterial cells, ancestor-to-descendant, inside an infected patient or through a river. Now in new work published in Proceedings B, Beth Gibson, Ed Feil, Adam Eyre-Walker and I exploit genome sequencing to try to get a handle on the problem indirectly.

We have done it by comparing two known quantities and taking the ratio: the rate at which DNA mutates in bacteria per year, and the rate it mutates per replication. This tells us in theory how many replications there are per year.

The mutation rate per replication has long been studied in the laboratory, and is around once per billion letters. Meanwhile, the recent avalanche of genomic data has allowed microbiologists to quantify the rate at which bacteria evolve over short time scales such as a year, including during outbreaks and even within individual infected patients. Most bugs mutate about once per million letters per year, with ten-fold variation above and below this not uncommon among different species.
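
As a rough worked example of the ratio, the Python sketch below plugs in the round numbers quoted above (about one mutation per million letters per year, and about one per billion letters per replication). The function name and the ten-fold faster and slower inputs are illustrative assumptions; the species-specific estimates in the paper come from measured rates, not from this back-of-envelope calculation.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours


def doubling_time_hours(rate_per_site_per_year: float,
                        rate_per_site_per_replication: float) -> float:
    """Average hours per replication implied by the two mutation rates."""
    replications_per_year = rate_per_site_per_year / rate_per_site_per_replication
    return HOURS_PER_YEAR / replications_per_year


# Round numbers from the text: 1e-6 per letter per year, 1e-9 per letter per replication.
typical = doubling_time_hours(1e-6, 1e-9)  # about 1000 replications per year
fast = doubling_time_hours(1e-5, 1e-9)     # ten-fold faster evolving species (assumed)
slow = doubling_time_hours(1e-7, 1e-9)     # ten-fold slower evolving species (assumed)

print(f"typical: {typical:.1f} h, fast: {fast:.2f} h, slow: {slow:.0f} h")
```

With these inputs the typical case comes out at just under nine hours per doubling, and the ten-fold range spans from under an hour to several days.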

For five species both these quantities exist. The fastest bug we looked at causes cholera and we estimate it doubles once every hour on average (give or take 30 minutes). The slowest was Salmonella, which we estimate doubles once a day on average (give or take 8 hours). In between were Staph. aureus and Pseudomonas at about two hours each, and E. coli at 15 hours. These are averages over the very diverse and often hostile conditions that a bacterial cell may find itself in during the course of its natural lifecycle. To find out more about the work, please check out the paper.

Friday, 29 June 2018

PhD Studentship: Genomic prediction of antimicrobial resistance spread

This position is now closed
An opportunity has arisen for a D.Phil. (Ph.D.) place on the BBSRC-funded Oxford Interdisciplinary Bioscience Doctoral Training Partnership in the area of Artificial Intelligence, specifically Predicting the spread of antimicrobial resistance from genomics using machine learning.

If successful in a competitive application process, the candidate will join a cohort of students enrolled in the DTP’s one-year interdisciplinary training programme, before commencing the research project and joining my research group at the Big Data Institute.

This project addresses the BBSRC priority area “Combatting antimicrobial resistance” by using ML to predict the spread of antimicrobial resistance in human, animal and environmental bacteria exemplified by Escherichia coli. Understanding how quickly antimicrobial resistance (AMR) will spread helps plan effective prevention, improved biosecurity, and strategic investment into new measures. We will develop ML tools for large genomic datasets to predict the future spread of AMR in humans, animals and the environment. The project will create new methods based on award-winning probabilistic ML tools pioneered in my group (BASTA, SCOTTI) by training models using genomic and epidemiological data informative about past spread of AMR. We will apply the tools collaboratively to genomic studies of E. coli in Kenya, the UK and across Europe from humans, animals and the environment, Enterobacteriaceae in North-West England, and Campylobacter in Wales. Genomics has proven effective for asking “what went wrong” in the context of outbreak investigation and AMR spread; here we will address the greater challenge of repurposing such information using ML for forward prediction of future spread of AMR. Scrutiny will be intense because future predictions can and will be tested, raising the bar for the biological realism required while producing computationally efficient tools.

Attributes of suitable applicants: Understanding of genomics. Interest in infectious disease. Some numeracy, e.g. mathematics A-level, desirable. Experience of coding would help.

Funding notes: BBSRC eligibility criteria for studentship funding apply. Successful students will receive a stipend of no less than the standard RCUK stipend rate, currently set at £14,777 per year.

How to apply: send me a CV and brief covering letter/email (no more than 1 page) explaining why you are interested and suitable by the Wednesday 11 July initial deadline. I will invite the best applicant/s to submit with me a formal application in time for the Friday 13 July second-stage deadline.

Wednesday, 27 June 2018

Royal Society Summer Science Exhibition Stall July 2-8

Next week researchers from the Modernising Medical Microbiology consortium, collaborating groups and I will exhibit the Resistance is Futile stall at the Royal Society Summer Science Exhibition. The exhibition is a free event in central London open to all visitors. Our stall is an opportunity to tell visitors about our research, and how advances in genetics are influencing day-to-day life. On show at the Resistance is Futile stall:

  • Oxford Nanopore Technology Demos: DNA sequencing in the NHS is shortening the time to diagnose antibiotic resistance in serious infections.
  • Evolution Dance Mat: Resistance mutants arise spontaneously through chance copying errors during DNA replication.
  • Antibiotic Resistance Coconut Shy: Antibiotic use gives resistance mutants a strong advantage so they rapidly increase in frequency.

              The exhibition runs from Monday 2 July - Sunday 8 July at Carlton House Terrace, London, SW1Y 5AG. For more information about our stall click here and for general visitor information about the exhibition click here. Please spread the word!

              During the exhibition we will be tweeting from @ResistanceIF

              Our stall is generously supported by Oxford Nanopore Technology, the Nuffield Department of Medicine, and through public engagement research funding awarded to our research groups by the Wellcome Trust, the Royal Society, the National Institute for Health Research, the Oxford Biomedical Research Centre, the Natural Environment Research Council, the Medical Research Council, the Newton Fund and the Bill & Melinda Gates Foundation.

              Friday, 18 May 2018

              Postdoc positions in Data Science and Molecular Microbiology

              These positions are now closed
              As part of the move to the Big Data Institute, two new postdoctoral positions funded by the Robertson Foundation are available in Data Science and Molecular Microbiology.

              The BDI is a new interdisciplinary research centre aiming to develop, evaluate and deploy efficient methods for acquiring and analysing biomedical data at scale and for exploiting the opportunities arising from such studies. The BDI is a joint venture between the renowned Nuffield Department of Population Health (NDPH) and NDM.

              The Data Scientist role, split between the BDI and London, will be part of a team developing systems for continuous record linkage between Public Health England and other population health records. The aims are to design record linkage algorithms, manage front ends for viewing the data source, and analyse and interpret results. We're looking for a graduate or equivalent experience in computer science, data science, statistics, or any other relevant subject with a strong quantitative component. Knowledge of databases such as SQL and of computer programming is needed.

              The Molecular Microbiology role, based mainly at the John Radcliffe Hospital Microbiology Department, will be part of a team researching Staphylococcus aureus infection using RNA sequencing, genome wide association studies, and biochemical and immunological assays of bacterial behaviour. The aims include designing microbiological protocols, researching bacterial molecular genetics and data analysis. We're looking for a PhD or equivalent experience in a relevant subject such as microbiology, immunology, genetics or biochemistry. Experience designing protocols and basic microbiological and immunological skills are required.

              The deadline for the posts is Noon on 6 June 2018. Both are one year positions. For more details or to apply click here for the Data Scientist role and here for the Molecular Microbiologist role.

              The group has moved to the Big Data Institute, University of Oxford

              From April we have moved to the Big Data Institute, Nuffield Department of Population Health at the University of Oxford. The group is maintaining its close links to the Modernising Medical Microbiology Consortium and the John Radcliffe Hospital, Oxford. I am grateful to the Robertson Foundation for funding. We're excited about joining new colleagues and benefiting from their expertise in epidemiology, health informatics, genetics and infection, while continuing to cultivate strong links with our existing collaborators in Oxford and around the world.

              Sunday, 31 December 2017

              New paper: Severe infections emerge from commensal bacteria by adaptive evolution

              Published this month in eLife, our new paper on the evolution and adaptation of Staphylococcus aureus during infection.

              This study shows that the emergence of life-threatening infections of the major pathogen Staphylococcus aureus from bacteria colonizing the nose is associated with repeatable adaptive evolution inside the human body.

              First author Bernadette Young has summarized the paper's findings on the Modernising Medical Microbiology blog.

              Monday, 18 December 2017

              SCOTTI wins PLoS Computational Biology Research Prize

              Work from our group has been recognised in the PLoS Computational Biology 2017 Research Prizes. SCOTTI, which infers transmission routes from genetic and epidemiological information, won the Breakthrough Advance/Innovation category. The citation reads:
              Our Breakthrough Advance/Innovation winning article presents a new computational tool, called SCOTTI (Structured COalescent Transmission Tree Inference), developed by Nicola De Maio of the University of Oxford (UK), and colleagues. De Maio says, “SCOTTI represents a convenient tool to reconstruct who-infected-whom within outbreaks… [and] has been used in particular for the study of bacterial hospital outbreaks”. It combines epidemiological information about patient exposure with genetic information about the infectious agent itself.
              Work is nominated and selected as described in the announcement:
              The journal invited the community to nominate their favorite 2016 published Research Articles. From these nominations the PLOS Computational Biology Research Prize Committee, made up of Editorial Board members Dina Schneidman, Nicola Segata, Maricel Kann, Isidore Rigoutsos, Avner Schlessinger, Lilia Iakoucheva, Ilya Ioshikhes, Shi-Jie Chen, and Becca Asquith, selected the winners. To help support future work, the authors of each winning paper will receive award certificates and a $2,000 (USD) prize.
              You can read more about SCOTTI and the accompanying paper, written by Nicola De Maio, Jessie Wu and me, here.

              Monday, 11 September 2017

              Promiscuous bacteria have staying power

              An insight article with Ruth Massey on John Lees' and Stephen Bentley's new paper was published in eLife on Friday:

              Streptococcus pneumoniae is a notorious bacterial pathogen hiding in plain sight. A common resident of the nose and throat, between 68% and 84% of young infants will carry this species at any given time (Turner et al., 2012). In most cases it causes no harm, yet the presence of pneumococci – as the bacteria are known – can predispose a person to life-threatening infections like pneumonia or meningitis. Indeed, pneumococci are responsible for around 10% of all deaths in young children around the world (O'Brien et al., 2009), with the vast majority of cases being in developing countries.
              Research into S. pneumoniae is complicated because the species is a patchwork of distinctive strains and some of these strains remain in the nose and throat for longer than others. Now, in eLife, John Lees and Stephen Bentley – both at the Wellcome Trust Sanger Institute – and colleagues report that strains rendered impotent by a virus do not linger for as long as other strains (Lees et al., 2017).

              Click here to read the full piece.

              Thursday, 3 August 2017

              New draft paper on combining p-values through the harmonic mean

              In a preprint released today on Biorxiv I report a new method for improving the sensitivity to detect statistical signals by averaging over multiple alternative hypotheses using the harmonic mean p-value. The draft paper looks at example problems in genome-wide association studies (GWAS) in which signals of association may be apparent, but perhaps not sufficiently strong to meet the stringent threshold required to control for the millions of tests performed. Combining weak signals in arbitrary ways - for example across consecutive variants - can reveal signals sufficiently strong to meet the statistical significance threshold. This could be especially useful when looking for interactions, for example between host and pathogen genetics in their effect on infection, because it may be possible to conclude that a particular variant on the host side is involved, even if there is uncertainty over the specific pathogen variant it interacts with. Often such uncertainty arises because of the sheer number of possibilities. Similar ideas are beginning to gain traction in GWAS, and the ability to easily average over hypotheses is one of the strengths of Bayesian statistics. This new paper shows that the benefits of model averaging can be achieved easily in non-Bayesian statistics by taking the harmonic mean p-value from a range of tests. The test is very general and robust to a range of complexities including non-independence between the p-values.
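
To illustrate what combining weak signals across consecutive variants can look like, here is a small Python sketch that takes the unweighted harmonic mean p-value over each sliding window of neighbouring tests. The function name, the window width and the p-values are all made up for illustration. The sketch only computes the window-level statistics; deciding which windows are significant needs the thresholding described in the paper, which is not reproduced here.

```python
from typing import List, Sequence


def window_hmps(pvalues: Sequence[float], width: int) -> List[float]:
    """Unweighted harmonic mean p-value for each run of `width` consecutive tests."""
    if width < 1 or width > len(pvalues):
        raise ValueError("width must be between 1 and the number of p-values")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    results = []
    for start in range(len(pvalues) - width + 1):
        window = pvalues[start:start + width]
        # Harmonic mean: number of tests divided by the sum of reciprocals.
        results.append(width / sum(1.0 / p for p in window))
    return results


# Hypothetical per-variant p-values along a stretch of genome.
variant_p = [0.40, 0.03, 0.02, 0.05, 0.60, 0.75]
for start, hmp in enumerate(window_hmps(variant_p, width=3)):
    print(f"variants {start}-{start + 2}: HMP = {hmp:.4f}")
```

Windows that contain several moderately small p-values end up with a low combined value, even though no single variant in them stands out on its own.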

              Thursday, 29 September 2016

              New paper: SCOTTI Efficient reconstruction of transmission within outbreaks with the structured coalescent

              New paper published today in PLoS Computational Biology: Understanding how infectious disease spreads and where it originates is essential for devising policies to prevent and limit outbreaks. Whole genome sequencing of pathogens has proved an extremely promising tool for identifying transmission, particularly when combined with classical epidemiological data. Several statistical and computational approaches are available for exploiting genomics for epidemiological investigation. These methods have seen applications to dozens of outbreak studies. However, they have a number of serious drawbacks.

              In this new paper Nicola De Maio, Jessie Wu and I introduce SCOTTI, a method for quickly and accurately inferring who-infected-whom from genomic and epidemiological data. SCOTTI addresses very widespread, but generally neglected problems in joint epidemiological and genomic inference, notably the presence of non-sampled and undetected intermediate cases and within-host pathogen variation caused by microevolution. Using real examples and simulations, we show that these problems cause strong misleading effects on existing popular inference methods. SCOTTI is based on BASTA, our recent breakthrough method for phylogeographic inference, and offers new standards of accuracy, calibration, and computational efficiency. SCOTTI is distributed as an open source package within BEAST2.

              Friday, 23 September 2016

              Prize PhD Studentships available

              I am offering two PhD projects as part of the annual Nuffield Department of Medicine Prize Studentship competition:
              These are fully-funded, four-year awards open to outstanding students of any nationality. Applicants nominate three projects, in order of preference, from the available pool. For how to apply, click here. Only applications submitted through the online system will be considered, but interested applicants are welcome to contact me informally. The deadline for applications is noon, 6th January 2017.

              In addition to my projects, the Modernising Medical Microbiology project has announced the following PhD projects as part of the competition:

              Friday, 19 August 2016

              The Rsp virulence regulator: new review in Trends in Microbiology

              In the September issue of Trends in Microbiology, Mark Smeltzer casts the spotlight on the story of rsp, a virulence regulator in Staphylococcus aureus that evolves within infected patients and may play a role in disease.

              The new review covers recent work on the rsp gene including a series of papers that my collaborators and my group have contributed:
              Natural mutations in a Staphylococcus aureus virulence regulator attenuate cytotoxicity but permit bacteremia and abscess formation.
              Das, S., Lindemann, C., Young, B. C., Muller, J., Österreich, B., Ternette, N., Winkler, A.-C., Paprotka, K., Reinhardt, R., Förstner, K. U., Allen, E., Flaxman, A., Yamaguchi, Y., Rollier, C. S., Van Diemen, P., Blättner, S., Remmele, C. W., Selle, M., Dittrich, M., Müller, T., Vogel, J., Ohlsen, K., Crook, D., Massey, R., Wilson, D. J., Rudel, T., Wyllie, D. H., and M. J. Fraunholz (2016)
              Proceedings of the National Academy of Sciences USA 113: E3101–E3110. (abstract pdf)

              Evolutionary trade-offs underlie the multi-faceted virulence of Staphylococcus aureus.
              Laabei, M., Uhlemann, A.-C., Lowy, F. D., Austin, E. D., Yokoyama, M., Ouadi, K., Feil, E., Thorpe, H. A., Williams, B., Perkins, M., Peacock, S. J., Clarke, S. R., Dordel, J., Holden, M., Votintseva, A. A., Bowden, R., Crook, D. W., Young, B. C., Wilson, D. J., Recker, M. and R. C. Massey (2015)
              PLoS Biology 13: e1002229. (abstract pdf)

              Evolutionary dynamics of Staphylococcus aureus during progression from carriage to disease.
              Young, B. C., Golubchik, T., Batty, E. M., Fung, R., Larner-Svennson, H., Votintseva, A., Miller, R. R., Godwin, H., Knox, K., Everitt, R. G., Iqbal, Z., Rimmer, A. J., Cule, M., Ip C. L. C., Didelot, X., Harding, R. M., Donnelly, P. J., Peto, T. E., Crook, D. W., Bowden, R. and D. J. Wilson (2012)
              Proceedings of the National Academy of Sciences USA 109: 4550-4555. (abstract pdf F1000)

              Wednesday, 3 August 2016

              Lyre bird song

              Healesville Sanctuary, Yarra Valley, Victoria, Australia

              Wednesday, 29 June 2016

              Brits will support Remain by 2021 even if no-one changes their mind

              There has been much discussion as to the disconnect between parliamentary support for the EU and the public's vote to leave in the referendum. However, the referendum result was strongly influenced by a striking effect of age, with older voters anti-EU in contrast to a pro-EU trend among the young, and disproportionate over-representation of older voters at the polling booths. Because of this striking age structure, the UK is destined to become more pro-EU over the years even if no-one changes their mind, simply because of population turnover. Based on a very simple analysis (click here for the R code), the UK would lean towards preferring to remain in the EU within the next parliament. The UK would barely be extricated from the EU before it wanted to rejoin! Surely this is an overwhelming reason for MPs to take responsibility to vote down Brexit in the interest of future generations.

              Friday, 17 June 2016

              Collaborative PhD and postdoc positions available

              Dr Nicole Stoesser, Prof. Derrick Crook, colleagues in Oxford and I are seeking a postdoc in Microbial Genomics with statistics skills to join a new three-year project investigating antimicrobial resistance in environmental, human and animal reservoirs of E. coli and related organisms. The application deadline is noon Monday 11th July. For more details click here.

              Dr Pierre Mahe of bioMérieux in Grenoble, France, is seeking to fill an industry-linked PhD position developing statistical methods for genome-based characterization of antimicrobial resistance and virulence genes, with a focus on the opportunistic pathogen Pseudomonas aeruginosa. The position involves a secondment here in Oxford. For more details click here or contact Pierre Mahe.

              Tuesday, 17 May 2016

              New paper: How low-toxic Staph. aureus mutants cause severe infections

              Published today in PNAS Early Edition, our new paper reveals that naturally occurring mutations in the poorly-described rsp gene of Staph. aureus reduce toxicity while maintaining the ability to survive, proliferate and cause infection within the human body.

              In previous work, we have found that Staph. aureus evolves by mutation within the body quickly enough to influence the progression of disease, and that diversity generated by evolution in the body is a widespread phenomenon. In the case of one patient who we followed longitudinally for over a year, we identified that bacteria in the bloodstream differed from those in the nose by several mutations, of which a loss-of-function mutation in the rsp regulatory gene represented the most likely candidate for playing a possible role in causing severe infection.

              We collaborated with Ruth Massey at Bath who discovered to our surprise that while rsp loss-of-function mutants do indeed show differences in toxicity - one of several traditional correlates of virulence readily measured in the laboratory - they showed reduced toxicity. Going further, Ruth and her collaborators showed that bloodstream infections in general show reduced toxicity compared to milder skin infections and asymptomatically carried nose populations, overturning previous views on the relationship between Staph. aureus toxicity and virulence.

              Today's new paper offers a detailed dissection of rsp. Working with Claudia Lindemann and David Wyllie at the University of Oxford and Martin Fraunholz and collaborators at the University of Würzburg, we found that although rsp mutants show reduced toxicity, crucially they retain their capacity to survive, grow, spread through the body and cause abscesses. In other words, rsp uncouples toxicity from pathogenicity. This decoupling could be important for evading the immune system and establishing severe infections. To find out more, see the full paper.

              Tuesday, 12 April 2016

              Postdoctoral Scientist in Statistical Genomics

              We are recruiting for a Postdoctoral Scientist in Statistical Genomics working on Antimicrobial Resistance (AMR) gene discovery and focused on Tuberculosis. This will be a joint position at the University of Oxford between Derrick Crook's group and mine, and part of the large international CRyPTIC consortium.

              The role is for a population geneticist or statistical geneticist to develop and apply statistical methods, including genome-wide association studies, for discovering rare and common genetic variants underlying antimicrobial resistance in Mycobacterium tuberculosis.

              One third of the world's population - 2.5 billion people - are thought to be infected with tuberculosis (TB). This post offers an opportunity to work with global TB experts from five continents, statistical geneticists, clinicians, medical statisticians and software engineers; integrating statistical genetics, bioinformatics and machine learning methods with the aim of uncovering all genomic variants causing at least 1% resistance to first line anti-TB drugs.

              We're looking for candidates with a PhD in genomics, evolutionary biology, statistics or a related subject. The post is full-time and fixed-term for up to 3 years initially.

              The deadline for applications is noon on Friday 6th May 2016.

              Thursday, 7 April 2016

              Making the most of bacterial GWAS: new paper in Nature Microbiology

              In a new paper published this week in Nature Microbiology, we report the performance of genome wide association studies (GWAS) in bacteria to identify causal mechanisms of antibiotic resistance in four major pathogens, and introduce a new method, bugwas, to make the most of bacterial GWAS for traits under less strong selection.

              As explained by Sarah Earle, joint first author with Jessie Wu and Jane Charlesworth, the problem with GWAS in bacteria is strong population structure and the consequent strong coinheritance of genetic variants throughout the genome. This phenomenon - known as genome-wide linkage disequilibrium (LD) - comes about because exchange of genes is relatively infrequent in bacteria, which reproduce clonally, compared to organisms that exchange genes every generation through sexual reproduction.

              Genome-wide LD makes it difficult for GWAS to distinguish variants that causally influence a trait from other, coinherited variants that have no direct effect on the trait.

              In the case of antibiotic resistance - a trait of high importance to human health - bacteria are under extraordinary selection pressures because resistance is a matter of life and death, to them as well as their human host. This helps overcome coinheritance and pinpoint causal variants because antibiotic usage selects for the independent evolution of the same resistance-causing variants in different genetic backgrounds.

              Consequently, bacterial GWAS works very efficiently for antibiotic resistance: the variants most significantly associated with antibiotic resistance in 26 out of the 27 GWAS we performed were genuine resistance-conferring mutations. In the 27th we uncovered a putative novel mechanism of resistance to cefazolin in E. coli. These results for 17 antibiotics (ampicillin, cefazolin, cefuroxime, ceftriaxone, ciprofloxacin, erythromycin, ethambutol, fusidic acid, gentamicin, isoniazid, penicillin, pyrazinamide, methicillin, rifampicin, tetracycline, tobramycin and trimethoprim) across four species (E. coli, K. pneumoniae, M. tuberculosis and S. aureus) build on earlier work investigating beta-lactam resistance in S. pneumoniae, and convincingly demonstrate the potential for bacterial GWAS to discover new genes underlying important traits under strong selection.

              What about traits under less strong selection, which probably includes pretty much every other bacterial trait? We show in this context that coinheritance poses a major challenge, based on detailed simulations. Often it may not be possible to use GWAS to pinpoint individual variants responsible for different traits because they are coinherited with - possibly many - other uninvolved variants.

              But all is not lost. We show that even when individual locus-level effects cannot be pinpointed, there is often excellent power to characterize lineage-level differences in phenotype between strains. This is helpful for multiple reasons: (1) we often conceptualize trait variability in bacteria at the level of strain-to-strain differences; (2) these differences can be highly predictive; and (3) we can prioritize variants for functional follow-up based on their contribution to strain-level differences.

              These concepts represent a substantial departure from regular GWAS. In the human setting for instance, lineage-level differences are usually discarded as uninteresting or artefactual, and variants are almost always prioritized based on statistical evidence for involvement over-and-above any contribution to lineage-level differences. In the bacterial setting, we are forced to depart from these conventions because a large proportion of all genetic variation is strongly strain-stratified. To find out more, see the paper and try our methods.

              Wednesday, 30 March 2016

              CRyPTIC: rapid diagnosis of drug resistance in TB

              The Modernising Medical Microbiology consortium has announced a new worldwide collaboration called CRyPTIC to speed up diagnosis of antibiotic resistant tuberculosis (TB).

              TB infects nearly 10 million people each year and kills 1.5 million, making it one of the leading causes of death worldwide. Almost half a million people each year develop multidrug-resistant (MDR) TB, which defies common TB treatments. Time-consuming tests must be run to identify MDR-TB and to determine which drugs will work or fail. This delays diagnosis and creates uncertainty about the best drugs to prescribe to individual patients.

              CRyPTIC aims to hasten the identification of MDR-TB using whole genome sequencing to identify genetic variants that give resistance to particular drugs. The project is funded by a $2.2m grant from the Bill & Melinda Gates Foundation and a £4m grant from the Wellcome Trust and MRC Newton Fund.

              CRyPTIC aims to collect and analyse 100,000 TB cases from across the world, providing a database of MDR-TB that will underpin diagnosis using WGS. Samples from across Africa, Asia, Europe and the Americas will be collected by teams at more than a dozen centres. They will conduct drug resistance testing and much of the genome sequencing. Read more information here.

              Saturday, 5 March 2016

              Snow Monkeys in Japan

              Recently got back from the SMBE Satellite meeting on Pathogen Genomics in Japan. The organizers did a fantastic job and the talks were great. There was also time to visit the Japanese macaques at Snow Monkey Park, where one of the little guys climbed on to my shoulders.
              Thanks Ashlee Earl for the video and Koji Yahara, Alan McNally and Nick Croucher for additional commentary!

              Wednesday, 20 January 2016

              Nature Reviews Microbiology: Within-host evolution of bacterial pathogens

              Our new review of what genomics has taught us about Within-host evolution of bacterial pathogens has been published in Nature Reviews Microbiology.

              Friday, 9 October 2015

              PLoS Biology: Staphylococcus aureus invading the blood are less toxic

              Toxicity in nose, blood and skin bacteria.
              Collaborative work with Ruth Massey's group at the University of Bath taking forward a study of within-host evolution of Staphylococcus aureus during infection has been published in PLoS Biology. Previously we reported in PNAS that in one patient, bacteria causing a serious bloodstream infection differed by just eight mutations from a persistently carried nose population. We identified one of those mutations as playing a potentially causative role in transforming the nose bacteria into a form capable of bloodstream infection - a regulatory protein called rsp. To investigate further, Ruth applied a number of tests to characterize bacteria taken prior to and during infection. In this new paper, we report the surprising result that the bloodstream isolates show reduced toxicity and that rsp is responsible for this change.

              The notion that isolates responsible for serious human infection are less toxic challenges some long-held beliefs about the mechanism of disease in Staphylococcus aureus infections. Most models of disease assume a straightforward relationship between increased toxicity and greater virulence - the propensity to cause, or severity of, disease.

              To test her observation, Ruth collaborated with groups from New York and Cambridge to investigate whether the pattern observed in one patient held more generally across 134 Staphylococcus aureus belonging to the notorious USA300 strain. It did.

              Curiously, bacteria isolated from the skin and from superficial infections were as toxic as nose bacteria. These findings raise new questions about the role of toxicity in colonization, transmission and serious infections of Staphylococcus aureus. One possibility that we wish to investigate further is whether toxicity might be required for the usual transmission of Staphylococcus aureus populations in the nose, skin or superficial infections (such as impetigo), whereas loss of toxicity may promote transition to deep tissue and bloodstream infections by evading immune defences.

              Tuesday, 8 September 2015

              New paper: Rapid host switching in Campylobacter

              Our new open access paper Rapid host switching in generalist Campylobacter strains erodes the signal for tracing human infections was published last week in the ISME Journal.

              Figure from paper 
              With Bethany Dearlove, Sam Sheppard and colleagues, we investigated common strains of campylobacter, the most frequent cause of bacterial gastroenteritis worldwide. Campylobacter infection is associated with food poisoning, particularly contaminated chicken. But in previous work, we found that certain strains (the ST-21, ST-45 and ST-828 complexes) are often found contaminating a range of meat and poultry, making it difficult to trace the source of human infection.

              That previous work was based on partial genome sequencing known as MLST. In MLST, less than 1% of the information in the genome is captured. Now that whole genome sequencing is available, the expectation was that we should be able to distinguish easily between ST-21, 45 and 828 strains contaminating poultry versus beef versus lamb, and so on.

              What we found was surprising. Instead of these strains harbouring previously unobserved sub-structure that allowed them to be associated with different animal sources, we found rapidly mixing populations undergoing extremely fast transmission between animal species, with campylobacter strains ricocheting among animal species on a timescale of just a few years. This is faster than they can accumulate enough mutations to differentiate populations colonizing different animal species.

              Our results present an unforeseen roadblock to tracing transmission with whole genome sequencing, and suggest these strains are adapted to a generalist lifestyle, shedding new light on the ecology of this pathogen. These findings push back against the tide of opinion that whole genome sequencing is necessarily a panacea for detecting transmission, and demonstrate that going forwards, a detailed understanding of the biology of zoonotic bacteria (those transmitting between multiple species) and intensive sampling of potential sources are essential for effectively tracing the source of human infection.

              Monday, 17 August 2015

              BASTA: Improved method for phylogeography

              This week sees publication of our paper New Routes to Phylogeography: a Bayesian Structured Coalescent Approximation in PLoS Genetics.

              Phylogeography is the recovery of migration history from genome sequences, and has exploded as a field in recent years. Over a thousand papers have used contemporary sequences and ancient DNA to reconstruct migratory trends, locate the origin of outbreaks and track the spread of infectious diseases. In many high profile examples phylogeography has informed our understanding of how major human pathogens spread.

              In our new paper we solve a severe and apparently widely unappreciated problem: that the most popular approaches to phylogeography are heavily biased, extremely sensitive to sampling structure and substantially underestimate statistical uncertainty. The problems stem from the treatment of migration as equivalent to mutation (discrete trait analysis; DTA), and the assumption that sampling locations are phylogeographically informative.

              To solve these problems we introduce and demonstrate a new method BASTA, implemented in the phylogenetic software package BEAST2, that employs a novel approximation to enable inference under the structured coalescent – the bottom-up population genetics model of migration. Previously, methods for exact inference under the structured coalescent have proven too slow for many practical purposes, hence the need for a fast and accurate approximation.

              The biases we highlight with popular phylogeography methods are much more important than might appear from what is at one level a question of model choice. To underline this, we present an analysis of around 100 Ebola virus genome sequences to investigate the emergence of human outbreaks. Epidemiological studies have found that animals act as a reservoir, maintaining the virus between the sporadic human outbreaks that have unfolded over the past four decades, a scenario that our structured coalescent-based model correctly identifies.

              Remarkably, DTA, the de facto standard method for phylogeography, wrongly concluded with high confidence that Ebola has been maintained since 1976 by undetected human-to-human transmission between outbreaks. Although such a conclusion would never be believed in the case of Ebola, it makes clear the potential for highly misleading inference about transmission that could, for much less well understood diseases, have serious implications for public health policy.

              BASTA is the result of a lot of hard work by Nicola De Maio, who is a James Martin Fellow at the Oxford Martin School Institute for Emerging Infections, with help from Jessie Wu and Kathleen O'Reilly. You can read the paper here and download BASTA here.

              Friday, 24 July 2015

              New Journal: Microbial Genomics

              This week sees the launch of Microbial Genomics, a new open access journal from the Society for General Microbiology. Here's an excerpt from the journal's mission statement:

              "Microbial Genomics (MGen) publishes high quality, original research on archaea, bacteria, microbial eukaryotes and viruses. MGen welcomes papers that use genomic approaches to understand microbial evolution, population genomics and phylogeography, outbreaks and epidemiological investigations, impact of climate or changing niche, metagenomic and whole transcriptome studies, and bioinformatic analysis covering the breadth of microbiology, from clinically important pathogens to microbial life in diverse ecosystems."

              The journal, whose tag line is Bases to Biology, will publish microbiological discoveries and innovations in research methods and bioinformatics. The journal is headed by renowned Wellcome Trust Sanger Institute scientists Stephen Bentley and Nicholas Thomson with an impressive editorial board that I joined earlier this year. Article processing charges have been waived during the journal's launch year - so get in there fast!

              Tuesday, 21 July 2015

              Resistance is Futile: Science Museum Lates and Cheltenham Science Festival

              Some photos from this summer's Science Museum Lates event with the Royal Society and the Cheltenham Science Festival. Thanks to everyone who helped: Liz Batty, Phelim Bradley, Jane Charlesworth, Dilly De Silva, Sarah Earle, Nicki Fawcett, Jess Hedge, Brian Mackenwells, Amy Mason, Charvy Narain, Anna Sheppard and Jessie Wu!

              Science Museum Lates: The next big thing

              We had two activities. Dance Dance Evolution is a computer game which uses an adapted dance-dance mat with four squares representing bases in the DNA (A, C, G and T). Participants act as the DNA replicator, and mistakes cause mutations in the DNA sequence. The next dancer copies the sequence left by the previous dancer, demonstrating evolution by mutation over time. The game shows the percentage similarity of the current sequence to the original sequence, showing the amount of 'evolution' over the time period of the game. We discussed with visitors the relevance of this to the development of antibiotic resistance.
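
              The copying-with-mistakes idea behind the game can be sketched in a few lines of Python. This is only an illustration, not the code that ran the dance mat: the function names, the starting sequence and the error rate are all made up for the example.

```python
import random

BASES = "ACGT"


def copy_with_errors(template, error_rate, rng):
    """Copy a DNA sequence, swapping each base for a different one at the given rate."""
    copied = []
    for base in template:
        if rng.random() < error_rate:
            # A copying mistake: pick any of the three other bases.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def percent_similarity(seq_a, seq_b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def play(original, n_dancers, error_rate, seed=0):
    """Each dancer copies the sequence left by the previous dancer."""
    rng = random.Random(seed)
    current = original
    history = []
    for _ in range(n_dancers):
        current = copy_with_errors(current, error_rate, rng)
        # Similarity is always measured against the very first sequence.
        history.append(percent_similarity(original, current))
    return history


if __name__ == "__main__":
    original = "ACGTACGTACGTACGTACGT"  # hypothetical starting sequence
    scores = play(original, n_dancers=5, error_rate=0.1)
    for dancer, similarity in enumerate(scores, start=1):
        print(f"dancer {dancer}: {similarity:.0f}% identical to the original")
```

              Because each dancer copies the previous dancer's sequence rather than the original, mistakes are inherited and tend to accumulate from one round to the next.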

              Wednesday, 8 April 2015

              World Health Day: Food-borne disease theme

              For World Health Day 2015, the group's research into food-borne campylobacter infection was featured on the Nuffield Department of Medicine's home page. The piece features recent work Bethany Dearlove and I have conducted into zoonotic (animal-human) transmission with Sam Sheppard. The paper is currently under review, and a preprint can be downloaded from the website.

              Tuesday, 31 March 2015

              ClonalFrameML: accounting for recombination in bacterial phylogenies

              Horizontal gene transfer in bacteria, mediated by transformation, transduction or conjugation, can result in gain, loss and replacement of genes. The replacement of horizontally transferred genes or gene fragments in a process known as homologous recombination has far-reaching effects on bacterial phylogenetics - the study of relatedness between bacteria. A new method published by Xavier Didelot and me last month in PLoS Computational Biology corrects for these distorting effects of homologous recombination on bacterial phylogenies.

              Two forms of phylogenetic distortion are caused by recombination. The first affects the shape of the tree topology. Although this is a potentially serious difficulty, Jessica Hedge and I recently showed that phylogenies estimated from whole bacterial genomes are surprisingly robust to this problem. The second affects the lengths of the branches. When genetic material is replaced by a homologous but distantly related sequence, it gives the appearance of a cluster of substitutions in the genome, and this can exaggerate branch lengths. ClonalFrameML detects these clusters of substitutions, identifies them as recombination events, and corrects the branch lengths of the tree.
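
              To give a feel for what a "cluster of substitutions" looks like in practice, here is a deliberately naive sliding-window scan in Python. It is not the ClonalFrameML algorithm, which is based on an explicit evolutionary model; the function name, window size and threshold are hypothetical choices for illustration only.

```python
def find_substitution_clusters(positions, window=1000, min_count=5):
    """Return (start, end, count) for runs of substitutions denser than a threshold.

    positions : genome coordinates of substitutions on one branch
    window    : window length in bases (hypothetical default)
    min_count : substitutions within one window needed to call a cluster (hypothetical)
    """
    positions = sorted(positions)
    clusters = []
    left = 0
    for right, pos in enumerate(positions):
        # Advance the left edge until the window spans fewer than `window` bases.
        while pos - positions[left] >= window:
            left += 1
        count = right - left + 1
        if count >= min_count:
            start, end = positions[left], pos
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous cluster: extend it and recount its members.
                prev_start = clusters[-1][0]
                members = [p for p in positions if prev_start <= p <= end]
                clusters[-1] = (prev_start, end, len(members))
            else:
                clusters.append((start, end, count))
    return clusters


if __name__ == "__main__":
    # Two isolated substitutions plus a dense run of six near position 50,000.
    branch = [100, 50_000, 50_100, 50_200, 50_300, 50_400, 50_450, 120_000]
    print(find_substitution_clusters(branch))
```

              A scan like this only flags dense regions; deciding that a region is a recombination import, and shortening the branch accordingly, is where a model-based method is needed.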

              Correcting for recombination is important in a variety of settings. In transmission studies, recent transmission between patients can be detected by comparing the genomes of the infecting bacteria. As we show in the paper, ClonalFrameML improves detection of transmission events by accounting for the tendency of recombination to elevate the evolutionary distance between genomes. We also report the discovery of a remarkably large chromosomal replacement event spanning 310 kilobases that may have led to the evolution of the ST582 strain of Staphylococcus aureus, underlining the importance of recombination over short and long timescales.

              ClonalFrameML is a much faster implementation of the popular ClonalFrame method by Xavier and Daniel Falush. It is based on the same underlying assumptions and the same explicit evolutionary model, so it provides interpretable estimates of rates of recombination, the length of DNA imported by recombination, and the relative impact of recombination versus mutation. However, it can now analyse thousands of whole bacterial genomes in a matter of hours, representing a substantial improvement over the earlier method.

              Friday, 28 November 2014

              New paper: bacterial phylogenetic inference is robust to recombination but demographic inference is not

              Published this week in mBio, Jessica Hedge's new paper "Bacterial phylogenetic inference is robust to recombination but demographic inference is not" looks at a long-standing problem: why are phylogenetic trees so popular in bacterial genomics when everyone knows recombination (which is detectable in most species studied) leads to seriously misleading inference? A burst of research activity in the early 2000s showed that homologous recombination - which can result from various forms of horizontal gene transfer in bacteria - can distort phylogenetic trees and lead to false inference of positive selection and demographic growth in methods that rely on them.

              In the intervening years there has been intense research in the field of population genetics into approaches that account for recombination, although the practically useful methods rely on approximations because of the inherent difficulties of learning about the complex reticulated evolutionary networks that recombination generates. This has led many of my population genetics colleagues to regard - at least privately - the use of phylogenetic trees in recombining species as "bust", and the conclusions drawn from such studies as questionable. In this paper we show that this view is too simple.

              Friday, 6 June 2014

              Cheltenham Science Festival

              Earlier this week members of the group represented the Nuffield Department of Medicine at the Cheltenham Science Festival with our Modernising Medical Microbiology stall, featuring the Antibiotic Resistance Coconut Shy and the Genome Evolution Dance Mat.

              Antibiotic Resistance Coconut Shy
              Antibiotic Resistance Coconut Shy: The children (and adults) visiting the stall were given five bean bags (antibiotics) to throw at the coconuts (bacterial pathogens) to try to knock them off. The front row of coconuts, representing bacteria more susceptible to antibiotics, were easier to knock off than the back row, which represented more resistant bacteria. The aim was to show the children that an unwanted side effect of using antibiotics is to increase the frequency of resistant bacteria, because they were usually the ones left standing.

              The game was more difficult than it looked, and just one visitor knocked off all five coconuts. We gave out NDM pens to the sixty visitors who managed to knock off three or more.

              Microscope and Top Trumps
              Digital Microscope: We brought along a light microscope to show the children what bacteria really look like, which helps emphasize how small they are since they are difficult to see even under the highest magnification. We prepared slides for several Gram positive and Gram negative species, and provided a key to help identify them. We also brought along a number of games that have been used in previous departmental outreach activities, including Pathogen Top Trumps and Fact or Fiction.

              Genome Evolution Dance Mat
              Genome Evolution Dance Mat: In this game, the children had to copy a bacterial DNA sequence by replicating a sequence of dance moves (up=A, left=C, right=G, down=T) without introducing new errors (mutations). Any mutations that were introduced were passed on to the next template sequence. In this way we aimed to show how mutations occur by errors in DNA replication, and that they are inherited. This generates unique DNA fingerprints for bacteria, which we can use to track the spread of outbreaks.

              Outbreak Map
              The game, which was kindly programmed by Gareth Jenkin-Jones, included a form of natural selection, so that if too many errors were introduced at once, the sequence was considered inviable and did not survive to be passed on. There was also a speed control, which was handy since some people appear to have spent a lot more of their youth playing dance mats than others.

              Outbreak Map: We made an Outbreak Map to show the reach of our stall over the day, with visitors that scored highly on the coconut shy pushing in pins to show where they had travelled from. Had we been handing out germs instead of pens, we could have started outbreaks as far afield as Edinburgh, France and Spain, as well as a large cluster in Cheltenham and the surrounding counties.

              Other research groups are representing the department throughout the week.

              NDM Microbiology Stall at the Cheltenham Science Festival (L-R): Sarah Earle, Louise Pankhurst, Danny Wilson, Liz Batty, Dilrini De Silva, Jess Hedge, Catrin Moore. Amy Mason, Gareth Jenkin-Jones and Jane Charlesworth also helped with the preparations, and Jen Bardsley co-ordinated all the NDM Stalls.

              New paper: Mobile elements drive recombination hotspots in the core genome of Staphylococcus aureus

              Published this week in Nature Communications, our new open access paper looks at what drives variability in rates of recombination (horizontal gene transfer, HGT) in the core genome of Staphylococcus aureus. HGT in the core genome is important for eliminating harmful mutations and promoting the spread of beneficial mutations, such as those that make the bacteria resistant to antibiotics.

              Compared to recent work focusing on individual, highly-related strains of S. aureus, we found much higher rates of core HGT across the species as a whole. We saw that the frequency of HGT varies along the genome. At broad scales, core HGT is higher near the origin of replication, a pattern reminiscent of the one described by Eduardo Rocha and colleagues in E. coli, who hypothesized that the over-abundance of DNA near the origin during rapid growth could promote HGT.

              At fine scales, we found more frequent HGT in regions of the core genome close to mobile elements. The hottest regions occurred near mobile regions called ICE6013, SCC and genomic island α. The insertion and excision of mobile elements from the genome represent a type of HGT, so our finding that nearby core regions also experience more HGT suggests there is some sort of "spill over". This idea is supported by work in Ashley Robinson's group that found similarities between ICE6013 and a class of mobile elements in Streptococcus agalactiae called TnGBS2. TnGBS2 was discovered by Philippe Glaser's lab who showed it sometimes transfers large tracts of adjacent core material during conjugation.

              Whether conjugation alone can explain the high levels of core HGT we saw in S. aureus is unclear - our results suggest there is detectable HGT even in core regions far from mobile elements. Transformation is another possible mechanism of core HGT, but S. aureus is generally thought to be naturally incapable of transformation. However, intriguing work published by Tarek Msadek and colleagues in 2012 indicates there may be cryptic mechanisms of transformation in S. aureus after all. It remains to be seen whether the relative contributions of transformation, transduction and conjugation to the long-term evolution of S. aureus can be disentangled.

              Tuesday, 1 October 2013

              The role of hospital transmission in Clostridium difficile infection

              This week the Modernising Medical Microbiology consortium at Oxford published the findings of a six-year study into the transmission of the hospital "superbug" Clostridium difficile. The research, which appears in the New England Journal of Medicine, shows that the majority of new cases cannot be traced to other infections in hospital, and indicates instead that there must be a large, as yet unidentified, reservoir of C. difficile infectious to humans. This finding is important because it suggests that there is a limit to the further reductions in C. difficile infection that more and more intense hospital cleaning - important though it has been - can achieve.

              The research, which is the result of a tireless effort by a large number of my colleagues - notably David Eyre, Tim Peto and Sarah Walker - used bacterial whole genome sequencing to detect within-hospital transmission by searching for extremely closely related bacterial strains among more than 1200 cases of C. difficile infection that occurred in Oxfordshire between September 2007 and March 2011. The consortium is currently developing the approach for routine microbiology diagnostics and infection control, with a view to eventual roll-out across the NHS.
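
              The basic idea of searching for extremely closely related genomes can be sketched as a pairwise comparison. The Python below is a simplified illustration under stated assumptions, not the study's pipeline: the case IDs, the toy aligned sequences and the `max_snps` cut-off are hypothetical, and a real analysis would also draw on epidemiological information such as when and where patients were in hospital.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count differing positions between two aligned genomes, ignoring ambiguous 'N' calls."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b) if a != "N" and b != "N")


def candidate_transmission_pairs(genomes, max_snps=2):
    """Return (case_a, case_b, distance) for pairs differing by at most `max_snps` SNPs.

    genomes  : dict mapping a case ID to its aligned sequence
    max_snps : hypothetical cut-off for "extremely closely related"
    """
    pairs = []
    for (id_a, seq_a), (id_b, seq_b) in combinations(sorted(genomes.items()), 2):
        distance = snp_distance(seq_a, seq_b)
        if distance <= max_snps:
            pairs.append((id_a, id_b, distance))
    return pairs


if __name__ == "__main__":
    genomes = {
        "case1": "ACGTACGTAC",
        "case2": "ACGTACGTAT",  # one difference from case1
        "case3": "TTTTACGGGG",  # distantly related to both
    }
    print(candidate_transmission_pairs(genomes))
```

              Cases left without any close genetic match are the ones that cannot be traced to another sampled infection.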

              Friday, 20 September 2013

              Postdoctoral Position in Statistical Genomics

              The position of Postdoctoral Scientist is available in my group to lead research on the Wellcome Trust and Royal Society funded project Statistical Methods for Whole Genome Phenotype Mapping in Bacterial Populations.

              Bacteria cause disease throughout the world. Different strains vary in disease severity, but the genetic variants responsible remain largely undiscovered. Recent breakthroughs in whole genome sequencing provide new opportunities for discovery, but the lack of statistical analysis tools tailored to the special structure of bacterial populations presents a roadblock. The goal of the project is to develop an analysis framework for mapping genes underlying naturally variable traits in bacterial populations. Focusing on the hospital-associated pathogens Staphylococcus aureus and Clostridium difficile, we will investigate the role of bacterial variants on disease severity.

              The role of the Postdoctoral Scientist is to develop novel statistical methods for analysing genotype-phenotype associations in bacteria at the whole genome level. The successful candidate will write software implementing the statistical methods and apply them to design and carry out investigations into the genetic basis of virulence in natural populations of bacterial pathogens. The ideal candidate would be a recent PhD graduate with experience of statistical genetics and computer programming, with evidence of publicly released software. Experience of population genetics or microbiology would be advantageous but is not essential.

              The post is available immediately for up to 3 years in the first instance. For more details on this position, including salary, job description, selection criteria and how to apply, please see the University of Oxford recruitment page.

              Applications for this vacancy are to be made online. The closing date is 12.00 noon on Monday 4 November 2013. Applicants will be asked to upload a CV and a supporting statement as part of the online application. For informal enquiries, please email me. More information about the group's research is available here.

              Tuesday, 17 September 2013

              Sir Henry Dale Fellowship

              I am pleased to report that I have been awarded a Wellcome Trust and Royal Society funded Sir Henry Dale Fellowship. The subject of the fellowship, to be held in the Nuffield Department of Medicine at the University of Oxford, is Statistical Methods for Whole Genome Phenotype Mapping in Bacterial Populations.

              The project addresses the question of how to detect genes or mutations in bacteria responsible for variability in important traits such as the tendency to cause human disease. Focusing on the hospital-associated pathogens Staphylococcus aureus and Clostridium difficile, the project has the potential to help identify genetic variants that explain why some bacteria cause more severe infections, knowledge that could help develop new drugs and tests that improve patient treatment.

              The fellowship runs for five years, and includes support for a postdoctoral research assistant and laboratory costs. I will be advertising a position shortly. If you are interested, please get in touch.

              I want to thank the funders and reviewers for supporting this project, and my colleagues who helped me write and re-write the research proposal.

              Thursday, 6 June 2013

              Detecting mixed strain infections with whole genome sequencing

              Whole genome sequencing in near-to-real time is set to become a routine tool for outbreak detection by hospital and public health microbiology labs, following successful pilot studies in the UK last year. Typically, the bacteria are cultured from a clinical sample, and a single colony is picked for sequencing. Since a bacterial colony grows from a single cell, this procedure ensures that all the cells picked for sequencing are genetically identical, and this in turn helps piece the genome back together again following sequencing.

              But it exposes the system to a flaw. What would happen if a patient sick with two strains transmitted one, but not the other, to a second patient? Characterizing the genome of just one of the strains in the first patient risks missing the transmission event entirely, because the "wrong" strain might have been sequenced.

              One safeguard would be to sequence multiple bacterial colonies per sample, three for example. But this would increase the cost of routine surveillance three-fold.

              In a new paper published this month in PLoS Computational Biology, with David Eyre, Madeleine Cule, Sarah Walker and others, we have investigated an alternative solution, whereby a large number of colonies are sequenced together. The cost is the same as that of sequencing a single colony. But the downstream bioinformatics analysis is complicated considerably by the presence of multiple strains. To cope with this, we developed a new computational method that reconstructs the identities of the multiple strains, using a panel of reference genomes to help where possible.
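
              As a rough illustration of why pooled sequencing carries a signal of mixture at all, the Python sketch below flags a sample when several well-covered sites show a substantial second allele. This is a crude screen, not the method in the paper, which goes further and reconstructs the identities of the strains; the function names and all three thresholds are hypothetical.

```python
def minor_allele_fraction(base_counts):
    """Fraction of reads at a site that do not carry the most common base."""
    total = sum(base_counts.values())
    if total == 0:
        return 0.0
    return 1.0 - max(base_counts.values()) / total


def looks_mixed(sites, min_fraction=0.1, min_depth=20, min_sites=3):
    """Flag a sample as possibly mixed if enough well-covered sites show two alleles.

    sites        : list of dicts mapping base -> read count, one dict per site
    min_fraction : minor allele fraction treated as more than sequencing error (hypothetical)
    min_depth    : minimum read depth for a site to be considered (hypothetical)
    min_sites    : number of supporting sites required (hypothetical)
    """
    supporting = 0
    for base_counts in sites:
        depth = sum(base_counts.values())
        if depth >= min_depth and minor_allele_fraction(base_counts) >= min_fraction:
            supporting += 1
    return supporting >= min_sites


if __name__ == "__main__":
    single_strain = [{"A": 50}, {"C": 49, "T": 1}, {"G": 60}]
    two_strains = [{"A": 35, "G": 15}, {"C": 30, "T": 20}, {"G": 40, "A": 10}, {"T": 50}]
    print(looks_mixed(single_strain), looks_mixed(two_strains))
```

              In a single-strain sample nearly all reads at a site agree, so only occasional sequencing errors contribute a second base; in a mixed sample, sites that differ between the strains show both bases in proportion to the strains' abundance.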

              By applying the approach to 26 clinical samples of Clostridium difficile hospital infections with known epidemiological relationships, we detected four mixed strain infections, one of which revealed a previously undetected transmission event within the hospital. For full details, read the open access paper.

              Wednesday, 22 May 2013

              Within-host evolution of Staphylococcus aureus during asymptomatic carriage

              Given its notoriety as one of the world's major causes of infection-related deaths, it may come as a surprise that one in three healthy adults carry the human pathogen Staphylococcus aureus in their noses without adverse effects. Indeed, most people carry the bacteria at some point in their lives. So carriage must be seen as the normal state of affairs in the human-S. aureus interaction, and by understanding this state better we can improve our understanding of why, in some people, the bacteria go on to cause life-threatening invasive disease.

              This month sees publication of an investigation by my colleagues and me into the evolution of S. aureus during this normal healthy carriage state. The carriers in our study harboured populations of the bacteria that were very closely related but typically not identical, implying that the bacteria had evolved within the human body. The nose appears to be a microcosm of evolution for S. aureus, showing all the different types of genetic variation known at the species level within the noses of these individual carriers. For the most part, within-host evolution of the bacteria was very conservative, but certain proteins expressed on the surface of the bacteria and toxins secreted by the bacteria showed evidence of involvement in a host-pathogen arms race.

              The paper, whose lead authors include Tanya Golubchik, Liz Batty, Derrick Crook and Rory Bowden, has received coverage on the EveryONE blog and F1000. I liked Gerald Pier's conclusion, made on the post-publication peer review website: "Given that about 30% of the world's seven billion-plus humans, and an unknown number of animals, are chronically colonized with S. aureus, the tremendous opportunity provided to this organism for generating genetic variation to counteract human efforts to prevent S. aureus infections may be one of the most formidable barriers to overcome in order to develop vaccines and highly effective interventions to lessen the impact of this organism on human and animal health."

              Monday, 4 February 2013

              Coalescent inference for infectious disease

              Today my student Bethany Dearlove has her first paper published, called Coalescent inference for infectious disease: meta-analysis of hepatitis C. In this paper, published in Philosophical Transactions of the Royal Society B, we have developed coalescent-based population genetics methods for popular, deterministic, epidemiological models known as SI (susceptible-infectious), SIS (susceptible-infectious-susceptible) and SIR (susceptible-infectious-recovered). By implementing these methods in BEAST, we were able to re-analyse previously published hepatitis C virus datasets and directly estimate epidemiological parameters. Our results show that, in the absence of co-infection, the widely used exponential growth and logistic growth models of changing population size correspond directly to SI and SIS dynamics. We were also able to examine the limitations of genetic approaches to reconstructing epidemiological dynamics.
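As a rough illustration of why logistic growth and SIS dynamics line up, here is a minimal sketch. It assumes the standard textbook deterministic SIS model in a closed population of size N, with transmission rate beta and recovery rate gamma. These names, the parameter values and the function names are illustrative assumptions and are not taken from the paper, whose parameterisation may differ. Under that assumption the number infected follows dI/dt = beta I (N - I)/N - gamma I, which rearranges to the logistic equation with growth rate r = beta - gamma and carrying capacity K = N (1 - gamma/beta). The code integrates the SIS equation numerically and compares the result with the closed-form logistic curve.

```python
import math


def sis_derivative(i, beta, gamma, n):
    """Rate of change of the number infected under the assumed SIS model."""
    return beta * i * (n - i) / n - gamma * i


def integrate_sis(i0, beta, gamma, n, t_end, steps=10000):
    """Integrate the SIS equation from time 0 to t_end with fixed-step RK4."""
    dt = t_end / steps
    i = i0
    for _ in range(steps):
        k1 = sis_derivative(i, beta, gamma, n)
        k2 = sis_derivative(i + 0.5 * dt * k1, beta, gamma, n)
        k3 = sis_derivative(i + 0.5 * dt * k2, beta, gamma, n)
        k4 = sis_derivative(i + dt * k3, beta, gamma, n)
        i += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return i


def logistic(t, i0, r, k):
    """Closed-form logistic growth curve with growth rate r and capacity k."""
    return k / (1 + (k - i0) / i0 * math.exp(-r * t))


# Illustrative (made-up) parameter values.
beta, gamma, n, i0 = 0.5, 0.2, 1000.0, 1.0
r = beta - gamma              # logistic growth rate implied by SIS
k = n * (1 - gamma / beta)    # logistic carrying capacity implied by SIS

for t in (5.0, 20.0, 50.0):
    print(t, integrate_sis(i0, beta, gamma, n, t), logistic(t, i0, r, k))
```

The two columns printed for each time point should agree to within numerical integration error, which is the sense in which a logistic demographic model can be read as SIS dynamics under these assumptions.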

              This paper appears as part of an issue on Next-generation molecular and evolutionary epidemiology of infectious disease, which accompanies a Royal Society discussion meeting organized by Oli Pybus, Christophe Fraser and Andrew Rambaut. The Royal Society has made audio recordings of the talks at this meeting, and the accompanying satellite meeting, available online, including my talk on Bethany's paper.

              Thursday, 15 November 2012

              Postdoctoral Positions in Pathogen Genomics

              These positions are now closed. There are currently seven posts advertised to join the Pathogen Genomics group at the Nuffield Department of Medicine in Oxford. Prof Derrick Crook and colleagues are seeking exceptional, creative, quantitatively minded scientists to join a multidisciplinary team of researchers using population genomics to understand the evolution and transmission of human pathogens. We are seeking to appoint a number of promising young researchers to extend our existing strengths in the areas of phylogenomics, statistical genetics and bioinformatics.

              The group is studying a range of bacterial and viral pathogens including tuberculosis, Staphylococcus aureus, Clostridium difficile, HIV, norovirus and hepatitis C virus. Our research interests include within-host evolution, the genetic basis of virulence, transmission dynamics and outbreak investigation via real-time genomics.

              A major translational goal of the project is to exploit the transformative effect of population genomics on bacteriology to improve routine clinical practice in public health and microbiology laboratories.

              The research is supported by the UKCRC Modernising Medical Microbiology Consortium, the Health Innovation Challenge Fund, the NHS National Institute for Health Research, the Oxford Biomedical Research Centre, Institut Merieux and the Oxford Martin School, and pursued in collaboration with clinical colleagues in Leeds, Birmingham and Brighton, the Health Protection Agency and the WTSI.
              The deadline for applications varies by position, falling between 26 and 28 November 2012.

              Monday, 5 November 2012

              James Martin Fellowship

              This position is now closed. A prestigious James Martin Fellowship, funded by the Oxford Martin School, is available in my research group for a highly motivated and creative population geneticist interested in developing cutting-edge methods for the analysis of high-throughput whole-genome sequencing data, with the aim of better understanding the evolution and epidemiology of the major pathogens HIV and hepatitis C virus.

              The position, which is part of the Curing Chronic Viral Infections project, is fully funded for three years and is affiliated with the Institute for Emerging Infections, the Modernising Medical Microbiology consortium, the Peter Medawar Building for Pathogen Research and the Nuffield Department of Medicine. The ideal candidate will have a track record in statistical or computational genetics and experience of programming in a language such as C++ or Java.

              Full details can be found on the University of Oxford Recruitment website. Please send informal enquiries, with a CV, to me by email. The deadline for applications is 12 noon on 27th November 2012.