Tuesday, 7 December 2021

Two new positions: Senior Statistical Geneticist and Bioinformatician

Two new positions are available in my Infectious Disease Genomics group at the Big Data Institute, University of Oxford.

A Senior Postdoctoral Statistical Geneticist to jointly lead the implementation, design and application of new statistical tools for genome-wide association studies, lead the biological interpretation of key findings, develop methodologies and supervise junior group members. This post would suit a candidate with a PhD and relevant post-doctoral experience including direct experience in statistical genetics. Candidates without post-doctoral experience may be considered for a less senior appointment.

A Bioinformatician to provide expertise for computationally intensive analyses including genome-wide association studies and RNAseq studies of differential gene expression, as well as contributing to informatics projects as part of a wider collaboration with national biomedical cohorts. This post would suit a candidate with either a post-graduate degree related to Bioinformatics, Statistics or Computing, or equivalent experience in industry.

The application deadline for both posts is Noon GMT on Friday 7th January 2022.

New paper: Machine learning to predict the source of campylobacteriosis using whole genome data

This study, published in October in PLOS Genetics, brings together machine learning, large bacterial isolate collections and whole genome sequencing to address the general problem of how to trace the source of human infections.

Specifically, we investigated campylobacteriosis, a common infection of animal origin causing ~1.5 million cases of gastroenteritis and 10,000 hospitalizations every year in the United States alone. We show that our combined machine learning/genomics analyses:

  • Improve the accuracy with which infections can be traced back to farm reservoirs.
  • Identify evolutionary shifts in bacterial affinity for livestock host species.
  • Detect changes in human infection capability within related strains.

These results will improve understanding not only of Campylobacter but of bacterial pathogens more generally, as these technologies can readily be applied to other important species.

This paper builds on previous work published by the group, including our well-cited Tracing the source of campylobacteriosis (Wilson et al. 2008, PLOS Genetics 4:e1000203). The use of these methods for tracing infection has influenced public health policy and contributed to reducing disease burden.

This work demonstrates the potential for modern genomics and artificial intelligence approaches to address common and serious problems that affect our everyday lives. The awareness of the importance of infection to society has rarely been higher than in 2021, and while the current pandemic imposes an acute global problem, other infections continue to present long-term threats to health and productivity.

This work was led by Nicolas Arning, in collaboration with David Clifton and Sam Sheppard.

New paper: Antimicrobial resistance determinants are associated with Staphylococcus aureus bacteraemia and adaptation to the healthcare environment

Staphylococcus aureus is a leading cause of infectious disease deaths in all countries, with bloodstream infection leading to sepsis a major concern. This new study, published in November in Microbial Genomics, reports genes and genetic variants in Staph. aureus associated with severe disease vs asymptomatic carriage, and with healthcare vs community carriage.

Our genome-wide association study of 2000 bacterial genomes showed that antibiotic resistance in Staph. aureus is associated with severe disease and the hospital environment:

  • A mutation conferring trimethoprim resistance (dfrB F99Y) and the presence of a gene conferring methicillin resistance (mecA) were both associated with bloodstream infection vs asymptomatic nose carriage.
  • Separately, we demonstrated that a mutation conferring fluoroquinolone resistance (gyrA L84S) and variation in a gene involved in resistance to multiple antibiotics (prsA) were preferentially associated with healthcare-associated carriage vs community-acquired carriage.

The implication – that antibiotic resistance genes may provide survival advantages which mechanistically contribute to the development of disease – is important in the face of the continued global rise of antibiotic resistance.

We were also able to shed light on a controversy as to whether different strains of Staph. aureus differ in their propensity to cause severe disease. Interest in this question dates back decades in the literature, and contradictory studies, often based on modest sample sizes, have reached different conclusions. Our comparatively large study, using a whole-genome method that we previously published in Nature Microbiology, found that all strains of Staph. aureus are equally likely to cause severe disease vs asymptomatic carriage.

New paper: Genome-wide association studies reveal the role of polymorphisms affecting factor H binding protein expression in host invasion by Neisseria meningitidis

In this paper, published in October in PLOS Pathogens, we discovered a novel genetic association between life-threatening invasive meningococcal disease (IMD) and bacterial genetic variation in factor H binding protein (fHbp) through two bacterial genome-wide association studies (GWAS), which we validated experimentally. This was a collaboration with the groups of Chris Tang and Martin Maiden, with the work in my group led by Sarah Earle.

fHbp is an important component of meningococcal vaccines that directly interacts with human complement factor H (CFH). Intriguingly, our discovery that bacterial genetic variation in fHbp associates with increased virulence mirrors an earlier discovery that human genetic variation in CFH associates with increased susceptibility to IMD (Nature Genetics 42: 772).

Our experiments showed that the fHbp risk allele increased expression. Interestingly, increased susceptibility to IMD has been previously associated with elevated CFH expression. Therefore over-expression of either fHbp by the bacterium or CFH by the host appears to increase the risk of IMD. Since complement evasion is necessary for pathogenesis, these insights offer new leads for improving treatment.

Key results from the paper:

  • A GWAS for IMD in 261 meningococci from the Czech Republic highlighted a highly polygenic architecture of meningococcal virulence (see Figure), including capsule biosynthesis genes, the meningococcal disease association island and the new signal near the fba and fHbp genes.
  • A replication GWAS for IMD in 1295 meningococcal genomes belonging to strain ST41/44 downloaded from pubMLST.org validated the novel signal of association near fba and fHbp.
  • SHAPE reactivity analyses revealed that IMD-associated variation in the regulatory region of fHbp disrupted the ability of the cell machinery to commence gene expression.
  • Flow cytometry assays of newly constructed genetically engineered strains, in different temperatures and in the presence and absence of human serum, attributed changes in gene expression to a non-synonymous candidate mutation in the fHbp gene.

In this study, our GWAS relied exclusively on publicly available genome sequences and metadata, highlighting the untapped potential of large-scale open source databases like pubMLST.org, and the value of big data for improving our understanding of disease.

Tuesday, 13 April 2021

New positions: Data Scientist in Public Health Epidemiology and Postdoc in Statistical Methods

I am looking to fill two positions at the Big Data Institute, Nuffield Department of Population Health, University of Oxford: a Data Scientist in Public Health Epidemiology and a Postdoctoral Researcher in Statistical Methods.

The Big Data Institute (BDI) is an interdisciplinary research centre that develops, evaluates and deploys efficient methods for acquiring and analysing biomedical data at scale and for exploiting the opportunities arising from such studies. The Nuffield Department of Population Health (NDPH), a key partner in the BDI, contains world-renowned population health research groups and is an excellent environment for multi-disciplinary teaching and research.  

The role of the Data Scientist in Public Health Epidemiology is to help pilot a project developing systems for continuous record linkage between a large Public Health England (PHE) data source and other population health records, with the aim of facilitating research into infectious diseases.

The post holder will manage and develop record linkage algorithms comparing records with relational databases containing health records via appropriate anonymization protocols, and manage and develop systems for identifying incoming records of interest, for near-real time updating of SQL databases, and for issuing email and SMS alerts in response to these events. The responsibilities will also include contributing to large-scale statistical studies using public health records to investigate disease epidemiology, and analysing and interpreting results, reviewing and refining working hypotheses, writing reports and presenting findings to colleagues.

To be considered, applicants will hold a degree in Computer Science, Data Science, Statistics, or another relevant subject with a strong quantitative component, or have equivalent experience. They will also need an understanding of relational database construction and SQL queries, experience coding in at least one common programming language (e.g. C#, Java, Python) and good interpersonal skills with the ability to work closely with others as part of a team, while taking personal responsibility for assigned tasks.

The role of the Postdoctoral Researcher in Statistical Methods is to develop statistical methods based on the harmonic mean p-value (HMP) approach. The HMP bridges classical and Bayesian approaches to model-averaged hypothesis testing, with applications to very large-scale data analysis problems in biomedical science.
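For readers unfamiliar with the HMP, the statistic itself is simple to compute: it is the weighted harmonic mean of a set of p-values. The following is a minimal Python sketch of that calculation only. The function name and the default of equal weights are assumptions made for illustration, and the sketch returns the raw statistic; interpreting it as a formal test requires the calibration described in the HMP literature, which is not shown here.

```python
from typing import Optional, Sequence


def harmonic_mean_p(
    p_values: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Return the weighted harmonic mean of a set of p-values.

    Illustrative helper (hypothetical name). It computes only the raw
    statistic: sum(w_i) / sum(w_i / p_i). It does not apply any
    calibration or multiple-testing adjustment.
    """
    if len(p_values) == 0:
        raise ValueError("at least one p-value is required")
    if weights is None:
        # Assumption for this sketch: equal weights when none are given.
        weights = [1.0 / len(p_values)] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in the interval (0, 1]")
    if any(w < 0.0 for w in weights) or sum(weights) <= 0.0:
        raise ValueError("weights must be non-negative with a positive sum")

    total_weight = sum(weights)
    return total_weight / sum(w / p for w, p in zip(weights, p_values))


if __name__ == "__main__":
    # Made-up p-values, purely to show the call.
    print(harmonic_mean_p([0.01, 0.2, 0.5, 0.9]))
```

Because each term is weighted by 1/p, the result is dominated by the smallest p-values, so a single strong signal among many weak ones is not averaged away.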

The post holder will join a team with expertise in statistical inference, population genetics, genomics, evolution, epidemiology and infectious disease. The responsibilities will include developing statistical methods based on the HMP, undertaking research under the direction of the principal investigator, helping with supervision within the project as required, driving forward manuscripts for publication in collaboration with group members and disseminating results through other means such as academic conferences.

To be considered, applicants will hold, or be close to completion of, a PhD/DPhil involving statistical methods development and have a track record of publication-quality work in statistical theory or methods development. The ability to work independently in pursuing the goals of an agreed research plan, excellent interpersonal skills and the ability to work closely with others as a team are also essential.

The closing date for both positions is noon on the 5th May 2021. Only applications received through the online system will be considered:

Presentation: Genome-wide association studies of COVID-19

An updated version of this talk given at the Nuffield Department for Population Health's annual symposium 2021: