Deconvoluting the diversity of within-host pathogen strains in a multi-locus sequence typing framework

Title: Deconvoluting the diversity of within-host pathogen strains in a multi-locus sequence typing framework
Publication Type: Journal Article
Year of Publication: 2019
Authors: Gan GL, Willie E, Chauve C, Chindelevitch L
Journal: BMC Bioinformatics
Volume: 20
Start Page: 637
Issue: Suppl 20
Date Published: 12/2019
Keywords: Bacterial diversity, Borrelia burgdorferi, Integer Linear Programming, Multi-Locus Sequence Typing
Abstract

We introduce a framework for understanding the within-host diversity of a pathogen using multi-locus sequence typing (MLST) from whole-genome sequencing (WGS) data. Our approach consists of two stages. First, we process each sample individually by assigning it, for each locus in the MLST scheme, a set of alleles and a proportion for each allele. Next, we associate with each sample a set of strain types using the alleles and the allele proportions obtained in the first step. We achieve this by using the smallest possible number of previously unobserved strains across all samples, choosing unobserved strains that are as close to the observed ones as possible, while respecting the allele proportions as closely as possible. We solve both problems using mixed integer linear programming (MILP). Our method performs accurately on simulated data and generates results on a real data set of Borrelia burgdorferi genomes suggesting a high level of diversity for this pathogen.
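
To make the second stage more concrete, the following is a toy sketch for a single sample. It is not the authors' MILP formulation: it swaps in a brute-force search over a coarse proportion grid, and it replaces the distance-to-known-strains term with a flat penalty per unobserved strain. The names `deconvolve`, `compositions`, `novel_penalty`, `step`, and `max_strains` are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def compositions(total, parts):
    """Yield all tuples of `parts` positive integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def deconvolve(allele_props, known, novel_penalty=0.1, step=0.1, max_strains=2):
    """Brute-force toy version of strain assignment for one sample.

    allele_props: one dict per locus, mapping allele -> observed proportion.
    known: set of previously observed strain types (tuples of alleles).
    Returns (cost, strains, proportions) for the cheapest assignment found.
    """
    # Candidate strains: one observed allele per locus, in every combination.
    candidates = list(product(*[sorted(d) for d in allele_props]))
    n_steps = round(1 / step)
    best = None
    for k in range(1, max_strains + 1):
        for strains in combinations(candidates, k):
            for weights in compositions(n_steps, k):
                props = [w / n_steps for w in weights]
                # Mismatch between observed and implied allele proportions.
                err = 0.0
                for locus, observed in enumerate(allele_props):
                    for allele, p in observed.items():
                        implied = sum(
                            q for s, q in zip(strains, props) if s[locus] == allele
                        )
                        err += abs(implied - p)
                # Flat penalty for each strain not seen before (assumption).
                n_novel = sum(s not in known for s in strains)
                cost = err + novel_penalty * n_novel
                if best is None or cost < best[0] - 1e-12:
                    best = (cost, strains, props)
    return best


# Two loci, each with a 70/30 allele mix; both pure strains are already known.
sample = [{"a1": 0.7, "a2": 0.3}, {"b1": 0.7, "b2": 0.3}]
known_strains = {("a1", "b1"), ("a2", "b2")}
cost, strains, props = deconvolve(sample, known_strains)
```

In this toy input, the two known strains mixed at 0.7 and 0.3 reproduce the allele proportions exactly, so no unobserved strain is needed. An exhaustive search like this only scales to a handful of loci and strains, which is one reason an MILP solver is the more practical choice for real data.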

URL: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3204-8