A comprehensive map of Mycobacterium tuberculosis complex regions of difference

D Bespiatykh, J Bespyatykh, I Mokrousov, E Shitikov - mSphere, 2021 - Am Soc Microbiol
Abstract
Mycobacterium tuberculosis complex (MTBC) species are classic examples of genetically monomorphic microorganisms due to their low genetic variability. Whole-genome sequencing (WGS) made it possible to describe both the main species within the complex and M. tuberculosis lineages and sublineages. This differentiation is based on single nucleotide polymorphisms (SNPs) and large sequence polymorphisms in the so-called regions of difference (RDs). Although a number of studies have been performed to elucidate RD localizations, their distribution among MTBC species, and their role in the bacterial life cycle, there are some inconsistencies and ambiguities in the localization of RDs in different members of the complex. To address this issue, we conducted a thorough search for all possible deletions in a WGS data collection comprising 721 samples representing the full MTBC diversity. The discovered deletions were compared with a list of all previously described RDs. As with the SNP-based analysis, we confirmed the specificities of 79 regions at the species, lineage, or sublineage level, 17 of which are described for the first time. We also present RDscan (https://github.com/dbespiatykh/RDscan), an open-source workflow that detects deletions from short-read sequencing data and correlates the results with the high-specificity RDs curated in this study. Testing the workflow on a collection of ∼7,000 samples showed high specificity of the detected RDs. This study provides novel details that can contribute to a better understanding of species differentiation within the MTBC and can help to determine how individual clusters evolve within various MTBC species.
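The step of correlating detected deletions with curated RDs can be pictured as an interval-overlap check. The sketch below is only an illustration of that idea and is not taken from RDscan: the `Interval` class, the `overlap_fraction` and `match_regions` helpers, the 0.9 coverage threshold, and all coordinates are hypothetical placeholders, not real RD positions or the workflow's actual criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A named genomic interval with 1-based, inclusive coordinates."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def overlap_fraction(deletion: Interval, region: Interval) -> float:
    """Return the fraction of `region` that is covered by `deletion`."""
    shared = min(deletion.end, region.end) - max(deletion.start, region.start) + 1
    return max(shared, 0) / region.length


def match_regions(deletions, regions, min_fraction=0.9):
    """List (deletion, region, fraction) for regions covered at least `min_fraction`.

    The threshold is an assumption made for this sketch.
    """
    hits = []
    for d in deletions:
        for r in regions:
            frac = overlap_fraction(d, r)
            if frac >= min_fraction:
                hits.append((d.name, r.name, round(frac, 3)))
    return hits


# Placeholder coordinates for illustration only (not real RD positions).
regions = [
    Interval("RD_example_A", 1000, 1999),
    Interval("RD_example_B", 5000, 5499),
]
deletions = [
    Interval("del_1", 990, 2010),   # spans all of RD_example_A
    Interval("del_2", 5200, 5300),  # covers only part of RD_example_B
]

hits = match_regions(deletions, regions)
print(hits)
```

Here only `del_1` is reported, because it covers the whole of the first placeholder region, whereas `del_2` covers about a fifth of the second. A real pipeline would also have to handle breakpoint uncertainty and repetitive sequence, which this sketch ignores.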
IMPORTANCE Reductive genome evolution is one of the most important and intriguing strategies by which living organisms adapt to their environment. The genus Mycobacterium offers several well-known examples of either naturally reduced (Mycobacterium leprae) or laboratory-reduced (Mycobacterium bovis BCG) genomes. The phylogeny of the Mycobacterium tuberculosis complex is unambiguously framed by large sequence polymorphisms that represent unidirectional, unique-event changes. In the present study, we curated all known regions of difference and analyzed both Mycobacterium tuberculosis and animal-adapted MTBC species. For 79 loci, we have shown a relationship with phylogenetic units; these loci can serve as markers for diagnostics or for studying biological effects. Moreover, intersections were found for some loci, which may indicate that these processes are nonrandom and that these regions are involved in the adaptation of bacteria to external conditions.
American Society for Microbiology