pyseer is a Python reimplementation of seer, which was written in C++.
pyseer uses linear models with fixed or mixed effects to estimate the
effect of genetic variation in a bacterial population on a phenotype of
interest, while accounting for potentially very strong confounding population
structure. This allows genome-wide association studies (GWAS) to be performed in
clonal organisms such as bacteria and viruses.
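To make the idea of a fixed effects model concrete, the following is a minimal sketch and not pyseer's own code. It fits an ordinary least squares regression of a phenotype on one variant's presence/absence, optionally with covariates that stand in for population structure. The function name `variant_effect`, the toy data, and the use of a single lineage indicator as the covariate are all assumptions made for illustration.

```python
import numpy as np

def variant_effect(phenotype, variant, covariates=None):
    """Return the OLS coefficient of `variant` (hypothetical helper).

    The design matrix holds an intercept, the variant column, and any
    covariate columns used to adjust for population structure.
    """
    y = np.asarray(phenotype, dtype=float)
    columns = [np.ones_like(y), np.asarray(variant, dtype=float)]
    if covariates is not None:
        C = np.asarray(covariates, dtype=float)
        if C.ndim == 1:
            C = C[:, None]
        # Add each covariate as its own column of the design matrix.
        columns.extend(C.T)
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]  # index 0 is the intercept, index 1 the variant

# Toy data (assumed): the phenotype depends only on lineage, but the
# variant is more common in lineage 1, so it looks associated.
lineage = np.array([0, 0, 0, 0, 1, 1, 1, 1])
variant = np.array([0, 0, 0, 1, 1, 1, 1, 0])
phenotype = 2.0 * lineage

naive = variant_effect(phenotype, variant)
adjusted = variant_effect(phenotype, variant, covariates=lineage)
```

In this toy case the unadjusted estimate is driven entirely by lineage, and it vanishes once the lineage covariate is included. That is the confounding that population structure correction is meant to remove.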
The original version of seer used sequence elements (k-mers) to represent
variation across the pan-genome.
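As a rough illustration of how k-mers can represent variation, the sketch below records which samples contain each k-mer, giving a presence/absence pattern per sequence element. It is a simplification under stated assumptions: a single fixed k, no reverse-complement handling, and a hypothetical helper name `kmer_presence`. It is not the k-mer counting method used by seer or pyseer.

```python
def kmer_presence(sequences, k):
    """Map each k-mer to the set of sample names whose sequence contains it.

    `sequences` is a dict of sample name -> DNA string (assumed input shape).
    """
    presence = {}
    for name, seq in sequences.items():
        # Slide a window of width k along the sequence.
        for i in range(len(seq) - k + 1):
            presence.setdefault(seq[i:i + k], set()).add(name)
    return presence

# Toy input (assumed): two samples differing at the final base.
samples = {"s1": "ACGT", "s2": "ACGA"}
patterns = kmer_presence(samples, k=3)
```

K-mers shared by every sample carry no signal for association, while those present in only some samples form the patterns that can be tested against the phenotype.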
pyseer also allows variants stored in VCF
files (e.g. SNPs and INDELs mapped against a reference genome) or Rtab files
(e.g. from roary or
piggy) to be used. There is also a greater range of association models
available, and tools to help with processing the output.
Testing shows that results (p-values) should be the same as the original
seer, with a runtime that is roughly twice as long as the optimised C++ code.
If you find pyseer useful, please cite:
Lees, John A., Galardini, M., et al. pyseer: a comprehensive tool for microbial pangenome-wide association studies. Bioinformatics 34:4310–4312 (2018). doi:10.1093/bioinformatics/bty539.
If you use unitigs (through unitig-counter) please cite:
Jaillard M., Lima L. et al. A fast and agnostic method for bacterial genome-wide association studies: Bridging the gap between k-mers and genetic events. PLOS Genetics. 14, e1007758 (2018). doi:10.1371/journal.pgen.1007758.