
SeroBA


Last updated 5 months ago

About

These results should not be used for clinical purposes or to inform vaccine programmes. Because the result is inferred from the DNA sequence rather than from a Quellung reaction (the gold standard for serotyping), it may in some cases not match the phenotypic result. However, the SeroBA-based methodology used by Pathogenwatch has been shown to have a sensitivity of 0.98 and a specificity of 1 (Epping et al 2018).

SeroBA predicts a phenotype directly from short-read data. Pathogenwatch instead starts from assemblies, from which reads are simulated as input for SeroBA. Because of this small difference in methodology, 0.14% (28/20049) of results differ between those obtained directly from reads and those obtained from assemblies. These mismatches are reported below, and any result that may be subject to these differences is flagged with a 'Guidance' link.
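To make the idea of simulating reads from an assembly concrete, here is a minimal sketch in Python. It is not the simulator Pathogenwatch uses, which this page does not name. The function names, the error-free tiling strategy, and the `read_len`, `insert_size`, and `step` parameters are all assumptions chosen for illustration.

```python
from typing import Iterator, Tuple

# Translation table for complementing DNA bases (illustrative helper).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_paired_reads(
    contig: str,
    read_len: int = 100,
    insert_size: int = 350,
    step: int = 10,
) -> Iterator[Tuple[str, str]]:
    """Yield error-free (forward, reverse) read pairs tiled along one contig.

    Hypothetical scheme: a window of `insert_size` bases slides along the
    contig in increments of `step`; the forward read is the start of the
    window and the reverse read is the reverse complement of its end.
    Contigs shorter than `insert_size` yield no pairs.
    """
    if read_len > insert_size:
        raise ValueError("read_len must not exceed insert_size")
    for start in range(0, len(contig) - insert_size + 1, step):
        fragment = contig[start:start + insert_size]
        forward = fragment[:read_len]
        reverse = reverse_complement(fragment[-read_len:])
        yield forward, reverse


# Toy usage on a made-up 40 bp contig (not real cps sequence).
contig = "ACGTTGCAAGGCTTAACCGGTTACGATCGATCGGATCCAT"
pairs = list(simulate_paired_reads(contig, read_len=10, insert_size=30, step=5))
```

Reads produced this way carry no sequencing errors and have uniform coverage, so they differ from real short-read data. That difference is one plausible reason a small number of assembly-based calls could diverge from read-based calls.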

| Pathogenwatch Serotype | SeroBA Serotype | No. Mismatches (%) [a] | BLAST cps loci Nucleotide Similarity (%) | BLAST cps loci Nucleotide Coverage (%) | Distinguishing Genetic Features [b] |
| --- | --- | --- | --- | --- | --- |
| untypable [c] | 19A | 9 (0.6) | - | - | - |
| 32F | 32A | 3 (100) [d] | 99 | 99 | 5 bp gap at the intergenic region between wcrN and the HG272/3 pseudogene |
| 32F | untypable | 2 (NA) | - | - | - |
| 33A/33F | 33F | 2 (1.1) | 99.9 | 92.0 | Frameshift mutation insT 433 in 33F wcjE gene |
| possible 6A | 6A | 2 (0.2) | - | - | - |
| 11E | 11A | 1 (0.2) | - [e] | - | Disruption in wcjE |
| 19F | untypable | 1 (NA) | - | - | - |
| 32A | 32F | 1 (14) | 99 | 99 | 5 bp gap at the intergenic region between wcrN and the HG272/3 pseudogene |
| 32A | untypable | 1 (NA) | - | - | - |
| 35A | 35C | 1 (6.7) | 98.9 | 90 | Frameshift mutation insA 248 in wcrK (encodes for a GT), consistent with differences in 35A wcrK |
| 6A | possible 6E | 1 (NA) | - | - | - |
| possible 6C | 6B | 1 (0.09) | 99 | 92 | wciNα in 6B / wciNβ in 6C |
| possible 6D | 6C | 1 (0.25) | 98.6 | 84 | A > G 583 in wciP |
| possible 6E | 6B | 1 (0.09) | - | - | - |
| untypable | 23F | 1 (0.08) | - | - | - |
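As a cross-check, the per-row mismatch counts in the table above can be summed and compared with the overall figure of 0.14% (28/20049) quoted earlier. The short Python sketch below does only that arithmetic. The variable names are illustrative, and the counts are copied from the "No. Mismatches" column.

```python
# (Pathogenwatch serotype, SeroBA serotype, number of mismatches),
# copied row by row from the table on this page.
mismatch_rows = [
    ("untypable", "19A", 9),
    ("32F", "32A", 3),
    ("32F", "untypable", 2),
    ("33A/33F", "33F", 2),
    ("possible 6A", "6A", 2),
    ("11E", "11A", 1),
    ("19F", "untypable", 1),
    ("32A", "32F", 1),
    ("32A", "untypable", 1),
    ("35A", "35C", 1),
    ("6A", "possible 6E", 1),
    ("possible 6C", "6B", 1),
    ("possible 6D", "6C", 1),
    ("possible 6E", "6B", 1),
    ("untypable", "23F", 1),
]

# Total number of compared results, as stated in the text.
total_compared = 20049

total_mismatches = sum(count for _, _, count in mismatch_rows)
rate_percent = 100 * total_mismatches / total_compared

print(total_mismatches, f"{rate_percent:.2f}%")  # prints "28 0.14%"
```

The per-row percentages in the table use a different denominator (see footnote [a]), so they cannot be recomputed from the counts shown here alone.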

[a] Percentage is calculated as the number of isolates that mismatched between Pathogenwatch and SeroBA, divided by the total number of isolates typed by SeroBA as the serotype shown in the same row.

[b] Information extracted from Kapatai et al 2016.

[c] See https://github.com/sanger-pathogens/seroba#troubleshooting. The samples tested here were QC-passed, so the untypable results are likely due to low coverage of the cps region.

[d] Only serological analyses can reliably differentiate serotypes 32A and 32F. In silico serotyping within serogroup 32 is subject to improvement because of the small number of isolates available for analysis (Kapatai et al 2016).

[e] Complete sequences of the cps loci for 6E and 11E are not available for comparison.

NA = not available

How to cite

Epping L, van Tonder AJ, Gladstone RA, et al. SeroBA: rapid high-throughput serotyping of Streptococcus pneumoniae from whole genome sequence data [published correction appears in Microb Genom. 2018 Aug;4(8). doi: 10.1099/mgen.0.000204]. Microb Genom. 2018;4(7):e000186. doi:10.1099/mgen.0.000186