next.pathogen.watch docs

Reference Assignment


Last updated 4 months ago

About

Each genome is linked to its nearest reference genome by comparing the substitutions in its core profile with those in each of the reference core profiles. The reference assignment is then used to identify potentially unreliable loci in the query genome, according to the variation filter method described in the Core Filter section.

For some species (e.g. Salmonella Typhi), genomes with the same reference assignment will be clustered to provide a more fine-grained view, which is useful for large collections in the Collection View.

Method

Creating The Reference Variance Profile

  1. The core profile is generated for each reference genome.

  2. All substitutions, excluding those with non-ATCG characters, are extracted and aggregated into a single list of variant locations per gene family.
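The two steps above can be sketched in Python as follows. This is only an illustration, not the Pathogenwatch implementation: the function name `build_variance_profile` and the nested-dictionary layout of a core profile (reference name, then gene family, then position and base) are assumptions made for the example.

```python
from collections import defaultdict

VALID_BASES = set("ACGT")


def build_variance_profile(reference_profiles):
    """Aggregate variant locations per gene family across all references.

    reference_profiles is a hypothetical structure:
        {reference_name: {family_id: {position: substituted_base}}}
    """
    sites = defaultdict(set)
    for families in reference_profiles.values():
        for family_id, substitutions in families.items():
            for position, base in substitutions.items():
                # Skip substitutions that are not A, C, G or T (e.g. N).
                if base.upper() in VALID_BASES:
                    sites[family_id].add(position)
    # One sorted list of variant locations per gene family.
    return {family_id: sorted(positions) for family_id, positions in sites.items()}
```

Positions are collected in a set so that a site that varies in several references is listed only once per gene family.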

Querying the Variance Profile

  1. Each genome is compared against each reference at all the sites in the species profile, excluding sites outside the boundaries of any fragment matches.

  2. The total number of sites in common is divided by the total number of compared sites to generate a similarity score.

  3. The query genome is then assigned to the subgroup identified by the name of the most similar reference. If two references have the same score, the tie is broken by alphabetical order.
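The query steps above can be sketched as below. This is a minimal illustration under stated assumptions, not the actual Pathogenwatch code: the names `similarity` and `assign_reference` are hypothetical, each genome is assumed to be a dictionary mapping a `(family_id, position)` site to its base, a site missing from the query dictionary is treated as lying outside any fragment match, and "sites in common" is interpreted as sites where the query and reference bases are identical.

```python
def similarity(query_bases, reference_bases, profile_sites):
    """Fraction of compared profile sites where query and reference agree."""
    # Only compare sites the query actually covers (inside a fragment match).
    compared = [site for site in profile_sites if site in query_bases]
    if not compared:
        return 0.0
    shared = sum(
        1 for site in compared if query_bases[site] == reference_bases.get(site)
    )
    return shared / len(compared)


def assign_reference(query_bases, references, profile_sites):
    """Return (reference_name, score) for the most similar reference.

    Ties on score are broken by choosing the alphabetically first name.
    """
    scores = {
        name: similarity(query_bases, bases, profile_sites)
        for name, bases in references.items()
    }
    best = min(scores, key=lambda name: (-scores[name], name))
    return best, scores[best]
```

Sorting on the pair `(-score, name)` keeps the tie-break deterministic regardless of the order in which references are supplied.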
