Pannotator: prokaryotic genome annotation at scale

Pannotator is a scalable and robust pangenome-based prokaryotic genome annotation tool, designed to efficiently process thousands of genomes. It is built upon Bakta to reliably annotate protein-coding and ncRNA genes, while leveraging the scalability and reproducibility of Nextflow.

Description

Pannotator orchestrates Bakta annotation steps in a modular Nextflow pipeline. Through Bakta, it annotates tRNA, tmRNA, rRNA, and ncRNA genes, ncRNA cis-regulatory regions, CRISPRs, CDSs, pseudogenes, oriC/oriV/oriT, and assembly gaps.
To minimise redundant computation, Pannotator clusters CDS features across genomes and annotates only representative sequences from each cluster, propagating annotations back to cluster members.
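
The propagation idea can be illustrated with a small sketch. It is not Pannotator's implementation: the file names and the `propagate_annotations` helper are hypothetical, and the only assumption about the clustering output is the two-column TSV layout (representative, member) that MMseqs2 writes for cluster tables.

```shell
# Illustrative only: copy each representative's annotation to every member of its cluster.
# clusters.tsv:    representative<TAB>member      (two-column cluster table)
# annotations.tsv: representative<TAB>annotation  (hypothetical file, for this sketch only)
propagate_annotations() {
    local clusters="$1" annotations="$2"
    awk -F'\t' -v OFS='\t' '
        NR == FNR { ann[$1] = $2; next }   # first file: representative -> annotation
        { print $2, (($1 in ann) ? ann[$1] : "unannotated") }   # second file: member inherits
    ' "$annotations" "$clusters"
}

# Tiny demo with made-up identifiers and products.
workdir="$(mktemp -d)"
printf 'repA\trepA\nrepA\tgenome2_0001\nrepB\trepB\n' > "$workdir/clusters.tsv"
printf 'repA\tDNA gyrase subunit A\nrepB\thypothetical protein\n' > "$workdir/annotations.tsv"
propagate_annotations "$workdir/clusters.tsv" "$workdir/annotations.tsv"
```

Only two sequences are annotated in the demo, yet all three cluster members receive a product, which is where the saving comes from when many genomes share near-identical proteins.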

Installation

Prerequisites:

Nextflow >= 21.04.0
Conda or Docker/Singularity
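
A quick way to check the Nextflow prerequisite is sketched below. The `version_ge` helper is hypothetical, relies on `sort -V` being available, and the parsing assumes that `nextflow -version` prints a line containing "version X.Y.Z".

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="21.04.0"
if command -v nextflow >/dev/null 2>&1; then
    # Extract the first "version X.Y.Z" token from the banner (assumed layout).
    installed="$(nextflow -version 2>/dev/null | sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p' | head -n 1)"
    if version_ge "$installed" "$required"; then
        echo "Nextflow $installed satisfies >= $required"
    else
        echo "Nextflow $installed is older than $required" >&2
    fi
else
    echo "nextflow not found on PATH" >&2
fi
```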

To use Pannotator, clone the repository and add the path to the executable to your PATH variable:

git clone --recurse-submodules https://github.com/sysbio-vo/pannotator.git
cd pannotator
echo "export PATH=\"\$PATH:$(pwd)/nxf_bin\"" >> ~/.bashrc
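
Running the echo line more than once appends a duplicate entry to ~/.bashrc each time. If that matters, a guard along the following lines may help. The `add_to_path_once` helper is hypothetical, and the demo writes to a temporary file rather than the real ~/.bashrc.

```shell
# add_to_path_once DIR RCFILE: append an export line to RCFILE unless the same line is already present.
add_to_path_once() {
    local dir="$1" rcfile="$2"
    local line="export PATH=\"\$PATH:$dir\""
    # -x matches whole lines, -F treats the line as a fixed string rather than a pattern.
    grep -qxF "$line" "$rcfile" 2>/dev/null || echo "$line" >> "$rcfile"
}

# Demo against a scratch file; pass ~/.bashrc as the second argument for real use.
rcfile="$(mktemp)"
add_to_path_once "$(pwd)/nxf_bin" "$rcfile"
add_to_path_once "$(pwd)/nxf_bin" "$rcfile"   # second call changes nothing
cat "$rcfile"
```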

Examples

To annotate a folder of genomes using an existing Bakta database:

pannotator --indir /path/to/folder/with/genomes --outdir /path/to/output/folder --bakta_db /path/to/bakta_db

To keep intermediate files, such as the MMseqs2 clustering results and the proteome FASTA file containing all unique sequences, use the --save_intermediate flag:

pannotator --indir /path/to/folder/with/isolates/ --outdir /path/to/output/folder --bakta_db /path/to/bakta_db --save_intermediate

If you don't have a Bakta database, the most recent version is downloaded automatically during the first run. The download can take a while: the light database (v6.0) is ~1.3 GB and the full database is ~33.1 GB. The light database is downloaded by default; use --bakta_db_type to choose the database type on the command line:

pannotator --indir /path/to/folder/with/isolates/ --outdir /path/to/output/folder --bakta_db /path/to/save/bakta/db/ --bakta_db_type [light|full]
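
The [light|full] notation means that exactly one of the two values is passed. A wrapper script could validate the value before starting a long run, as in this sketch. The `check_db_type` helper is hypothetical, and the command is printed rather than executed.

```shell
# check_db_type VALUE: succeeds only for the two database types named above.
check_db_type() {
    case "$1" in
        light|full) return 0 ;;
        *) echo "unsupported --bakta_db_type: $1 (expected light or full)" >&2
           return 1 ;;
    esac
}

db_type="full"
if check_db_type "$db_type"; then
    # Dry run: remove the leading echo to launch the pipeline.
    echo pannotator --indir /path/to/folder/with/isolates/ \
        --outdir /path/to/output/folder \
        --bakta_db /path/to/save/bakta/db/ \
        --bakta_db_type "$db_type"
fi
```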

The following generic execution profiles, adapted from the base config by PaM, are available:

standard (default)
docker
singularity
conda
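
A profile is selected with Nextflow's -profile option. The sketch below picks one of the profiles listed above according to which tool is found on PATH. The `pick_profile` helper is hypothetical, and finding a tool on PATH does not guarantee that it is usable (for example, the Docker daemon may not be running).

```shell
# pick_profile: print the first listed profile whose tool is on PATH, else "standard".
pick_profile() {
    local tool
    for tool in docker singularity conda; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "standard"
}

profile="$(pick_profile)"
# Dry run: remove the leading echo to launch the pipeline.
echo pannotator --indir /path/to/folder/with/genomes \
    --outdir /path/to/output/folder \
    --bakta_db /path/to/bakta_db \
    -profile "$profile"
```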

Examples for the Wellcome Sanger Institute's Farm users

Instead of cloning the repository, you can use the dedicated pannotator module on Farm, which is kept up to date with the upstream codebase. To start using it, load the environment first:

module load PaM/environment
module load pannotator

After that, you can use the tool by calling pannotator. We recommend using the sanger_lsf profile when running the pipeline on Farm. For instance, to annotate a folder of genomes with Pannotator, run:

pannotator --indir /path/to/folder/with/genomes --outdir /path/to/output/folder --bakta_db /path/to/bakta/db -profile sanger_lsf
