Ants, Bees, Genomes & Evolution @ Queen Mary University London
GeneValidator is a tool to identify problematic gene predictions based on comparisons between gene predictions and similar sequences in public databases (e.g., SwissProt). Funded by NESCent Google Summer of Code 2013 and BBSRC TRDF.
GeneValidator works locally from the command line or from a web browser (suitable for <100 queries). Either amino acid sequences or nucleotide sequences (e.g., putative CDS) can be used as input. Output formats include an HTML report and plain (parseable) text. Some examples are shown below.
Description | Input | Output* |
---|---|---|
Nucleotide sequences (picked from European Nucleotide Archive) | FASTA | HTML, JSON, CSV, Summary CSV |
Amino acid sequences (picked from ongoing genome projects) | FASTA | HTML, JSON, CSV, Summary CSV |
* Generated using GeneValidator 2.1.5, using SwissProt database downloaded on 17th August, 2018 as reference.
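Because the input is a FASTA file and the browser-based route is meant for small batches, it can help to count the sequences in a file before choosing between the web and command-line routes. The following is a minimal sketch and not part of GeneValidator: the function names, the example sequences, and the `max_queries` threshold are all assumptions made for illustration, and the threshold should be set to whatever limit applies to the server you use.

```python
from typing import Iterable


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines (those starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))


def fits_web_limit(lines: Iterable[str], max_queries: int) -> bool:
    """Return True if the number of records is strictly below max_queries.

    max_queries is supplied by the caller (an assumption for this sketch);
    it is not read from GeneValidator itself.
    """
    return count_fasta_records(lines) < max_queries


# Hypothetical two-record amino acid FASTA, used only to exercise the helpers.
example = """>seq1 hypothetical example
MKTAYIAKQR
>seq2 hypothetical example
MSDNELLQ
""".splitlines()

print(count_fasta_records(example))
print(fits_web_limit(example, max_queries=10))
```

For a file on disk, the same helper accepts an open file handle, for example `count_fasta_records(open("queries.fa"))`, where `queries.fa` is a placeholder file name.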
To install GeneValidator on a Unix-based system (e.g. Linux or Mac OS), please run the following in the terminal:
sh -c "$(curl -fsSL https://install-genevalidator.wurmlab.com)"
Please see this page for more information on installation and usage.
Alternatively, we host a web server appropriate for < 10 query sequences at a time:
Despite recent improvements in genome sequencing and gene prediction technologies, many gene predictions remain problematic. GeneValidator can help assess the quality of large sets of gene predictions as well as of individual sequences. Here we focus on the latter.
What can you conclude about your query gene predictions? For the example sequences given above, the following can help you understand GeneValidator's output: