
Practicals for 2022 Genome Bioinformatics module.

Introduction

Cheap sequencing has created the opportunity to perform molecular-genetic analyses on just about anything. Conceptually, doing this is similar to working with traditional genetic model organisms, but there is one large difference. For traditional genetic model organisms, large teams and communities of expert assemblers, predictors, and curators have put years of effort into the prerequisites for most genomic analyses, including a reference genome and a set of gene predictions. In contrast, those of us working on “emerging” model organisms often have limited or no pre-existing resources and are part of much smaller teams. Emerging model organisms include most crops, animal and plant pest species, many pathogens, and major models for ecology & evolution.

At the end of this module, you should be able to:

  1. inspect and clean short (Illumina) reads,
  2. perform genome assembly,
  3. assess the quality of the genome assembly using simple statistics,
  4. predict protein-coding genes,
  5. assess quality of gene predictions,
  6. assess quality of the entire process using a biologically meaningful measure.
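To give a flavour of what "simple statistics" in objective 3 can look like, here is a minimal sketch of N50, a widely used assembly summary: the contig length at which contigs of that length or longer contain at least half of the assembled bases. This is an illustration under assumptions, not the method used in the practicals. The function name `n50` and the contig lengths are hypothetical and invented for this example.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    # Walk through contigs from longest to shortest, accumulating bases.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that takes us to half of the assembly gives N50.
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths (in bp), invented for illustration only.
example_lengths = [100, 200, 300, 400, 500]

print("number of contigs:", len(example_lengths))
print("total length (bp):", sum(example_lengths))
print("longest contig (bp):", max(example_lengths))
print("N50 (bp):", n50(example_lengths))
```

Statistics like these describe contiguity only. They say nothing about whether the assembly is correct, which is why the later objectives add biologically meaningful quality measures.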

NOTE: The exemplar datasets are simplified to run on laptops and to fit into the short format of this course. For real projects, much more sophisticated approaches are needed!


1. Prerequisites

Prerequisites for the practicals are:

3. Computers

To perform the practicals, you will connect remotely to Amazon Web Services (AWS). You will use an SSH client to connect to a remote shell, where you will run the first three practicals. Some results will be available on a personal web page created for the course. The same web page will allow you to perform the fourth and fifth practicals.

4. Authors/Credits

The initial version of this practical was put together by:

  * Yannick Wurm @yannick__
  * Oksana Riba-Grognuz (contributions to the 2012 edition of this course)

It was heavily revised and improved thanks to efforts and new content by

5. Things used in other versions of this course: