A new year of the class "Sequence a genome" with our master's students has started! In September, the students cultivated two strains likely to be Pseudomonas protegens, extracted their DNA and prepared it for PacBio sequencing.
All of the students produced DNA of sufficient quality and quantity.
After pooling and ethanol precipitation, the samples yielded two batches of DNA suitable for sequencing. Fragment size is about 19 kb, which is good for PacBio sequencing, and purity was assessed by absorbance ratios: 260/280 = 1.9 and 260/230 = 2.2.
Our first bioinformatics session simply consisted of checking the assembly with basic visualization and quality control. We only work with one strain, because the sequences show that the two are in fact identical.
We started with a rapid crowd-sourcing of what the students expect of a bacterial genome:
- a single chromosome
- few introns
- circular chromosome
- small genome
- an origin and a terminus of replication
- presence of operons
- presence of plasmids
- species-specific GC content.
A pretty good list. As it turns out, our PacBio reads assembled into one contig of 6.778 Mb. Thus we have an average-sized genome, no plasmids, and an apparently perfect assembly.
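For readers who want to repeat this first sanity check on their own assembly, here is a minimal sketch in plain Python that counts contigs, total length and GC content from a FASTA file. It is not the tool we used in class; the function names `read_fasta` and `assembly_summary` and the toy sequence are made up for illustration.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def assembly_summary(handle):
    """Return contig count, total length (bp) and GC fraction."""
    contigs = list(read_fasta(handle))
    total = sum(len(seq) for _, seq in contigs)
    gc = sum(seq.count("G") + seq.count("C") for _, seq in contigs)
    return {
        "n_contigs": len(contigs),
        "total_bp": total,
        "gc_fraction": gc / total if total else 0.0,
    }


# Toy input standing in for a real assembly file (hypothetical data).
toy = StringIO(">contig1 toy example\nATGCGC\nGGCCAT\n")
summary = assembly_summary(toy)
print(summary)
```

On a real assembly you would pass `open("assembly.fasta")` (a placeholder file name) instead of the toy `StringIO`; a single contig of a few Mb is what the list above would lead you to hope for.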
Using Artemis, the students were able to explore the genome a little and check its GC skew: all is well.
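To make the GC skew check a bit more concrete, here is a small sketch of the usual calculation, (G - C) / (G + C) in windows along the chromosome. This is a plain Python illustration, not what Artemis does internally; the function names, window sizes and toy sequence are assumptions chosen for the example. In many bacterial genomes the cumulative skew typically reaches its minimum near the origin of replication and its maximum near the terminus, which is why a clean two-phase pattern is reassuring for a single-contig assembly.

```python
def gc_skew_windows(seq, window=10000, step=10000):
    """Return a list of (window_start, skew), skew = (G - C) / (G + C)."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows without any G or C get a neutral skew of 0.
        skew = (g - c) / (g + c) if (g + c) else 0.0
        skews.append((start, skew))
    return skews


def cumulative_skew_extremes(skews):
    """Return the window starts where cumulative skew is lowest and highest."""
    running = 0.0
    cumulative = []
    for start, skew in skews:
        running += skew
        cumulative.append((running, start))
    return min(cumulative)[1], max(cumulative)[1]


# Toy sequence (hypothetical): a C-rich half followed by a G-rich half.
toy = "C" * 40 + "G" * 40
skews = gc_skew_windows(toy, window=10, step=10)
lowest, highest = cumulative_skew_extremes(skews)
print(lowest, highest)
```

Because the chromosome is circular, the coordinates depend on where the assembler happened to cut the contig, so the positions are only meaningful relative to each other.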
The students have launched a mapping of the shorter PacBio reads onto the genome to check the distribution of reads, but results from the cluster will take a few more days… So that’s it for now.