Finishing PacBio assembly, and a presentation by our local PacBio representative

In this session of the course “Sequence a genome”, the first since we sent DNA to the sequencing facility, we were visited by Gerrit Kuhn, who supports PacBio in our area. He gave a nice presentation of how PacBio works, the workflow, and the specific requirements of sample preparation and how they influence data quality. Sleek animations are to be expected from a corporate presentation, but kudos to Gerrit for being open, giving insight into the advantages and specificities of the PacBio workflow, and answering all questions straightforwardly. (Updated paragraph.)

Gerrit Kuhn of PacBio Switzerland talking to our students.

Gerrit Kuhn of PacBio Switzerland talking some more to our students.

After our first pass of PacBio assembly, we have 9 contigs, the longest at 5’815’706 bp. We asked the students to use Mauve to look at our assembly and compare it to a version of the Pseudomonas veronii genome in NCBI, which has 63 contigs.
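For readers who want to check numbers like these on their own assembly, here is a minimal sketch of counting contigs and finding the longest one in a FASTA file. It is not the tool we used; the function name `contig_lengths` and the toy sequences are made up for illustration, and a real assembly would be read from a file rather than from a string.

```python
from io import StringIO


def contig_lengths(fasta_handle):
    """Return {contig_name: length_in_bp} for a FASTA file-like object."""
    lengths = {}
    name = None
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig name.
            parts = line[1:].split()
            name = parts[0] if parts else f"contig_{len(lengths) + 1}"
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


# Toy input (hypothetical sequences, only the naming style follows our contigs).
toy_fasta = StringIO(">unitig-3\nACGTACGTAC\nGTAC\n>unitig-5\nACGT\n")
lengths = contig_lengths(toy_fasta)
longest = max(lengths, key=lengths.get)
print(len(lengths), "contigs; longest is", longest, "at", lengths[longest], "bp")
```

With a real file, `open("assembly.fasta")` (a hypothetical file name) would replace the `StringIO` object.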

Screenshot of Mauve comparison of our contigs (top) with the NCBI genome (bottom). Red lines separate contigs, colored blocks are recognized as similar between the two genomes.

We have two very large contigs, unitig-3 and unitig-5, with strong similarity to the NCBI genome, and some other contigs with almost none. Notice the spaghetti of relations between contig blocks, due to the fragmented assembly of the NCBI genome.

We also compared our contigs with each other. This let us eliminate 3 contigs which are small and entirely redundant with larger contigs in our assembly. We also found that the two largest contigs overlap enough to be joined, which gives us a main chromosome of 6.8 Mb. 🙂 We are not able to circularize it, though.

Two other groups of contigs can be joined into additional molecules, plasmids or secondary chromosomes, to be determined at annotation (in a few weeks). One group of 3 contigs forms an apparent mega-plasmid, which we can circularize thanks to similarity at the ends of the contigs; it also has high similarity to NCBI contigs. The other potential plasmid, formed of two contigs, cannot be circularized, has no similarity to the NCBI sequence, and ends with potential transposons (sequences found elsewhere in the genome).
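To make the idea of joining and circularizing contigs concrete, here is a toy sketch of the underlying logic: two contigs can be joined when the end of one matches the start of the other, and a molecule can be circularized when its own end repeats its start. This is a simplification under strong assumptions, not what Mauve does: it looks only for exact matches on one strand, whereas real contigs need alignment that tolerates errors and considers reverse complements. All function names and sequences here are invented for illustration.

```python
def end_overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that exactly equals a prefix of `right`."""
    min_len = max(min_len, 1)
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def join_contigs(left, right, min_len=3):
    """Merge two contigs on their end overlap, or return None if there is none."""
    k = end_overlap(left, right, min_len)
    return left + right[k:] if k else None


def circular_overlap(seq, min_len=3):
    """Length of the longest end-to-start self-overlap, capped at half the sequence."""
    min_len = max(min_len, 1)
    for k in range(len(seq) // 2, min_len - 1, -1):
        if seq[-k:] == seq[:k]:
            return k
    return 0


def circularize(seq, min_len=3):
    """Trim the repeated end of a circular molecule, or return None if ends do not match."""
    k = circular_overlap(seq, min_len)
    return seq[:-k] if k else None


# Toy sequences (hypothetical): the two contigs share "GGGTTT" at their junction.
merged = join_contigs("AAACCCGGGTTT", "GGGTTTACGACG")
print(merged)

# The plasmid-like sequence starts and ends with "ACGTT", so it can be closed.
print(circularize("ACGTTGCAGGCTAACGTT"))

# No matching ends: cannot be circularized with this criterion.
print(circularize("AAAACCCCGGGG"))
```

The third call returns `None`, which is the toy equivalent of our main chromosome and second plasmid: the ends simply do not match, so the molecule stays linear in the assembly.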

Thus overall, the PacBio experiment seems to have worked: the DNA extracted by our students and sequenced in our facility has produced a usable assembly. Over the next few weeks we will annotate this chromosome and these two plasmids.
