Last session of annotation: using literature, KEGG, Blast and UniProt (mostly)

Following last week’s summary of some student annotation topics, here are some more. I didn’t have time to interview all the students, so some topics were not yet covered, and some only very rapidly, but it gives an idea of what we’re up to.

Motility: swarming, swimming, twitching: The students first searched for the flagellin gene, and found that it was well annotated automatically. They then annotated neighboring genes on the chromosome, and checked the literature and KEGG for flagellar proteins. They found a cluster of genes automatically named with the prefix fli*, near the flagellin gene hag. Through KEGG, they also identified flg genes, which form a different cluster in the genome. Both clusters have nearby chemotaxis genes. An intriguing observation, not yet resolved at the time of writing, is that the motA and motB genes both appear to be duplicated in another region of the genome, but the annotation is unclear; could there be two flagella in this species? Such a configuration seems to exist in other species, notably for swarming within biofilms. The students also found pilus / twitching genes, as well as two genes involved in swarming (starting from annotations in the Pseudomonas database).
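To illustrate how genes such as the fli and flg sets can be recognised as separate clusters, here is a minimal Python sketch that groups genes by their distance along the chromosome. It is not the students' actual procedure: the coordinates, the `max_gap` threshold and the `_copy2` names are invented for the example.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # position on the chromosome, in bp
    end: int


def group_into_clusters(genes, max_gap=5000):
    """Group genes separated by at most max_gap bp into the same cluster."""
    clusters = []
    cluster_end = None  # right-most coordinate of the current cluster
    for gene in sorted(genes, key=lambda g: g.start):
        if clusters and gene.start - cluster_end <= max_gap:
            clusters[-1].append(gene)
            cluster_end = max(cluster_end, gene.end)
        else:
            clusters.append([gene])
            cluster_end = gene.end
    return clusters


# Hypothetical coordinates, for illustration only (not from our genome).
genes = [
    Gene("hag", 1000, 1900),
    Gene("fliD", 2000, 3400),
    Gene("fliS", 3450, 3850),
    Gene("flgB", 900000, 900400),
    Gene("flgC", 900450, 900900),
    Gene("motA_copy2", 2500000, 2500850),
    Gene("motB_copy2", 2500900, 2501900),
]

clusters = group_into_clusters(genes)
cluster_names = [[g.name for g in cluster] for cluster in clusters]
for names in cluster_names:
    print(names)
```

With these made-up positions the genes fall into three groups, which mirrors the situation described above: one fli cluster, one flg cluster, and a separate region with a second motA/motB pair.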

Organic acid metabolism: Starting with a literature review (Badri & Vivanco 2009), the students found organic acids which are secreted by the plant and can be taken up by Pseudomonas. They then looked in KEGG for pathways which can use these compounds, and found two main pathways in Pseudomonas: the TCA cycle, and glyoxylate and dicarboxylate metabolism (pyruvate metabolism also came up, but they gave it lower priority and didn’t have time to annotate it). After checking the annotation of these two pathways, they looked for uptake of organic acids, starting again from the literature plus KEGG. They found three uptake systems, and are annotating their regulation.

Metabolism pathways to the TCA, from KEGG


Quorum sensing/Gac system: Starting with a literature search on the GacA/GacS system, the students found an article (Cheung et al. 2013) reporting a transcriptome study of this system and its associated genes. They then selected the genes most upregulated in response to the knockout (KO) of GacS, and identified them in our genome: they are organised in clusters as expected, and this organisation is conserved with respect to the observations in Pseudomonas fluorescens. They then identified the two small RNAs which regulate these clusters.

From pseudomonas.com

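The step of picking the most upregulated genes from a transcriptome table can be sketched in a few lines of Python. This is only an illustration: the gene names, fold-change values and the `min_log2fc` cutoff are invented, and are not taken from the article cited above.

```python
def most_upregulated(log2_fold_changes, n=3, min_log2fc=1.0):
    """Return up to n gene names with the highest log2 fold change.

    Genes below min_log2fc are ignored, so weakly changed genes
    never make the list even if fewer than n genes pass the cutoff.
    """
    candidates = {
        gene: fc for gene, fc in log2_fold_changes.items() if fc >= min_log2fc
    }
    return sorted(candidates, key=candidates.get, reverse=True)[:n]


# Hypothetical log2 fold changes (mutant vs wild type), for illustration only.
log2_fold_changes = {
    "gene_A": 4.2,
    "gene_B": -1.3,
    "gene_C": 2.8,
    "gene_D": 0.4,
    "gene_E": 3.1,
}

top_genes = most_upregulated(log2_fold_changes, n=3)
print(top_genes)
```

The selected names would then be searched in the genome of interest, for example by BLAST, to see whether they sit next to each other.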

Secondary metabolite production: phloroglucinols (DAPG), phenazines, hydrogen cyanide: The students checked the annotation of genes involved in the production of these three classes of compounds. They notably found a candidate gene cluster for the biosynthesis of phloroglucinols, confirmed it by BLAST, and annotated its regulation.

Polysaccharide synthesis: From a literature search, the students found well characterized genes in other species. They checked these gene names in UniProtKB and KEGG, compared them to our genome by BLAST, and found interesting potential operons. Notably, they found potential small regulatory genes in front of one operon, which were not among the expected conserved genes. They also found an intriguing massive cluster of 20 genes, with two annotated functions: polysaccharide synthesis or lipopolysaccharide synthesis.
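When comparing known genes to a genome by BLAST, a common way to work with the results is the tabular output of BLAST+ (`-outfmt 6`), whose twelve default columns are listed in the sketch below. The Python code keeps the best-scoring hit per query under an E-value cutoff. The identifiers, the scores and the cutoff are made up for the example, and this is not necessarily how the students filtered their hits.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(tabular_text, max_evalue=1e-10):
    """Return {query: (subject, bitscore)} for the best hit of each query."""
    best = {}
    reader = csv.DictReader(
        io.StringIO(tabular_text), fieldnames=BLAST_COLUMNS, delimiter="\t"
    )
    for row in reader:
        evalue = float(row["evalue"])
        bitscore = float(row["bitscore"])
        if evalue > max_evalue:
            continue  # too weak to be considered a candidate homolog
        current = best.get(row["qseqid"])
        if current is None or bitscore > current[1]:
            best[row["qseqid"]] = (row["sseqid"], bitscore)
    return best


# Hypothetical hits, for illustration only.
rows = [
    ["query_1", "our_gene_0421", "78.5", "410", "88", "0",
     "1", "410", "5", "414", "1e-120", "650"],
    ["query_1", "our_gene_1733", "35.0", "200", "130", "4",
     "20", "219", "11", "210", "2e-12", "95"],
    ["query_2", "our_gene_0422", "30.1", "120", "84", "2",
     "1", "120", "3", "122", "0.5", "28"],
]
tabular_text = "\n".join("\t".join(row) for row in rows)

hits = best_hits(tabular_text)
print(hits)
```

Here `query_2` has no hit under the cutoff, so it is simply absent from the result and would need a closer manual look.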

Type VI secretion: The students started from KEGG, and found good BLAST hits in our genome. They found genes in common with the genomic island annotation group (see previous post), but the overlap is only partial, which is intriguing: the secretion system was expected to be either entirely inside the genomic island, or entirely outside it. They also found regulatory genes for the secretion system, which they are annotating.
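A partial overlap of this kind can be checked by comparing gene coordinates with the boundaries of the island. The Python sketch below assumes a single island given as a start and end position; all names and coordinates are hypothetical.

```python
from collections import Counter


def classify_against_island(gene_start, gene_end, island_start, island_end):
    """Say whether a gene lies inside, outside, or across an island boundary."""
    if gene_start >= island_start and gene_end <= island_end:
        return "inside"
    if gene_end < island_start or gene_start > island_end:
        return "outside"
    return "spanning boundary"


# Hypothetical island and gene coordinates, for illustration only.
island = (50000, 80000)
secretion_genes = {
    "t6ss_gene_1": (52000, 53500),
    "t6ss_gene_2": (78000, 79500),
    "t6ss_gene_3": (79800, 81200),
    "t6ss_gene_4": (82000, 83000),
}

positions = {
    name: classify_against_island(start, end, *island)
    for name, (start, end) in secretion_genes.items()
}
summary = Counter(positions.values())
print(positions)
print(summary)
```

A mixed summary like this one, with genes both inside and outside, is the pattern described above; it may point to imprecise island boundaries as much as to real biology.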

That’s it for now. Of course, it’s a work in progress, and an account based on 5-minute interviews with students who are still working on their annotation. It gives an idea of the biology that we can start to extract from a genome with a class of master’s students, in a few hours. Next week the students will present their results in formal presentations to the whole class.

