Arylformamidase Sequence & Homology

Latest revision as of 02:06, 10 June 2008

Our query sequence, "Arylformamidase", is a putative thioesterase isolated from a Silicibacter sp. These organisms are best known for their ability to degrade sulfur compounds in the marine environment. The query sequence (Target 13, pdb:2pbl) is 262 residues in length.

Method

Using the bacterial query sequence 2PBL, an NCBI BLASTP search was performed against a non-redundant database. The top-scoring matches, down to an E-value of 3e-054 (35 sequences in total), were selected. Homologous eukaryotic sequences were found using NCBI HomoloGene. These were appended to the list and a multiple sequence alignment was performed using CLUSTAL X.
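The E-value cutoff used to select hits can be illustrated with a minimal Python sketch. The hit names and E-values below are invented for illustration and are not the results of the search described here; the function name select_hits is likewise hypothetical.

```python
# Hypothetical BLASTP hits as (subject id, E-value text) pairs.
# These rows are invented for illustration; they are not the study's results.
EXAMPLE_HITS = [
    ("hit_A", "1e-150"),
    ("hit_B", "3e-054"),
    ("hit_C", "2e-030"),
    ("hit_D", "0.001"),
]


def select_hits(hits, max_evalue=3e-54):
    """Keep hits whose E-value is at or below the cutoff, best hit first."""
    parsed = [(name, float(evalue)) for name, evalue in hits]
    kept = [hit for hit in parsed if hit[1] <= max_evalue]
    return sorted(kept, key=lambda hit: hit[1])


selected = select_hits(EXAMPLE_HITS)
print([name for name, _ in selected])
```

A lower E-value means a match is less likely to have arisen by chance, so a cutoff of 3e-054 keeps only very strong matches.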

The output of the multiple sequence alignment was bootstrapped 1000 times and a phylogenetic tree was created using the neighbour-joining algorithm. FigTree was used to create the visual representation of this tree (Figure 1).
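Bootstrapping an alignment means resampling its columns with replacement to build pseudo-replicate alignments, each of which is then used to build a tree. The following Python sketch shows that resampling step only. It is not the CLUSTAL X implementation, and the toy sequences and the function name bootstrap_alignment are assumptions made for illustration.

```python
import random

# Toy alignment (hypothetical sequences of equal length; '-' marks a gap).
ALIGNMENT = {
    "seq1": "MKV-LSG",
    "seq2": "MKVALSG",
    "seq3": "MRV-LTG",
}


def bootstrap_alignment(alignment, rng):
    """Build one replicate by sampling alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(ALIGNMENT, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

Each replicate has the same sequences and the same length as the original alignment, but some columns appear more than once and others not at all.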

The top-scoring sequences from a BLASTP search using the human homologue of 2PBL were appended to the top-scoring sequences from the BLASTP search on the bacterial query sequence. As above, a multiple sequence alignment was generated using CLUSTAL X, the data were bootstrapped 1000 times, and a phylogenetic tree was generated using the neighbour-joining algorithm (Figure 3).
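The core of neighbour-joining is the choice of which pair of taxa to join at each step. The Python sketch below shows only the first joining step on a small textbook-style distance matrix. The matrix, the taxon names and the function name nj_first_join are illustrative assumptions, not data or code from this analysis.

```python
def nj_first_join(names, dist):
    """Return the first pair joined by neighbour-joining and its branch lengths."""
    n = len(names)
    totals = [sum(row) for row in dist]  # summed distance from each taxon
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Q criterion: small pairwise distance, large distance to the rest.
            q = (n - 2) * dist[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    # Branch lengths from each joined taxon to the new internal node.
    len_i = dist[i][j] / 2 + (totals[i] - totals[j]) / (2 * (n - 2))
    len_j = dist[i][j] - len_i
    return (names[i], names[j]), (len_i, len_j)


# Illustrative symmetric distance matrix for five hypothetical taxa.
NAMES = ["a", "b", "c", "d", "e"]
DIST = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

pair, lengths = nj_first_join(NAMES, DIST)
print(pair, lengths)
```

A full implementation would replace the joined pair with the new node, recompute the distances and repeat until the tree is resolved.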

Results

Figure 1 shows that the query sequence "Arylformamidase" grouped with the bacterial sequences, shown in Blue. Bootstrap values are low for many of the nodes lower down on the phylogenetic tree, which may explain why certain closely related species were grouped into separate clades. However, despite the low bootstrap scores, the grouping does reliably separate prokaryotes from eukaryotes, and the eukaryotes themselves are clearly divided into yeasts and moulds (shown in Green), plants (Dark Green), invertebrates (Orange) and vertebrates (shown in Red).
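A bootstrap value for a node is the percentage of replicate trees in which the same grouping of taxa appears. The Python sketch below assumes each replicate tree has already been reduced to a set of taxon groupings. The taxon labels and the function name bootstrap_support are hypothetical, and a tool working on unrooted trees would compare bipartitions rather than simple groups.

```python
def bootstrap_support(clade, replicate_clades):
    """Percentage of replicate trees that contain the given grouping."""
    target = frozenset(clade)
    hits = sum(1 for clades in replicate_clades if target in clades)
    return 100.0 * hits / len(replicate_clades)


# Four hypothetical replicate trees, each summarised as a set of groupings.
REPLICATE_CLADES = [
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
    {frozenset({"A", "B"}), frozenset({"B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "D"})},
    {frozenset({"A", "C"}), frozenset({"B", "D"})},
]

print(bootstrap_support({"A", "B"}, REPLICATE_CLADES))
print(bootstrap_support({"C", "D"}, REPLICATE_CLADES))
```

A grouping recovered in only a small fraction of replicates, like the second one here, corresponds to the low-confidence nodes described above.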


Figure 1.

NewBOOT1000tree.png

Unrooted phylogenetic tree of the highest scoring results from a BLASTP search of bacterial sequences using a non-redundant database, together with homologous eukaryotic sequences sourced from NCBI HomoloGene. Branch lengths are related to phylogenetic distance and node numbers refer to bootstrap values. On this tree "Arylformamidase" refers to the Silicibacter species from which our sequence originated. The colour coding distinguishes prokaryotic organisms (shown in Blue) from eukaryotic yeasts and moulds (shown in Green), plants (Dark Green), invertebrates (Orange) and vertebrates (shown in Red).

To further elucidate the phylogeny of 2pbl, its human homologue, Arylformamidase, was queried in a BLAST search. The top scoring matches of bacterial homologues, present in Figure 1, were combined with the top scoring matches of eukaryotic homologues (Figure 2). The human homologue, Arylformamidase, has 26.28% sequence similarity to 2pbl. Despite this low score, multiple sequence alignment revealed that key regions were highly conserved between bacterial and eukaryotic homologues.
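A pairwise percentage such as the one quoted above is computed from two aligned sequences. The Python sketch below calculates percent identity (exact matches over aligned, ungapped columns), which is a simpler measure than similarity and is used here only for illustration. The toy sequences and the function name percent_identity are assumptions, and the sketch does not reproduce the 26.28% figure.

```python
def percent_identity(seq_a, seq_b):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Two hypothetical aligned fragments ('-' marks a gap).
value = percent_identity("MKV-LSGA", "MRVALTGA")
print(round(value, 2))
```

Different tools use different denominators (alignment length, shorter sequence length, or ungapped columns as here), so reported percentages are only comparable when the same convention is used.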

Figure 2.

Multiple Sequence alignment:

Alignment1.jpg

Sections of alignment showing the conserved residues across bacterial and eukaryotic species.


Figure 3 is largely consistent with traditional taxonomic groupings of organisms. Specifically, it reveals greater statistical confidence in the separation of prokaryotes (Blue and Green) and eukaryotes (invertebrates are shown in Orange; vertebrates are in Red).


Figure 3.

BacterANDhomoTREE.jpg

Unrooted phylogenetic tree of highest scoring results from a BLASTP search of bacterial sequences and highest scoring results of a BLASTP search on a homologous human sequence. Branch lengths are related to phylogenetic distance and node numbers refer to Bootstrap values. On this tree "Arylformamidase" refers to the Silicibacter species from which our sequence originated. The colour coding distinguishes prokaryotes (Blue and Green) and eukaryotes (invertebrates are shown in Orange; vertebrates are in Red).

In general, members of the same genus have been grouped together on these phylogenetic trees, with some notable exceptions. For instance, Silicibacter, the genus from which we derived our protein, occurs on disparate branches of the tree.

Discussion

The multiple sequence alignment revealed several regions conserved across all species, indicating a high level of conservation from Bacteria through Eukaryota. Most significantly, the catalytic triad of Ser137, Glu215 and His242, together with many associated residues that occur in the same structural area of the protein, is conserved across all of the prokaryotic and eukaryotic species examined, including vertebrates, invertebrates, yeasts, moulds and single-celled eukaryotes. This may indicate conservation of a functional group of residues within the protein. The catalytic triad is thought to be involved in thioesterase/carboxylesterase activity, though the function of the protein may vary between species.
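Checking whether particular residues are conserved amounts to mapping a residue number in the query onto its alignment column and inspecting that column in every sequence. The Python sketch below does this on a toy alignment. The sequences, residue numbers and function names are hypothetical; the real triad positions (137, 215, 242) would require the full alignment.

```python
def column_for_residue(aligned_ref, residue_number):
    """Map a 1-based residue number in the ungapped reference to its column."""
    count = 0
    for col, char in enumerate(aligned_ref):
        if char != "-":
            count += 1
            if count == residue_number:
                return col
    raise ValueError("residue number is beyond the end of the reference")


def is_conserved(alignment, ref_name, residue_number, allowed):
    """True if every sequence has an allowed residue at the reference position."""
    col = column_for_residue(alignment[ref_name], residue_number)
    return all(seq[col] in allowed for seq in alignment.values())


# Toy alignment; 'query' plays the role of the reference sequence.
ALIGNMENT = {
    "query": "GS-AEH",
    "seq2": "GSKADH",
    "seq3": "ASKAEH",
}

# Toy triad positions in 'query' and the residues accepted at each one.
TRIAD = {"Ser": (2, "S"), "Glu/Asp": (4, "ED"), "His": (5, "H")}

results = {
    label: is_conserved(ALIGNMENT, "query", number, allowed)
    for label, (number, allowed) in TRIAD.items()
}
print(results)
```

Allowing either Glu or Asp at the acidic position reflects the common substitution between these two residues in catalytic triads.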

Given that the phylogeny of our protein is largely consistent with traditional taxonomic groupings of organisms, and that we can find no evidence of horizontal gene transfer, the delineations between prokaryotic and eukaryotic species allow us to infer that the dominant mode of inheritance is clonal, from bacteria to Plantae and Animalia.

References

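NCBI: BLAST (http://www.ncbi.nlm.nih.gov/blast/Blast.cgi)

NCBI HomoloGene: Arylformamidase (http://www.ncbi.nlm.nih.gov/sites/entrez?itool=protein_brief&DbFrom=protein&Cmd=Link&LinkName=protein_homologene&IdsFromResult=58330909)

CLUSTAL Homepage (http://www.clustal.org/)

FigTree (http://tree.bio.ed.ac.uk/software/figtree/)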

MicrobeWiki: Silicibacter pomeroyi

