Evolution ERp18
John O'Bryen (talk | contribs)
[[Image:Sequence1-cml.jpg|thumb|'''Fig. 1''' Evolutionary analysis of different Thioredoxin proteins showing different subfamilies]]
[[Image:Tree-CGAConly-cml.jpg|center|thumb|600px|'''Fig. 3''' [1-4]]]
[[image:Humanproteins-cml.jpg|center|thumb|600px|'''Fig. 5 Evolutionary relationships of 8 taxa''' [1-4].]]
Latest revision as of 23:26, 15 June 2009
This page discusses the evolution of the target protein, Endoplasmic reticulum thioredoxin superfamily member.
Methods
To collect sequences apparently related to the target protein (ERp18), a PSI-BLAST search was conducted. PSI-BLAST is advantageous because it is iterative: selected, relevant results from each search are used to inform the next. This was particularly useful here because ERp18 belongs to a superfamily of proteins, so many of its homologues have high identity scores and low E-values but are in fact different proteins.
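The iterative idea can be illustrated with a toy profile search. This is a minimal sketch and not PSI-BLAST itself: it assumes ungapped, equal-length sequences, a uniform background, a pseudocount of 1 and an arbitrary score threshold, and the sequences and function names (`build_profile`, `score`, `iterate`) are made up for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(aligned, pseudocount=1.0):
    """Per-column log-odds scores against a uniform background (assumption)."""
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(len(aligned[0])):
        counts = Counter(seq[col] for seq in aligned)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile

def score(profile, candidate):
    """Sum the column scores of the residues in the candidate."""
    return sum(column[aa] for column, aa in zip(profile, candidate))

def iterate(seed, database, threshold, rounds=3):
    """Rebuild the profile from the accepted hits and rescan the database."""
    accepted = [seed]
    for _ in range(rounds):
        profile = build_profile(accepted)
        hits = [s for s in database if score(profile, s) >= threshold]
        new_accepted = [seed] + [h for h in hits if h != seed]
        if set(new_accepted) == set(accepted):
            break  # converged: no change in the accepted set
        accepted = new_accepted
    return accepted

# Hypothetical six-residue fragments, not real ERp18 data.
database = ["WCGACK", "WCGACR", "FCHWCR", "PLMNQT"]
print(iterate("WCGACK", database, threshold=2.0, rounds=1))
print(iterate("WCGACK", database, threshold=2.0, rounds=3))
```

With these toy values the more divergent fragment is only accepted once the profile has been rebuilt from the first round of hits, which is the behaviour that makes an iterative search sensitive to distant relatives.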
The collected sequences were analysed using the MEGA4 suite. This program package provides:
- multiple sequence alignment,
- phylogenetic tree generation,
- bootstrapping, and
- viewing and editing of phylogenetic trees,
all of which were important to this assignment.
The Dayhoff matrix was used to generate all phylogenetic trees in this analysis. All bootstrap tests used 1000 replicates.
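The bootstrap step can be sketched as follows. This is a simplified illustration under stated assumptions, not the MEGA4 procedure: alignment columns are resampled with replacement, but instead of rebuilding a full tree each time the sketch only records which pair of sequences is closest by simple p-distance (not the Dayhoff model). The sequences and names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations

def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def resample_columns(alignment, rng):
    """One bootstrap replicate: draw alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

def closest_pair(alignment):
    """Pair of sequence names with the smallest p-distance."""
    return min(
        combinations(sorted(alignment), 2),
        key=lambda pair: p_distance(alignment[pair[0]], alignment[pair[1]]),
    )

def bootstrap_support(alignment, replicates=1000, seed=0):
    """Fraction of replicates in which each pair is the closest pair."""
    rng = random.Random(seed)
    tally = Counter(
        closest_pair(resample_columns(alignment, rng)) for _ in range(replicates)
    )
    return {pair: count / replicates for pair, count in tally.items()}

# Hypothetical ten-residue alignment, not real data.
toy = {
    "human": "MKWCGACKAL",
    "salmon": "MKWCGACRAL",
    "bacterium": "MRFCHWCRSV",
}
print(bootstrap_support(toy))
```

A grouping that is recovered in nearly all replicates is well supported by the alignment; one that appears in only a minority of replicates depends on a few columns and should be read with caution.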
Results and Discussion
Protein Families
The defining motif of a protein in the thioredoxin superfamily is the CXXC motif, in which the two cysteines are the catalytic residues. It has been suggested in the literature that the 'wild card' amino acids between the cysteines dictate, in part, the specific molecular function (see Function ERp18).
This suggestion is supported, in part, by examining the evolution of proteins in different organisms that have high identity to the target protein. The alignment shows that the proteins vary in the critical CXXC residues; however, they are likely to be related.
Figure 1 shows the CLUSTAL-W multiple sequence alignment for a selection of proteins annotated as 'Thioredoxin' or 'Thioredoxin Like' proteins. Examination of the key functional residues shows that there are at least three classes in the family, containing the CGAC, CHWC and CHHS motifs. CHHS clearly does not meet the requirement for a thioredoxin protein, as it does not contain the CXXC motif, yet it carries the stated annotation. The title 'Thioredoxin-like' may refer to the overall similarity of the sequences rather than to the specific domain.
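The CXXC test applied to the three classes can be written as a short pattern match. This is a minimal sketch: the function names and the example fragment are hypothetical, and only the three motif strings come from the alignment discussed above.

```python
import re

# C, any two residues, C. The lookahead lets overlapping matches be reported.
CXXC_SCAN = re.compile(r"(?=(C[A-Z]{2}C))")
CXXC_EXACT = re.compile(r"C[A-Z]{2}C")

def find_cxxc(sequence):
    """Return (start_index, motif) for every CXXC window, overlaps included."""
    return [(m.start(), m.group(1)) for m in CXXC_SCAN.finditer(sequence)]

def classify(active_site):
    """Label a four-residue active-site string as CXXC or not."""
    return "CXXC" if CXXC_EXACT.fullmatch(active_site) else "non-CXXC"

for motif in ("CGAC", "CHWC", "CHHS"):
    print(motif, classify(motif))

# Hypothetical fragment used only to show the scanner.
print(find_cxxc("MKWCGACKAL"))
```

The check makes the annotation problem explicit: CGAC and CHWC satisfy the pattern, while CHHS fails it because the second cysteine is replaced by serine.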
Another observation from Figure 1 is the conservation of the CXXC motif between higher organisms (Homo sapiens, Salmo salar, etc.) and bacteria and archaea. Bacteria and archaea contain the motif 'CHWC' in place of the 'CGAC' seen in higher organisms. This suggests that the ERp18 protein evolved from a bacterial or archaeal ancestor. It is unlikely that lateral gene transfer has occurred; however, it cannot be ruled out on this evidence.
Figure 2 shows in detail the relationship between the three apparent classes of proteins. In both groups of 'higher' organisms a similar trend can be seen: the proteins of more complex organisms lie further from the common ancestor.
Phylogenetic Trees
Although there appear to be many thioredoxin and thioredoxin-like proteins, very few have been located in the endoplasmic reticulum. Based on the sequences available, this specific protein appears to be unique to multicellular eukaryotes.
From the sequence alignment, it is seen that the 'CGAC' motif is conserved, as is the approximate length of the protein sequences. The two following figures show how the ER thioredoxin proteins are related to each other. These trees are 'Neighbour-Joining' trees; radial, unrooted trees are not appropriate in this circumstance as they add no new information to the analysis.
ERp18 seems to be well conserved within this group of organisms and shows little evolutionary distance between species.
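For readers unfamiliar with the Neighbour-Joining method used for these trees, the sketch below shows its core loop. It is an illustrative implementation, not the MEGA4 code: the five-taxon distance matrix is an arbitrary example rather than ERp18 data, and the internal node labels (`node0`, `node1`, ...) are invented here.

```python
def neighbour_joining(names, dist):
    """Return (child, parent, branch_length) edges of an unrooted NJ tree.

    dist[a][b] must hold the distance for every ordered pair of distinct names.
    """
    d = {a: dict(dist[a]) for a in names}
    nodes = list(names)
    edges = []
    counter = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all the others.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Choose the pair that minimises the Q criterion.
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        new = f"node{counter}"
        counter += 1
        # Branch lengths from the joined pair to the new internal node.
        length_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        length_b = d[a][b] - length_a
        edges += [(a, new, length_a), (b, new, length_b)]
        # Distances from the new node to every remaining node.
        d[new] = {}
        for c in nodes:
            if c not in (a, b):
                d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [new]
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges

# Arbitrary illustrative distances between five taxa.
labels = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
dist = {x: {y: matrix[i][j] for j, y in enumerate(labels) if i != j}
        for i, x in enumerate(labels)}
for child, parent, length in neighbour_joining(labels, dist):
    print(child, parent, length)
```

Short branches between two taxa in the resulting edge list correspond to the "little evolutionary distance" reading made from the figures; the method itself produces an unrooted tree, so any root shown in a drawing is a display choice.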
Apparent Human Homologues
When conducting BLAST searches for the target protein, two other proteins appear as 'related': the 'Breast Cancer Membrane Protein' and the 'Anterior Gradient 2 Homologue'. Comparing these two proteins against other proteins that also have high identity to the target shows that they are related to the target.
There is quite a large distance between ERp18 and both the 'Breast Cancer Membrane Protein' and the 'Anterior Gradient 2 Homologue'. However, the tree indicates that all three share a common ancestor, and that ERp18 and the 'Anterior Gradient 2 Homologue' have a common ancestor in the 'Breast Cancer Membrane Protein'. It is important to note that neither the 'Breast Cancer Membrane Protein' nor the 'Anterior Gradient 2 Homologue' contains the CXXC motif.
Possible Convergent Evolution
If this analysis is correct and ERp18 has evolved from the Breast Cancer Membrane Protein, then the CXXC motif in ERp18 may be an example of convergent evolution. Further analysis is needed to determine whether all thioredoxin proteins evolved from a single common ancestor or whether the CXXC motif has evolved convergently in different proteins. Superficially, CXXC is a simple motif and, statistically, it may occur with relatively high frequency. Two questions need to be asked: 'Is CXXC alone enough to confer function on a protein?' and 'If not, what other protein features are required?'
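The statistical point can be made concrete with a rough estimate. This is a sketch under strong assumptions: residues are treated as independent, and the sequence length (150) and cysteine frequency (0.02) are placeholder values chosen for illustration, not measurements from this analysis.

```python
def expected_cxxc_count(length, cys_freq):
    """Expected number of C-x-x-C windows in a random sequence.

    Assumes independent residues with cysteine frequency cys_freq. A sequence
    of the given length has (length - 3) four-residue windows, and each window
    is CXXC with probability cys_freq squared.
    """
    windows = max(length - 3, 0)
    return windows * cys_freq ** 2

# Placeholder values, for illustration only.
per_protein = expected_cxxc_count(length=150, cys_freq=0.02)
print(per_protein)           # expected chance windows in one sequence
print(per_protein * 10_000)  # expected chance windows across 10,000 sequences
```

Under these assumptions a single protein rarely carries a chance CXXC window, but a large database would contain hundreds of them, which is why the motif on its own is weak evidence of shared ancestry or shared function.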
Conclusion
From the analysis and discussion above, it can be suggested that the ERp18 protein is unique to multicellular organisms and, within that group, is quite highly conserved. It may also be concluded that ERp18 is part of a unique family of proteins which belongs to the larger thioredoxin 'superfamily' and contains the specific active-site motif CGAC. Further analysis is needed to determine whether proteins containing the CXXC active-site motif have a common ancestor, or whether the motif has evolved convergently in several classes of protein.
References
- Saitou N & Nei M (1987) "The neighbor-joining method: A new method for reconstructing phylogenetic trees." Molecular Biology and Evolution 4:406-425.
- Schwarz R & Dayhoff M (1979) "Matrices for detecting distant relationships." In Dayhoff M, editor, Atlas of protein sequences, pages 353-358. National Biomedical Research Foundation.
- Tamura K, Dudley J, Nei M & Kumar S (2007) "MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0." Molecular Biology and Evolution 24:1596-1599.
- Felsenstein J (1985) "Confidence limits on phylogenies: An approach using the bootstrap." Evolution 39:783-791.
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W & Lipman DJ (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Research 25:3389-3402.
- Schäffer AA, Aravind L, Madden TL, Shavirin S, Spouge JL, Wolf YI, Koonin EV & Altschul SF (2001) "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements." Nucleic Acids Research 29:2994-3005.