Evolution ERp18


This page discusses the evolution of the target protein, Endoplasmic reticulum thioredoxin superfamily member.


Methods

To generate a collection of sequences apparently related to the target protein (ERp18), a PSI-BLAST search was conducted. PSI-BLAST is advantageous because it is iterative: selected, relevant results from previous searches are used to inform the next search. This was particularly useful here because ERp18 is part of a protein superfamily and so has many homologues that have high identity scores and low E-values but are actually different proteins.


The collected sequences were analysed using the MEGA4 suite. This software package allows:

  • multiple sequence alignment,
  • phylogenetic tree generation,
  • bootstrapping, and
  • viewing and editing of phylogenetic trees,

all of which were important to this assignment.


The Dayhoff Matrix was used to generate phylogenetic trees throughout this analysis. All bootstrap exercises used 1000 replicates.
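The bootstrap step mentioned above can be illustrated with a minimal sketch. It only shows the column-resampling idea behind the bootstrap test [4]; it is not MEGA4's implementation, and the function names and the toy alignment are hypothetical, introduced purely for illustration. In a full analysis a tree would be built from each replicate and the clustering of taxa counted across the 1000 trees.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by sampling alignment columns with replacement."""
    sequences = list(alignment.values())
    n_cols = len(sequences[0])
    if any(len(seq) != n_cols for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    # The same sampled column indices are applied to every sequence,
    # so each resampled column remains a genuine alignment column.
    sampled = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in sampled) for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Return a list of n_replicates resampled alignments."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Hypothetical toy alignment (not real ERp18 data).
toy = {"seqA": "MKCGACLL", "seqB": "MKCHWCLL", "seqC": "MRCGACIL"}
replicates = bootstrap_replicates(toy, n_replicates=1000)
print(len(replicates))
```

Each replicate has the same number of columns as the original alignment, which is why bootstrap support values are comparable across replicates.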

Results

Protein Families

The defining motif of a protein in the thioredoxin superfamily is the CXXC motif, in which the two cysteines are the catalytic residues. It has been suggested in the literature that the 'wild card' residues between the cysteines, in part, dictate the specific molecular function (see Function ERp18).

This suggestion is supported, in part, by examining the evolution of proteins in different organisms that have high identity to the target protein. The alignment shows that these proteins vary in the critical CXXC residues; however, they are likely to be related.


Fig. 1 Evolutionary analysis of different Thioredoxin proteins showing different subfamilies

Figure 1 shows the CLUSTAL-W multiple sequence alignment for a selection of proteins annotated as 'Thioredoxin' or 'Thioredoxin Like' proteins. Examination of the key functional residues shows that there are at least three classes in the family, containing the CGAC, CHWC and CHHS motifs. CHHS clearly does not conform to the requirement for a thioredoxin protein, as it does not match the CXXC motif, yet it has the stated annotation. The title 'Thioredoxin-like' may refer to the overall similarity of the sequences rather than to the specific domain.
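The motif check described above can be sketched as a short regular-expression search. This is only an illustration: the function names are hypothetical, and the three fragments are invented to carry the CGAC, CHWC and CHHS motifs; they are not the sequences from Figure 1.

```python
import re

# C, any two residues, then C. The lookahead lets overlapping
# occurrences (e.g. in "CAACAAC") all be reported.
CXXC_PATTERN = re.compile(r"(?=(C[A-Z]{2}C))")


def find_cxxc_motifs(sequence):
    """Return (0-based position, motif) for every C-X-X-C occurrence."""
    return [(m.start(), m.group(1)) for m in CXXC_PATTERN.finditer(sequence.upper())]


def classify_active_site(sequence):
    """Return the first CXXC motif found, or None if the sequence has none."""
    motifs = find_cxxc_motifs(sequence)
    return motifs[0][1] if motifs else None


# Hypothetical fragments, NOT real database sequences.
fragments = {
    "cgac_class": "FWCGACKA",
    "chwc_class": "FWCHWCKA",
    "chhs_class": "FWCHHSKA",
}
classes = {name: classify_active_site(seq) for name, seq in fragments.items()}
print(classes)
# {'cgac_class': 'CGAC', 'chwc_class': 'CHWC', 'chhs_class': None}
```

The CHHS fragment returns None because its fourth position is a serine rather than a cysteine, which mirrors the observation that such proteins do not satisfy the CXXC requirement despite their annotation.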

Another observation from Figure 1 is the conservation of the CXXC motif between higher organisms (Homo sapiens, Salmo salar, etc.) and bacteria and archaea. Bacteria and archaea contain the motif 'CHWC' in place of the 'CGAC' seen in higher organisms. This suggests that the ERp18 protein has evolved from bacteria or archaea. It is unlikely that lateral gene transfer has occurred; however, it cannot be ruled out based on this evidence.

Figure 2 shows in detail the relationship between the three apparent classes of proteins. In both groups of 'higher' organisms a similar trend can be seen: the proteins of more complex organisms are further from the common ancestor.

Fig. 2 Unrooted phylogenetic tree showing homologies of different proteins labelled as 'Thioredoxins'


Phylogenetic Trees

Although there appear to be many thioredoxin and thioredoxin-like proteins, very few have been located in the endoplasmic reticulum. Based on the sequences available, this specific protein appears to be unique to multicellular eukaryotes.

From the sequence alignment, it can be seen that the 'CGAC' motif is conserved, as is the approximate length of the protein sequences. The following two figures show how the ER thioredoxin proteins are related to each other. These are 'Neighbour-Joining' trees; in this circumstance radial, unrooted trees are not appropriate as they add no new information to the analysis.

Fig. 3
Fig. 4

ERp18 seems to be well conserved within this group of organisms and shows little evolutionary distance between species.

Apparent Human Homologues

When conducting BLAST searches for the target protein, two other proteins appear as 'related'. These are the 'Breast Cancer Membrane Protein' and the 'Anterior Gradient 2 Homologue'. Comparing these two proteins against other proteins which also have high identity to the target shows that both are related to the target.

Fig. 5 Evolutionary relationships of 8 taxa. The evolutionary history was inferred using the Neighbor-Joining method [1]. The optimal tree with the sum of branch length = 7.25526990 is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) are shown next to the branches [4]. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Dayhoff matrix based method [2] and are in the units of the number of amino acid substitutions per site. All positions containing gaps and missing data were eliminated from the dataset (Complete deletion option). There were a total of 62 positions in the final dataset. Phylogenetic analyses were conducted in MEGA4 [3].
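The 'Complete deletion' option named in the caption above can be sketched as follows. This is a minimal illustration of the idea (drop every alignment column in which any sequence has a gap or missing-data symbol), not MEGA4's own code. The function name, the choice of '-' and '?' as the gap and missing-data symbols, and the toy alignment are all assumptions made for the example.

```python
def complete_deletion(alignment, gap_chars="-?"):
    """Keep only alignment columns where no sequence has a gap or missing symbol."""
    sequences = list(alignment.values())
    n_cols = len(sequences[0])
    if any(len(seq) != n_cols for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    # A column survives only if every sequence has a real residue there.
    keep = [i for i in range(n_cols)
            if all(seq[i] not in gap_chars for seq in sequences)]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Hypothetical toy alignment (not the 8-taxon dataset used for Fig. 5).
toy_alignment = {
    "seqA": "MK-CGACL",
    "seqB": "MKACHWCL",
    "seqC": "M?ACGAC-",
}
cleaned = complete_deletion(toy_alignment)
print(cleaned)
# {'seqA': 'MCGAC', 'seqB': 'MCHWC', 'seqC': 'MCGAC'}
```

Because a single gap removes the whole column for every taxon, adding divergent sequences can shrink the usable dataset considerably, which is consistent with only 62 positions remaining in the final dataset.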

There is quite a lot of evolutionary distance between ERp18 and both the 'Breast Cancer Membrane Protein' and the 'Anterior Gradient 2 Homologue'. However, it is clear that they have a common ancestor and that ERp18 and the 'Anterior Gradient 2 Homologue' have a common ancestor in the 'Breast Cancer Membrane Protein'. It is important to note that the 'Breast Cancer Membrane Protein' and the 'Anterior Gradient 2 Homologue' do not contain the CXXC motif.

Discussion

From the analysis and discussion above, it can be suggested that the ERp18 protein is unique to multicellular organisms and is quite highly conserved within that group. It may also be concluded that ERp18 is part of a distinct family of proteins which belongs to the larger thioredoxin 'superfamily' and whose members contain the specific CGAC motif.

References

  1. Saitou N & Nei M (1987) The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4:406-425.
  2. Schwarz R & Dayhoff M (1979) Matrices for detecting distant relationships. In Dayhoff M, editor, Atlas of protein sequences, pages 353-358. National Biomedical Research Foundation.
  3. Tamura K, Dudley J, Nei M & Kumar S (2007) MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Molecular Biology and Evolution 24:1596-1599.
  4. Felsenstein J (1985) Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791.