Discussion - 2qgnA
The crystal structure of tRNA isopentenyl transferase from Bacillus halodurans has been determined, which allowed us to analyse several aspects of the structure and to shed light on the structural composition and mechanism of action of this essential enzyme. Starting from the 322-amino-acid structure rendered in PyMOL (Figure 4), we examined its secondary structure, its surface properties, including positively and negatively charged regions (Figure 5.2), its domains, its ligand binding sites and surface clefts, and the conservation of residues across different species. LIGPLOT diagrams of the protein-ligand interactions of tRNA isopentenyl transferase were scrutinized to identify hydrophobic and hydrophilic interactions between the protein and the ligand. The residues surrounding the ligand that were found to be crucial are Ala(13), Val(14), Gly(15), Lys(16) and Thr(17).
According to the CATH domain database, 2qgn contains two main domains. The first domain comprises residues 2-200 and 283-314, while the second domain comprises residues 201-282. Domain 1 has a Rossmann A-B-A fold topology and is a representative of the homologous superfamily of P-loop containing nucleotide triphosphate hydrolases. Membership of a homologous superfamily indicates a demonstrable evolutionary relationship. Domain 2 has yet to be classified in CATH.
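The discontinuous boundaries of Domain 1 are easy to misread, so the short sketch below encodes the segments quoted above and looks up which domain a given residue falls in. The names DOMAINS and domain_of are illustrative only and are not part of CATH or any other tool; the residue ranges are the ones stated in this discussion.

```python
# Domain segments for 2qgn as quoted in the text (residue numbers, inclusive).
# Domain 1 is discontinuous: it is made of two separate segments.
DOMAINS = {
    "domain 1": [(2, 200), (283, 314)],
    "domain 2": [(201, 282)],
}


def domain_of(residue_number):
    """Return the name of the domain containing the residue, or None."""
    for name, segments in DOMAINS.items():
        for start, end in segments:
            if start <= residue_number <= end:
                return name
    return None


# Ligand-contacting residues named in this discussion (Ala13 to Thr17).
for residue in (13, 14, 15, 16, 17):
    print(residue, domain_of(residue))
```

With these boundaries, all five ligand-contacting residues fall in Domain 1, the Rossmann-fold domain.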
Binding sites and active sites of proteins and DNAs are often associated with structural pockets and cavities. The CASTp server was used to identify and measure both surface-accessible pockets and inaccessible interior cavities. A total of 19 pockets were found in tRNA-isopentenyl transferase, with pocket 19 having the largest area and volume. Liang et al. (1998) note that the largest pocket or cavity is often the active site, although there are a number of instructive exceptions. The shape and size parameters of protein pockets and cavities are thus important for active site analysis.
Laskowski R.A. postulated that cleft volumes in proteins relate to their molecular interactions and functions. In over 83% of proteins, the ligand was found to be bound in the largest cleft, which suggests that size is a functional requirement. Here, structural analysis of 2qgn identified a sulfate ion as the main ligand in the structure. This ion is located in the surface cleft with the largest volume, in agreement with Laskowski's finding. However, given the function of the enzyme, it is also possible that the clefts allow the ribosome to bind and interact with this enzyme during translation.
Structural alignment enabled us to visualise the similarities between 2qgn and other related proteins. There are several closely related proteins, most of which are also isopentenyl transferases. As seen from the DALI output, 2qgn is structurally similar to 3crr, 3crq, 3crm and 2ze5 to 2ze8. According to ProFunc, 3crr, 3crq and 3crm are identical, which is also evident from their similar Z-scores. PDBsum likewise shows that 3crr, 3crm and 3crq are similar structures, since all three are structures of tRNA dimethylallyltransferase. Next in line are proteins 2ze5 to 2ze8, which show consistent Z-scores and sequence identities. PDBsum annotates 2ze5 to 2ze8 as crystal structures of adenosine phosphate-isopentenyltransferase complexed with zinc ion and a substrate analog. While 3crr, 3crm and 3crq originate from Pseudomonas aeruginosa, 2ze5 to 2ze8 are proteins isolated from Agrobacterium tumefaciens. More than one protein therefore resembles 2qgn. In contrast, 2qor fits only partially with 2qgn upon structural alignment. This protein, a guanylate kinase, shares a certain degree of fold similarity with 2qgn but is not structurally similar overall.
Further structural comparison is required to determine the similarity of tRNA-isopentenyl transferase to other proteins and hence to deduce other possible functions. Although this protein belongs to the homologous superfamily of P-loop containing nucleotide triphosphate hydrolases, the exact location of the P-loop in this protein was not confirmed. Xie et al. (2007), in analysing the structure of tRNA dimethylallyltransferase, a structure highly similar to our query protein, proposed that the portion of the structure homologous to the kinase forms a cleft that contains the conserved P-loop, the site of ATP binding in kinases. This suggests that the P-loop region may be located in the cleft of the protein. In that structure, the P-loop region was identified through a conserved GxTxxGK(T/S) motif, similar to the P-loop in kinases as well as in other nucleotide binding proteins.
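A motif written as GxTxxGK(T/S) can be searched for directly in a protein sequence with a regular expression, where x stands for any residue and (T/S) for either threonine or serine. The sketch below is a minimal illustration of such a scan and is not the procedure used by Xie et al.; the function name find_p_loops is hypothetical, and the example fragment is invented for demonstration and is not the 2qgn sequence.

```python
import re

# GxTxxGK(T/S): "." matches any single residue, "[TS]" matches T or S.
P_LOOP = re.compile(r"G.T..GK[TS]")


def find_p_loops(sequence):
    """Return (1-based start position, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in P_LOOP.finditer(sequence.upper())]


# Invented fragment for illustration only; NOT the real 2qgn sequence.
example = "MKEKLVAIVGPTAVGKTKLS"
print(find_p_loops(example))
```

Running the same scan on the real sequence of the query protein would show whether, and at which residue numbers, this pattern occurs.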
The CATH results were further supported by the DALI results, which show that adenylate kinase has structural similarities with tRNA-IPT. Adenylate kinase has a P-loop motif, formed between a β-strand and an α-helix, at its N-terminus. This implies that tRNA-IPT might have this motif as well. However, the motif pattern of the P-loop in adenylate kinase is not found in tRNA-IPT. Moreover, guanylate kinase also has a P-loop but does not share the motif pattern of adenylate kinase. Divergence of this motif appears to have resulted in the structural differences seen when guanylate kinase and tRNA-IPT are superimposed (Figure 10 in Results). Thus the P-loop motif in tRNA-IPT is still to be identified, but it is highly likely that it is located near the ligand binding site.
As found from the structural analysis (PDB and ProFunc), the ligand interacts with the sequence AVGKT at positions 775-780 of the multiple sequence alignment. This sequence lies towards the N-terminus of the protein, between a β-strand and an α-helix (PDBsum). This supports the possibility that a P-loop is present in tRNA-IPT.
Cloning of the human tRNA isopentenyltransferase revealed a C2H2 Zn finger motif. According to the article by Anna Golovko, this motif is always found in eukaryotic organisms, although there are some exceptions such as Arabidopsis thaliana, C. elegans and S. pombe. It is surprising that this motif is present as a single copy, as it usually occurs together with other zinc fingers. Moreover, the common role of this motif is in protein-RNA interaction, but this might not be its function in eukaryotes. The article suggested that the zinc finger motif may be involved in nuclear retention (La Casse 1995) and in the stability of the enzyme conformation (Chong 1995).
Generally, the zinc finger motif has conserved glycine and tryptophan residues along the protein sequence, together with cysteine and histidine residues at the extreme ends that are involved in coordinating zinc. In C2H2 zinc fingers, the best conserved regions, which maintain the structural integrity of the protein, are the conserved aliphatic and aromatic residues.
It is surprising that, according to LOCATE, the mouse enzyme is found in the cytosol and mitochondria. To our knowledge, tRNAs are involved in translation, which occurs in the cytoplasm, so the enzyme acting on them should be in the cytoplasm as well. However, the human enzyme is found in the nucleus. This may be because the human enzyme has a nuclear localization signal that is also found in yeast IPTs. This signal is not found in Bacillus halodurans, because the signal sequence lies in the additional residues present in the human and yeast enzymes (Anna Golovko 2000).
The DALI results showed that several other enzymes are structurally related to tRNA-IPT. In order of decreasing similarity, they are isopentenyl transferase, guanylate kinase and adenylate kinase. From a biochemical viewpoint, proteins with similar structures may have similar functions as well. The functions of these enzymes were therefore studied to further understand and support the existing function of our protein.
Information on the functions of these enzymes was retrieved from InterPro. Guanylate kinase catalyses the ATP-dependent phosphorylation of GMP into GDP, while adenylate kinase catalyses the Mg-dependent reversible conversion of ATP and AMP to two molecules of ADP, an essential reaction for many processes in living cells. The two enzymes act on different molecules, but the reactions that they catalyse both involve phosphorylation.
The ability to phosphorylate is lost in isopentenyl transferases, which are also known as dimethylallyl transferases. This enzyme adds the isopentenyl group to an adenine base situated at the 5'-terminal phosphate group. So far, tRNA-IPT is only known to function as an isopentenyl transferase and has not been shown to phosphorylate. However, the possibility that tRNA-IPT is able to phosphorylate cannot be excluded.
The P-loop motif found in humans, E. coli and S. cerevisiae (Anna Golovko 2000) might be involved in ATP/GTP binding. According to Saraste, this loop, together with the zinc finger motif, plays a role in ligand binding. Such functional motifs may have originated from the same superfamily and subsequently diverged from each other while retaining some conserved functional regions.
tRNA-IPT is generally expressed in all cells, although its expression is higher in several tissues. Ubiquitous expression ensures that efficient and correct translation takes place in each cell. Expression is particularly high in oocytes and adipose tissue. The former constantly undergo cell division and differentiation, while the latter is often involved in energy production, energy storage and hormone production. All of these processes require high levels of enzyme activity and the presence of different adaptor molecules and transcription factors.
Multiple sequence alignment (MSA) of 2qgn with other proteins revealed several conserved regions, shown in Figure 1. These regions are (a) 750 bp to 870 bp, (b) 880 bp to 1000 bp, (c) 1160 bp to 1270 bp and (d) a region at the C-terminus. Combining the conservation analysis with the structural analysis showed that these conserved regions surround the ligand. They may play a role in the interaction with the ligand as well as in forming the active site. The conserved regions were also annotated as a P-loop motif, which may well be conserved among other species.
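Positions in a multiple sequence alignment count gap characters, so they do not correspond directly to residue numbers in the structure. Assuming the positions quoted above are alignment columns, the sketch below shows one way to convert a column into a residue number for a single aligned row. The function name column_to_residue and the toy row are hypothetical; the row is invented for illustration and is not taken from the alignment in Figure 1.

```python
def column_to_residue(aligned_row, column, first_residue=1):
    """Map a 1-based alignment column to a residue number in one aligned row.

    aligned_row   -- one sequence from the alignment, with "-" marking gaps
    column        -- 1-based column index in the alignment
    first_residue -- residue number of the first non-gap character in the row
    Returns None when the row has a gap at that column.
    """
    if not 1 <= column <= len(aligned_row):
        raise ValueError("column lies outside the alignment")
    if aligned_row[column - 1] == "-":
        return None
    # Every gap before this column shifts the residue number down by one.
    gaps_before = aligned_row[:column - 1].count("-")
    return first_residue + (column - 1) - gaps_before


# Toy aligned row, invented for illustration only.
row = "--MK-EL--AVGKT"
print(column_to_residue(row, 10))  # column 10 holds the fifth residue of this row
```

Applying the same conversion to the 2qgn row of the real alignment would allow the conserved columns to be compared directly with the residue numbers reported by the structural analysis.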
Figure 2 shows that vertical inheritance is dominant among the homologous genes. A notable exception, however, is the two Plasmodium species. Plasmodium berghei and Plasmodium yoelii lie on the same branch as the bacteria and in the same subfamily as Staphylococcus saprophyticus and Staphylococcus aureus. Lateral gene transfer (LGT) may therefore be suspected for Plasmodium.
The proximity of the Plasmodium and Staphylococcus species is shown more clearly in Figure 3. Plasmodium is best known as the etiological agent of human malaria (Perez-Tris, Hasselquist et al. 2005). Although Plasmodium parasites infect a variety of vertebrate hosts (including primates, rodents, ungulates, birds and lizards), they rarely cause severe disease in any vertebrate host other than humans. The two Plasmodium species are distributed in Africa and have rodents as their hosts (Martinsen, Perkins et al. 2008), whereas the two Staphylococcus species occur as commensals on humans as well as on domesticated animals such as dogs, cats, horses and chickens (Voyich, Sturdevant et al. 2008). Both genera therefore have humans as a common host, which may reflect a strong lateral signal rather than the shared evolutionary history that is strongly supported in most of the tree. According to Beiko, Harlow et al. (2005), transfer between distantly related taxa is less likely but not impossible, and it occurs via illegitimate integration of foreign DNA into the genome rather than typical homologous recombination. If LGT among distantly related organisms is the reason behind the anomalous grouping of Plasmodium, processes such as niche invasion may be greatly accelerated, and genes that confer advantages against host defence mechanisms may be widely shared among pathogens. Our gene, however, appears to be responsible for maintaining the stability of translation (possible functions are discussed in more detail below). It is therefore suspected to be an informational gene, and informational genes are less subject to LGT than operational genes within the euryarchaeotes (Nesbo, Boucher et al. 2001). Further evidence against the LGT hypothesis is that the branch for Plasmodium has a very low bootstrap value (shown in Figure 3), which suggests that the grouping could be erroneous.
Although Plasmodium shares common genes with bacteria, further research on the protein structure must be undertaken to support the claim. At present there is insufficient evidence to show that lateral gene transfer has occurred in Plasmodium. To conclude, the phylogenetic tree suggests that this protein shows a high level of conservation from Bacteria to Eukaryotes. The conservation of this protein across a wide range of species, despite constant evolution, clearly implies its significance.
ClustalX is limited to aligning sequences from simple organisms. This study involved numerous genes, and a multiple sequence alignment of all of them could not be performed with ClustalX. The tree in Figure 2 may therefore be inaccurate because of the subjective choice of representatives. A better, state-of-the-art alternative to ClustalX is FUGUE, a search engine designed to align multiple sequences and multiple structures, enriching the conservation or variation information used to determine evolutionary homology. The algorithm combines information from both multiple sequences and multiple structures, which improves both homologue recognition and alignment quality. With respect to biological function, protein structure is presumed to be more informative and better preserved during evolution than sequence. Hence, FUGUE is a better option for homology detection and comparative modelling (Shi, Blundell et al. 2001).