Results - 2qgnA
Multiple Sequence Alignment
Most of the 500 BLAST search results are significant matches with extremely low E-values. Twenty-five of the 500 matches have an E-value of zero. Because the E-value is the number of equally good matches expected by chance, a value of zero marks a hit as highly significant rather than insignificant; these 25 matches were nevertheless left out of the analysis below. Some similar sequences with nearly identical annotations were also dropped to simplify the alignment.
Because the homologues of the human sequence include eukaryotes as well as many other organisms such as plants and microorganisms, the bacterial sequence did not need to be considered separately at this stage. I took 55 matches with extremely low E-values from the homologues of the human sequence, and constructed a multiple sequence alignment and a bootstrap tree from them.
The sequence CDLCDRIIIGDREWAAHIKSKSH shown in Figure 1 (D), which is deemed to be a zinc finger (discussed further below), is found only in the human sequence and not in the bacterial one. It lies towards the C-terminus and is probably truncated in the bacterial sequence, which is why this region is not conserved in the multiple sequence alignment.
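The relationship between alignment score and E-value can be sketched as follows. This is a minimal illustration, assuming the standard Karlin-Altschul form E = m * n * 2^(-S'), where S' is the bit score and m and n are the effective query and database lengths. The function names and the cutoff of 1e-5 are hypothetical and are not the settings used in this study.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Estimate E = m * n * 2**(-S'): chance matches expected at this bit score."""
    return query_len * db_len * 2.0 ** (-bit_score)


def keep_significant(hits, max_evalue=1e-5):
    """Keep (hit_id, evalue) pairs whose E-value is at or below the cutoff.

    The cutoff is an illustrative assumption, not a value from the report.
    """
    return [(hit_id, evalue) for hit_id, evalue in hits if evalue <= max_evalue]


# Placeholder hits (made-up identifiers and values, for illustration only).
example_hits = [("hit_a", 0.0), ("hit_b", 3e-40), ("hit_c", 0.2)]
significant = keep_significant(example_hits)
```

A higher bit score always gives a smaller E-value under this formula, so a hit with an E-value of zero passes any such cutoff.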
Although a couple of branches are marked with asterisks, the phylogenetic tree shows that our protein (tRNA isopentenyl transferase 1) is found across many types of species, and the tree is consistent with traditional taxonomic groupings (shown in Figure 2). A notable exception is Plasmodium, an obligate eukaryotic parasite. Close homologues are detected in many different groups of organisms (fungi, green plants, worms, unicellular organisms, bacteria and even some higher eukaryotes), indicating that the source of our genes may have been outside the Bacteria clade. The homologous sequences come from many bacterial phyla: Planctomycetes, Proteobacteria, Actinobacteria, Chloroflexi, Cyanobacteria, Aquificae and Firmicutes. The higher eukaryotes include human, mouse, cow, fly, platypus, frog, fish, honeybee and bird.
In Figure 3, Plasmodium berghei and Plasmodium yoelii branch within the bacterial species. One possible reason is that lateral gene transfer has occurred in Plasmodium, so that its sequences group with the bacteria instead of falling on the eukaryote branch. This is a remarkable outcome of this research; more advanced genome analysis will be required to determine the possible function of this protein.
Treeview and multiview
Structure of tRNA isopentenyltransferase
Protein Sequence in FASTA format
>gi|152149497|pdb|2QGN|A Chain A, Crystal Structure Of Trna Isopentenylpyrophosphate Transferase (Bh2366) From Bacillus Halodurans, Northeast Structural Genomics Consortium Target Bhr41.
XKEKLVAIVGPTAVGKTKTSVXLAKRLNGEVISGDSXQVYRGXDIGTAKITAEEXDGVPHHLIDIKDPSE
SFSVADFQDLATPLITEIHERGRLPFLVGGTGLYVNAVIHQFNLGDIRADEDYRHELEAFVNSYGVQALH
DKLSKIDPKAAAAIHPNNYRRVIRALEIIKLTGKTVTEQARHEEETPSPYNLVXIGLTXERDVLYDRINR
RVDQXVEEGLIDEAKKLYDRGIRDCQSVQAIGYKEXYDYLDGNVTLEEAIDTLKRNSRRYAKRQLTWFRN
KANVTWFDXTDVDFDKKIXEIHNFIAGKLEEKSKLEHHHHHH
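A FASTA record like the one above can be read programmatically. The sketch below is a minimal parser, assuming the conventional layout of a header line starting with ">" followed by one or more wrapped sequence lines. The function names are hypothetical, and the toy record is a shortened placeholder rather than the full 2QGN sequence.

```python
def read_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Wrapped sequence lines are joined back into one string.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def unknown_positions(sequence):
    """1-based positions of residues written as X in the sequence."""
    return [i for i, residue in enumerate(sequence, start=1) if residue == "X"]


# Toy record (placeholder, not the real entry).
toy = ">toy|example\nXKEKL\nVAIVG\n"
toy_records = read_fasta(toy)
```

Listing the X positions is useful here because the deposited sequence contains several residues written as X.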
Analysis of the secondary structure obtained from the Protein Data Bank gave the results displayed below:
Surface Structure of 2qgn
Electrostatic Surface Potential
CATH analysis of 2qgn shows that 2qgnA is composed of two main domains.
Domain 1 comprises residues 2-200 and 283-314. Domain 2 comprises residues 201-282.
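The domain boundaries quoted above can be written down as a small lookup, which makes it easy to check which domain a given residue belongs to and how many residues each domain spans. This is only a sketch: the residue ranges are taken from the text, while the names `DOMAINS`, `domain_of` and `domain_size` are hypothetical.

```python
# Residue ranges (inclusive) as quoted from the CATH analysis in the text.
DOMAINS = {
    "Domain 1": [(2, 200), (283, 314)],
    "Domain 2": [(201, 282)],
}


def domain_of(residue):
    """Return the name of the domain containing this residue number, or None."""
    for name, segments in DOMAINS.items():
        for start, end in segments:
            if start <= residue <= end:
                return name
    return None


def domain_size(name):
    """Count the residues covered by all segments of the named domain."""
    return sum(end - start + 1 for start, end in DOMAINS[name])
```

With these ranges, Domain 1 is discontinuous (two segments totalling 231 residues) and Domain 2 is a single insert of 82 residues.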
Ligand Binding Sites and Surface Clefts
Hydrophilic binding sites
Bridged-H-bond binding sites
Hydrophobic binding sites
Conserved residues for tRNA isopentenyl transferase from Clustal alignment
The multiple sequence alignment from ClustalX identified regions conserved between 2qgn and its homologues in related species.
The PDB entry code for 2qgn was submitted to the DALI server to search for structurally similar neighbours. The results of the DALI search are displayed below:
The DALI output reports the following:
Z-score: the statistical significance of the similarity between the protein of interest and its neighbouring proteins. The program optimises a weighted sum of similarities of intramolecular distances.
RMSD (root-mean-square deviation): the root-mean-square deviation of C-alpha atoms in the least-squares superimposition of the structurally equivalent C-alpha atoms. As indicated in DALI, RMSD is not optimised and is reported for information only.
lali: the number of structurally equivalent residues.
nres: the total number of amino acids in the hit protein.
%id: the percentage of identical amino acids over structurally equivalent residues.
A total of 527 hits were found in the DALI search; only the first 20 hits, which may be of significance, are shown in the figure.
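Two of these quantities are simple enough to illustrate directly. The sketch below is not DALI's implementation: it assumes the two structures have already been superimposed and that the structurally equivalent C-alpha atoms are supplied in matching order, and it treats "-" as a gap in a pairwise alignment. The function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between paired (x, y, z) coordinates of equivalent C-alpha atoms.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def aligned_identity(aligned_a, aligned_b):
    """Return (lali, %id) for two aligned sequences of equal length."""
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no aligned residue pairs")
    identical = sum(1 for a, b in pairs if a == b)
    return len(pairs), 100.0 * identical / len(pairs)
```

For example, `aligned_identity("ACDEF", "ACE-F")` counts four aligned pairs, three of them identical, giving a %id of 75.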
Related protein sequences
Proteins with similar fold retrieved from SSM (Secondary Structure Matching)
The related proteins and proteins with a similar fold reported by Profunc were compared with the results from DALI. 2qgnA, the query protein, is highlighted in black in all tables. 2crm, 2crr and 2crq were found by both DALI and Profunc (highlighted in red). In addition, 2ze5, 2ze6, 2ze7 and 2ze8, as well as 3adk and 2qor, were also found in the DALI output (highlighted in blue).
Based on the outcome of DALI and Profunc, the PDB file of each structurally similar protein was obtained from the PDB. Each was superimposed on 2qgn using the PyMOL software to compare structural similarity. The results are shown below:
As the figures above indicate, each structure is structurally similar to 2qgn, suggesting that they could have similar functional properties. Note, however, that 2qor is only partially similar to the 2qgn structure.
In the DALI output, structural similarity decreases as the Z-score decreases. For this reason, functional analysis of 2qgn was carried out only for DALI hits with lali values higher than 200.
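The selection rule described here (keep hits with lali above 200, ranked by Z-score) can be sketched as a simple filter. The record fields follow the DALI columns listed earlier; the class and function names are hypothetical, and the example hits are made-up placeholders rather than values from the actual DALI output.

```python
from typing import NamedTuple


class DaliHit(NamedTuple):
    chain: str     # PDB code plus chain identifier
    z: float       # Z-score
    rmsd: float    # RMSD of equivalent C-alpha atoms
    lali: int      # number of structurally equivalent residues
    nres: int      # residues in the hit protein
    pct_id: float  # % identity over equivalent residues


def select_for_analysis(hits, min_lali=200):
    """Keep hits with lali strictly above min_lali, highest Z-score first."""
    kept = [hit for hit in hits if hit.lali > min_lali]
    return sorted(kept, key=lambda hit: hit.z, reverse=True)


# Placeholder hits (invented values, for illustration only).
placeholder_hits = [
    DaliHit("hitB", 18.5, 3.1, 240, 300, 20.0),
    DaliHit("hitC", 9.0, 3.8, 150, 210, 12.0),
    DaliHit("hitA", 30.0, 2.0, 280, 310, 25.0),
]
selected = select_for_analysis(placeholder_hits)
```

Sorting by Z-score after filtering keeps the most similar structures at the top of the list, in the same order DALI reports them.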
Localisation and Expression of tRNA isopentenyltransferase
In general, this enzyme is expressed in all tissue types, since functional proteins must be synthesised in each of these tissues. It is most highly expressed in adipose tissue and oocytes, and relatively high amounts are also expressed in the prostate, adrenal gland, B-cells and trachea. The higher concentrations of tRNA-IPT in these tissues may reflect higher levels of protein synthesis.
Both the molecular function and the biological processes were obtained from ProKnow.
Annotations for tRNA isopentenyltransferase
The annotations below show the cellular location, biological processes and molecular function of the protein in plants. tRNA-IPT was first found in plants, where it is a very important hormone-related enzyme that affects plant growth and development.