Materials and Methods - 2qgnA

Revision as of 21:42, 8 June 2008 by Emilyntan (talk | contribs)

Determination of Structure

The sequence of the given protein was obtained by searching PDB and NCBI Entrez. The retrieved data were loaded into PyMOL to construct the 3D structure and enable further analysis. Proteins with similar structures were then identified using the DALI server, which returned a list of structural neighbours with a considerable degree of structural similarity. Comparing 3D structures can expose biologically interesting similarities that may not be detectable by sequence alignment. A CE comparison, which calculates a structural alignment for two chains, was performed among the identified structures. Surface properties of the proteins were analysed using PyMOL. ProFunc, PDBsum and InterPro were also used to identify the ligands and clefts on the protein. The CATH domain database was used to find domain boundaries, which were visualised in PyMOL.

Determination of Function

Expression data for the enzyme tRNA-IPT in mouse and human were obtained from SymAtlas. The subcellular localisation of the enzyme was found using LOCATE. The zinc finger motif was identified in the human sequence using a motif finder. It was also found in other organisms such as S. cerevisiae, C. elegans and S. pombe (Anna Golovko, 2000) by aligning their sequences with other known proteins containing a zinc finger. Interactions with other proteins were identified using STRING, and the occurrence of the protein across different species was analysed. A literature search on this enzyme also helped to determine its function.
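The motif-finder step can be illustrated with a small pattern scan. This is only a sketch: the source does not name the tool or the exact motif definition that was used, so the pattern below (a PROSITE-style classical C2H2 zinc finger pattern, C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H) is an assumption, and the sequence `toy` is an invented test string, not the tRNA-IPT sequence.

```python
import re

# Assumption: a PROSITE-style classical C2H2 zinc finger pattern,
# C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H, written as a regular expression.
# The motif finder actually used for this page may apply a different definition.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_zinc_fingers(sequence):
    """Return (start, end, matched_text) for each non-overlapping match.

    Positions are 1-based and inclusive, as is usual for protein residues.
    """
    seq = sequence.upper()
    return [(m.start() + 1, m.end(), m.group()) for m in C2H2_PATTERN.finditer(seq)]


# Invented toy sequence (hypothetical, for illustration only).
toy = "AACPECGKSFSQKSNLQKHQRTHTG"
for start, end, text in find_zinc_fingers(toy):
    print(start, end, text)
```

A regular expression only reports whether a fixed spacing of cysteines and histidines is present; profile-based tools are more sensitive, so a miss here would not rule out a zinc finger.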

Determination of Evolution

1. The human amino acid sequence of “pdb: 2qgnA tRNA isopentenyltransferase 1” was retrieved in FASTA format from http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&val=31581534. This was the query sequence and was saved as “sequencesh.fasta”.

2. Conduct a BLASTp search against the NCBI non-redundant protein database Version 2.2.10 and save the result as “seqh.blast”. This database was downloaded on Oct-19-2004.

BLAST the sequencesh.fasta query sequence against the non-redundant protein database:

D:\blast\blastall -p blastp -d D:\blast\databases\nr -i sequencesh.fasta -o seqh.blast 

3. Copy all the accession numbers, annotations and so on, except the actual sequences, and save them as “seqh1.blast”.

4. Using Excel, extract the accession numbers from “seqh1.blast”.

5. 25 of the 500 matches have an E-value of zero, which means they are highly significant; all the others also have very small E-values. Some of the similar sequences with nearly identical annotations were dropped to ease alignment. The result was saved as “seqh2.xls”.

6. Another version is “seqh3.blast” (this file contains only the line-by-line listing of the accession numbers from “seqh2.blast”).

7. Obtain a FASTA-format file of the sequences found with BLAST and save it as “newseqh.fasta”.

D:\blast\fastacmd -d D:\blast\databases\nr -i seqh3.blast -o H:\newseqh.fasta 

8. Construct a multiple sequence alignment and a bootstrap tree from the selected homologous sequences.

9. Click ALIGNMENT/DO COMPLETE ALIGNMENT to generate the multiple alignment using ClustalX.

Output Guide Tree: newseq.dnd
Output Alignment Files: newseq.aln 

10. Click Draw NJ tree to draw a neighbour-joining tree, followed by a bootstrap NJ tree with 100 bootstrap trials. Save them as newseqh.ph and newseqh.phb respectively.

11. Use the web server that converts sequence identifiers into species names: http://foo.maths.uq.edu.au/~huber/BIOL3004/gi2name.pl

12. Load the newseqh.fasta file together with newseqh.phb.

Save the output as newseqh_names.phb

View it in FigTree and take a screenshot of the tree.
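Steps 3 to 6 above (pulling the hit identifiers and E-values out of the BLAST report and writing one accession per line) were done by hand and in Excel. As an alternative, they can be sketched in Python. This is a minimal sketch under stated assumptions, not the procedure actually used: it assumes the plain-text report layout of legacy blastall, in which one-line hit summaries follow the "Sequences producing significant alignments" header and very small E-values may be abbreviated (for example "e-113" for 1e-113). The function names and the `SAMPLE` report, including its identifiers, are invented for illustration.

```python
import re

# One-line hit summary: identifier, free-text description, bit score, E-value.
HIT_LINE = re.compile(
    r"^(?P<id>\S+)\s+.*?\s+(?P<bits>\d+(?:\.\d+)?)\s+(?P<evalue>\S+)\s*$"
)


def parse_evalue(text):
    """Convert an E-value string to float; 'e-113' is read as 1e-113."""
    if text.startswith("e"):
        text = "1" + text
    return float(text)


def extract_hits(report_text):
    """Return [(identifier, evalue), ...] from the summary table of a report."""
    hits = []
    in_summary = False
    for line in report_text.splitlines():
        if line.startswith("Sequences producing significant alignments"):
            in_summary = True
            continue
        if not in_summary:
            continue
        if line.startswith(">"):  # the first full alignment ends the summary table
            break
        match = HIT_LINE.match(line)
        if match:
            hits.append((match.group("id"), parse_evalue(match.group("evalue"))))
    return hits


def accession_of(identifier):
    """Return the accession from a 'gi|number|db|accession|' style identifier.

    Assumption: identifiers in other layouts are returned unchanged.
    """
    fields = identifier.split("|")
    if fields[0] == "gi" and len(fields) >= 4:
        return fields[3]
    return identifier


# Invented sample report (hypothetical identifiers, for illustration only).
SAMPLE = """\
Sequences producing significant alignments:                      (bits) Value

gi|111|ref|XP_000001.1| hypothetical example protein A [Species one]   650   0.0
gi|222|ref|XP_000002.1| hypothetical example protein B [Species two]   410   e-113
gi|333|gb|AAA00003.1| hypothetical example protein C [Species three]   55   2e-06

>gi|111|ref|XP_000001.1| hypothetical example protein A [Species one]
"""

hits = extract_hits(SAMPLE)
# One accession per line, the layout described for "seqh3.blast".
print("\n".join(accession_of(identifier) for identifier, _ in hits))
```

The sketch does not reproduce the manual curation in step 5, where sequences with nearly identical annotations were dropped; that choice would still need to be made by inspection.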