Materials and Methods - 2qgnA: Difference between revisions

(18 intermediate revisions by 3 users not shown)
==Evolution==
1. The human amino acid sequence of “pdb: 2qgnA tRNA isopentenyltransferase 1” was retrieved in FASTA format from http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&val=31581534. This query sequence was saved as “sequencesh.fasta”.


2. Run a BLASTp search (BLAST version 2.2.10) against the NCBI non-redundant protein database and save the output as “seqh.blast”. The database was downloaded on 19 October 2004.


Run BLAST with the query sequence sequencesh.fasta against the non-redundant protein database:
D:\blast\blastall -p blastp -d D:\blast\databases\nr -i sequencesh.fasta -o seqh.blast


3. Copy the hit list (accession numbers, annotations and so on), leaving out the actual sequences, and save it as “seqh1.blast”.


4. Using Excel, extract the accession numbers from “seqh1.blast”.


5. 25 of the 500 matches have an E-value of zero, which means they are highly significant; all the others also have very small E-values. Some of the similar sequences with nearly identical annotation are dropped to ease alignment. The result is saved as “seqh2.xls”.
6. A further version, “seqh3.blast”, contains only the line-by-line listing of the accession numbers from “seqh2.blast”.
 
7. Obtain a FASTA format file of the sequences found with BLAST and save it as “newseqh.fasta”:
D:\blast\fastacmd -d D:\blast\databases\nr -i seqh3.blast -o H:\newseqh.fasta
 
8. Construct a multiple sequence alignment and a bootstrap tree from the selected homologous sequences.


9. Click ALIGNMENT/DO COMPLETE ALIGNMENT to generate a multiple alignment using ClustalX.
Output Guide Tree: newseq.dnd
Output Alignment Files: newseq.aln
 
10. Click Draw NJ tree to draw a neighbour-joining tree, followed by a bootstrap NJ tree with 100 bootstrap trials. Save them as newseqh.ph and newseqh.phb respectively.
 
11. Use the web server that converts sequence identifiers into species names: http://foo.maths.uq.edu.au/~huber/BIOL3004/gi2name.pl
 
12. Load up the newseqh.fasta file with newseqh.phb
 
Save the output as newseqh_names.phb
 
View it in FigTree and take screen snapshot of the tree.
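
The identifier-to-species renaming in steps 11 and 12 can be illustrated with a short Python sketch. This is not the web server's implementation. It assumes that each FASTA header starts with `gi|<number>|` and ends with the species name in square brackets, and that the tree leaves carry the same `gi|<number>|` prefix. The function names and the sample records are hypothetical.

```python
import re


def species_by_gi(fasta_text):
    """Map each gi number to the species name found in its FASTA header.

    Assumes headers such as '>gi|111|ref|NP_000001.1| description [Homo sapiens]'.
    """
    mapping = {}
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue
        gi = re.search(r"gi\|(\d+)\|", line)
        species = re.search(r"\[([^\[\]]+)\]\s*$", line)
        if gi and species:
            # Spaces are replaced because unquoted Newick labels cannot hold them.
            mapping[gi.group(1)] = species.group(1).replace(" ", "_")
    return mapping


def rename_leaves(newick, mapping):
    """Replace 'gi|<number>|...' leaf labels with '<Species>_<number>'."""
    def swap(match):
        gi = match.group(1)
        if gi in mapping:
            return f"{mapping[gi]}_{gi}"
        return match.group(0)  # leave unknown identifiers untouched

    return re.sub(r"gi\|(\d+)\|[^:,()]*", swap, newick)


# Hypothetical sample data, for illustration only.
fasta = (
    ">gi|111|ref|NP_000001.1| tRNA isopentenyltransferase [Homo sapiens]\nMKV\n"
    ">gi|222|ref|NP_000002.1| hypothetical protein [Mus musculus]\nMKI\n"
)
tree = "(gi|111|ref|NP_000001.1|:0.1,gi|222|ref|NP_000002.1|:0.2);"
renamed = rename_leaves(tree, species_by_gi(fasta))
```

Keeping the gi number as a suffix keeps leaf labels unique when several sequences come from the same species.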
 
==Determination of Structure==
The PDB and NCBI Entrez were searched to obtain the sequence of the given protein. The data for the protein were retrieved and loaded into PyMOL to display the 3D structure and enable further analysis. Next, proteins with similar structures were identified using the DALI server, which returned a list of structural neighbours with a considerable degree of structural similarity. Comparing the 3D structures exposes biologically interesting similarities that may not be detectable by sequence alignment. A CE comparison was performed among the identified structures; CE provides databases and tools for 3D protein structure comparison and alignment, and calculates a structural alignment for two chains. Surface properties of the given proteins were analysed using PyMOL. ProFunc, PDBsum and InterPro were also used to identify the domains, ligands and clefts on the protein.

Latest revision as of 16:42, 9 June 2008

Determination of Evolution

  1. The human amino acid sequence of “2qgnA tRNA isopentenyltransferase 1” was retrieved in FASTA format from the RCSB Protein Data Bank. It serves as the query sequence (NCBI record: http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&val=31581534).
  2. Run a BLASTp search (BLAST version 2.2.10) against the NCBI non-redundant protein database. The database was downloaded on 19 October 2004.
  3. Extract only the accession numbers of the homologous sequences from the BLAST output.
  4. Obtain, in FASTA format, a selection of the sequences found with BLAST that have extremely low E-values.
  5. Click ALIGNMENT/DO COMPLETE ALIGNMENT to generate a multiple alignment using ClustalX. Click Draw NJ tree to draw a neighbour-joining tree, followed by a bootstrap NJ tree with 100 bootstrap trials.
  6. Use the "Kenegdo Webserver" to convert sequence identifiers into species names.
  7. View the tree in FigTree and take a screenshot of it.

(see Appendix - 2qgnA for more detailed descriptions.)
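
Steps 3 and 4 of the list above (extracting accession numbers and keeping only hits with very low E-values) can also be scripted. The following Python sketch is illustrative only. It assumes one-line hit summaries in the style of legacy BLAST text output (`gi|number|db|accession| description   score   E-value`), and the cut-off of 1e-50 is an assumed placeholder, since the text only says "extremely low". The sample lines are hypothetical.

```python
import re

# identifier, free-text description, bit score, E-value (assumed layout)
HIT_LINE = re.compile(
    r"^(?P<id>\S+)\s+.*?\s+(?P<score>\d+(?:\.\d+)?)\s+(?P<evalue>\S+)\s*$"
)


def parse_evalue(text):
    """Convert an E-value string to float; legacy BLAST writes 1e-104 as 'e-104'."""
    return float("1" + text if text.startswith("e") else text)


def select_hits(lines, max_evalue=1e-50):
    """Return (accession, E-value) pairs for hits at or below max_evalue."""
    selected = []
    for line in lines:
        match = HIT_LINE.match(line)
        if not match or not match.group("id").startswith("gi|"):
            continue
        try:
            evalue = parse_evalue(match.group("evalue"))
        except ValueError:
            continue  # not a hit summary line after all
        if evalue <= max_evalue:
            fields = match.group("id").split("|")
            # gi|<number>|<db>|<accession>| : the accession is the fourth field
            accession = fields[3] if len(fields) > 3 else fields[1]
            selected.append((accession, evalue))
    return selected


# Hypothetical hit summary lines, for illustration only.
sample = [
    "gi|111|ref|NP_000001.1| tRNA isopentenyltransferase 1 [Homo sap...   905   0.0",
    "gi|222|ref|NP_000002.1| tRNA isopentenyltransferase [Mus muscu...   640   e-182",
    "gi|333|ref|NP_000003.1| hypothetical protein [Danio rerio]    52   2e-05",
]
hits = select_hits(sample)
```

An E-value of 0.0 passes any cut-off, which is consistent with such hits being the most significant ones.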

Basic Local Alignment Search Tool (BLAST) enables a researcher to compare a query sequence with a library or database of sequences and identify library sequences that resemble the query sequence above a certain threshold (Altschul, Gish et al. 1990).

To show the evolutionary relationships between species, neighbour-joining phylogenetic trees were constructed with ClustalX. ClustalX first performs pairwise alignments of all sequence pairs to determine sequence similarity, then defines the order in which sequences are added to the alignment based on that similarity, and finally builds the multiple alignment progressively in that order (Kohli and Bachhawat 2003). The neighbour-joining tree is then computed from the resulting multiple alignment.
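
To make the distance-based reasoning concrete, the sketch below computes a simple pairwise distance from aligned sequences and performs the first joining step of the neighbour-joining algorithm. It is a minimal illustration, not ClustalX's implementation: the uncorrected p-distance and the function names are assumptions, and the distance matrix is a small textbook-style example rather than data from this study.

```python
def p_distance(seq_a, seq_b):
    """Fraction of differing residues over aligned columns without gaps."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def first_nj_join(names, dist):
    """Pick the first pair joined by neighbour joining (needs >= 3 taxa).

    Returns (name_i, name_j, branch_length_i, branch_length_j).
    """
    n = len(names)
    totals = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            # Q criterion: small values mark pairs that are close to each
            # other and far from everything else.
            q = (n - 2) * dist[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    length_i = 0.5 * dist[i][j] + (totals[i] - totals[j]) / (2 * (n - 2))
    length_j = dist[i][j] - length_i
    return names[i], names[j], length_i, length_j


# Illustrative five-taxon distance matrix (not data from this study).
taxa = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
joined = first_nj_join(taxa, matrix)
```

The full algorithm repeats this step, replacing the joined pair with a new internal node and recomputing distances until the tree is resolved.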

A bootstrap tree with 100 bootstrap trials was drawn based on the neighbour-joining tree. Bootstrap estimation is formulated as a two-step sampling procedure: (1) sampling of sequences from the evolutionary process and (2) resampling of the original sequence sample. The probability that a bootstrap resampling of an original sequence sample will support the true tree is found to depend on the model tree, the sequence length, and the probability that a randomly chosen nucleotide site is an informative site (Zharkikh and Li 1992). The trees were drawn with the tree figure drawing tool FigTree (Rambaut 2006). All bootstrap trees were displayed in a circular layout.
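
The resampling step of the bootstrap can be sketched as follows. This is an illustration under stated assumptions, not the ClustalX code: alignment columns are drawn with replacement to build each replicate, the function names are hypothetical, and the fixed seed is only there to make the example repeatable. The toy alignment is invented.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Build one replicate by sampling alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    # The same column choice is applied to every sequence so that columns
    # stay intact.
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment, trials=100, seed=0):
    """Return `trials` resampled alignments (100 matches the trials used above)."""
    rng = random.Random(seed)
    return [bootstrap_replicate(alignment, rng) for _ in range(trials)]


# Toy alignment, for illustration only.
toy = {"seq1": "MKVLA", "seq2": "MKILA", "seq3": "MRVLG"}
replicates = bootstrap_replicates(toy)
```

A tree is then built from each replicate, and the support for a grouping is the number of replicate trees in which that grouping appears.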

Determination of Structure

The PDB and NCBI Entrez were searched to obtain the sequence of the given protein. The data for the protein were retrieved and loaded into PyMOL to display the 3D structure and enable further analysis. Next, proteins with similar structures were identified using the DALI server, which returned a list of structural neighbours with a considerable degree of structural similarity. Comparing the 3D structures exposes biologically interesting similarities that may not be detectable by sequence alignment. A CE comparison was performed among the identified structures; CE provides databases and tools for 3D protein structure comparison and alignment, and calculates a structural alignment for two chains. Surface properties of the given proteins were analysed using PyMOL. ProFunc, PDBsum and InterPro were also used to identify the ligands and clefts on the protein. The CATH domain database was used to search for domain boundaries, which were visualised in PyMOL.
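
Structural comparison tools commonly summarise the agreement between two aligned chains as a root-mean-square deviation (RMSD) over matched atoms. The sketch below shows only that final calculation. It assumes the two coordinate lists are already matched residue by residue and already superposed, which the alignment tools themselves handle; the function name and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Assumes the atoms are already paired and the structures superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates: the second set is shifted by 1 along z.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(chain_a, chain_b)
```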

Determination of Function

Expression data for the enzyme tRNA-IPT in mouse and human were obtained from SymAtlas. The subcellular localisation of the enzyme was found using LOCATE. The molecular function and biological processes were obtained from ProKnow. The zinc finger motif was identified in the human sequence using a motif finder. It was also found in other organisms such as S. cerevisiae, C. elegans and S. pombe (Golovko 2000) by aligning their sequences with other known proteins containing a zinc finger. Interactions of the protein with other proteins were identified using STRING, and the occurrence of the protein across different species was analysed. A literature search on this enzyme also helped to determine its function.
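
The text does not name the motif finder or the motif definition that was used. As an illustration only, a pattern search of this kind can be sketched with a regular expression. The pattern below follows the C2H2-type zinc finger signature as commonly written in PROSITE syntax, C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H; that this is the motif definition used here is an assumption, and the test sequence is invented.

```python
import re

# C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H translated into a regular expression.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_zinc_fingers(sequence):
    """Return (start, end, matched_text) tuples with 1-based inclusive positions."""
    return [
        (m.start() + 1, m.end(), m.group(0))
        for m in C2H2_PATTERN.finditer(sequence.upper())
    ]


# Invented toy sequence containing one motif instance, for illustration only.
toy_sequence = "MK" + "CAAC" + "AAA" + "L" + "AAAAAAAA" + "H" + "AAA" + "H" + "GG"
motifs = find_zinc_fingers(toy_sequence)
```

`re.finditer` reports non-overlapping matches only, which is usually sufficient for locating a single motif in a protein sequence.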