Appendix - 2qgnA: Difference between revisions


Revision as of 10:10, 9 June 2008

1. The human amino acid sequence of “pdb: 2qgnA tRNA isopentenyltransferase 1” was retrieved in FASTA format from http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&val=31581534. This was the query sequence and was saved as “sequencesh.fasta”.

2. Run a BLASTp search with the query against the NCBI non-redundant protein database, Version 2.2.10, and save the output as “seqh.blast”. This database was downloaded on 19 October 2004.

BLAST the sequencesh.fasta query sequence against the non-redundant protein database:

D:\blast\blastall -p blastp -d D:\blast\databases\nr -i sequencesh.fasta -o seqh.blast 

3. Copy everything from “seqh.blast” except the actual sequences (accession numbers, annotations and so on) and save it as “seqh1.blast”.

4. Using Excel, extract the accession numbers from “seqh1.blast”.

5. Of the 500 matches, 25 have an E-value of zero, which marks them as the most significant hits rather than insignificant ones, and all the others have very small E-values. Some of the similar sequences with nearly identical annotation were dropped to ease the alignment. The reduced list was saved as “seqh2.xls”.

6. A further version, “seqh3.blast”, contains only the accession numbers from “seqh2.blast”, listed one per line.

7. Retrieve the sequences found with BLAST in FASTA format and save them as “newseqh.fasta”:

D:\blast\fastacmd -d D:\blast\databases\nr -i seqh3.blast -o H:\newseqh.fasta 
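
Steps 3 to 6 were done by hand and in Excel. As a rough illustration only, the same extraction could be scripted along the following lines. The sketch assumes the one-line hit summaries of the legacy blastall text report (identifier first, bit score and E-value last). It also assumes that "nearly identical annotation" can be approximated by an exact match of the summary annotation text, which is a cruder rule than the manual selection described above. The identifiers in the example are made up, and all function names are hypothetical.

```python
def parse_evalue(text):
    """Legacy BLAST prints 1e-113 as 'e-113'; float() needs the leading 1."""
    return float("1" + text if text.startswith("e") else text)


def extract_hits(lines):
    """Return (identifier, annotation, evalue) for each one-line hit summary."""
    hits = []
    in_summary = False
    for line in lines:
        if line.startswith("Sequences producing significant alignments"):
            in_summary = True
            continue
        if not in_summary:
            continue
        if line.startswith(">"):  # first full alignment: the summary is over
            break
        parts = line.split()
        if len(parts) < 3:  # blank line or too short to be a hit summary
            continue
        identifier = parts[0]
        annotation = " ".join(parts[1:-2])
        hits.append((identifier, annotation, parse_evalue(parts[-1])))
    return hits


def drop_duplicate_annotations(hits):
    """Keep only the first hit for each annotation text (assumed rule)."""
    seen = set()
    kept = []
    for identifier, annotation, evalue in hits:
        key = annotation.lower()
        if key not in seen:
            seen.add(key)
            kept.append((identifier, annotation, evalue))
    return kept


def accession_of(identifier):
    """Pull the accession out of a 'gi|number|db|accession|' identifier."""
    fields = identifier.split("|")
    return fields[3] if len(fields) > 3 and fields[0] == "gi" else identifier


# Made-up report fragment, for illustration only.
example = [
    "Sequences producing significant alignments:                      (bits) Value",
    "",
    "gi|111|ref|XP_000001.1| tRNA isopentenyltransferase 1 [Homo...   925   0.0",
    "gi|222|ref|XP_000002.1| tRNA isopentenyltransferase 1 [Homo...   900   0.0",
    "gi|333|ref|XP_000003.1| hypothetical protein [Mus musculus]     410   e-113",
    "",
    ">gi|111|ref|XP_000001.1| tRNA isopentenyltransferase 1 [Homo sapiens]",
]

hits = extract_hits(example)
kept = drop_duplicate_annotations(hits)
accessions = [accession_of(identifier) for identifier, _, _ in kept]
print("\n".join(accessions))  # one accession per line, as in seqh3.blast
```

Writing the printed list to a file gives the one-per-line layout that step 6 describes for “seqh3.blast”.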

8. Construct a multiple sequence alignment and a bootstrap tree from the selected homologous sequences.

9. Click ALIGNMENT/DO COMPLETE ALIGNMENT to generate a multiple alignment using ClustalX.

Output Guide Tree: newseq.dnd
Output Alignment Files: newseq.aln

10. Click Draw NJ tree to draw a neighbour-joining tree, then draw a bootstrap NJ tree with 100 bootstrap trials. Save them as newseqh.ph and newseqh.phb respectively.
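
To show what the 100 bootstrap trials in step 10 refer to, here is a minimal sketch of bootstrap resampling of alignment columns. It is not ClustalX's implementation. To stay short it does not build a neighbour-joining tree in each replicate; as a stand-in, it counts how often one sequence's nearest neighbour (by the fraction of differing sites) is a given partner. The toy alignment and all names are made up for illustration.

```python
import random


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement, keeping the length."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


def p_distance(a, b):
    """Fraction of differing sites, ignoring columns with a gap in either."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def support_for_pair(alignment, a, b, trials=100, seed=0):
    """Share of bootstrap replicates in which b is the nearest neighbour of a."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        sample = bootstrap_columns(alignment, rng)
        others = [name for name in sample if name != a]
        nearest = min(others, key=lambda n: p_distance(sample[a], sample[n]))
        if nearest == b:
            wins += 1
    return wins / trials


# Toy alignment, made up for illustration.
toy = {
    "A": "MKTLLVAGSG",
    "B": "MKTLLVAGSA",
    "C": "MRSLIVCGTA",
    "D": "MRSLIVCGTG",
}

support = support_for_pair(toy, "A", "B", trials=100)
print(f"A groups with B in {support:.0%} of replicates")
```

In a real bootstrap tree the per-replicate step is a full tree reconstruction, and the value written at each internal node is the number of replicates that recover that grouping.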

11. Use the web server that converts sequence identifiers into species names: http://foo.maths.uq.edu.au/~huber/BIOL3004/gi2name.pl. Load newseqh.fasta together with newseqh.phb and save the output as newseqh_names.phb.

12. View the renamed tree in FigTree and take a screenshot of it.
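
The renaming in step 11 is done by the web server, whose implementation is not described here. The sketch below only illustrates the idea. It assumes that each leaf label in the tree equals the first word of a FASTA header and that the species name is the last bracketed field of that header. Both are assumptions, and the identifiers and function names are made up.

```python
import re


def species_by_identifier(fasta_lines):
    """Map the first word of each FASTA header to its bracketed species name."""
    names = {}
    for line in fasta_lines:
        if not line.startswith(">"):
            continue
        header = line[1:].strip()
        identifier = header.split()[0]
        found = re.findall(r"\[([^\[\]]+)\]", header)
        if found:
            names[identifier] = found[-1]
    return names


def rename_leaves(newick, names):
    """Replace leaf labels that have a known species name; keep the rest."""
    def swap(match):
        prefix, label = match.group(1), match.group(2)
        species = names.get(label)
        if species is None:
            return match.group(0)
        # Unquoted Newick labels cannot contain spaces, so use underscores.
        return prefix + species.replace(" ", "_")

    # A leaf label follows '(' or ',' and ends at ':', ',', ')' or ';'.
    return re.sub(r"([(,]\s*)([^(),:;\s]+)", swap, newick)


# Made-up input, for illustration only.
fasta = [
    ">gi|111|ref|XP_000001.1| tRNA isopentenyltransferase 1 [Homo sapiens]",
    "MKTLLVAGSG",
    ">gi|222|ref|XP_000002.1| hypothetical protein [Mus musculus]",
    "MKTLLVAGSA",
]
tree = "(gi|111|ref|XP_000001.1|:0.1,\ngi|222|ref|XP_000002.1|:0.2,unknown:0.3);"

renamed = rename_leaves(tree, species_by_identifier(fasta))
print(renamed)
```

Bootstrap values written after a closing parenthesis are left untouched, because only labels that directly follow "(" or "," are treated as leaves.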

