Materials and Methods 5

From MDWiki
Latest revision as of 01:25, 12 June 2007

Phylogeny

The FASTA sequence of 1PUJA was obtained from NCBI Entrez Protein and used as a BLASTP query against the non-redundant protein database. Sequence matches were selected for multiple-sequence alignment (MSA) on the basis of their presence in model organisms representative of the three major kingdoms of life (bacteria, archaea and eukaryotes). MSA was performed using CLUSTAL X (version 1.83); sequences that did not contain the N-terminal NKxD motif were removed. The phylogenetic tree was constructed using the PROTDIST (Dayhoff PAM distance matrix, 100 datasets, version 3.63), NEIGHBOR (neighbour-joining method of tree construction, 100 datasets) and CONSENSE (default settings) programs of the PHYLIP package. Confidence in tree branches was determined by bootstrapping using SEQBOOT (100 resamplings of the PHYLIP analysis, default settings).
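The motif filter and the bootstrap resampling described above can be illustrated with a minimal Python sketch. This is not the CLUSTAL X or PHYLIP code. The function names, the `window` cut-off used to define "N-terminal", and the dictionary representation of an alignment are all assumptions made for illustration.

```python
import random
import re

# NKxD in motif notation: Asn, Lys, any residue, Asp.
NKXD = re.compile(r"NK.D")


def has_n_terminal_nkxd(sequence, window=100):
    """Return True if an NKxD match starts within the first `window` residues.

    `window` is a hypothetical cut-off; the text does not define how far
    "N-terminal" extends. Gap characters are stripped so that aligned
    sequences can be passed in directly.
    """
    ungapped = sequence.replace("-", "").upper()
    match = NKXD.search(ungapped)  # leftmost match, if any
    return match is not None and match.start() < window


def filter_by_motif(sequences, window=100):
    """Keep only the named sequences that carry the N-terminal motif."""
    return {name: seq for name, seq in sequences.items()
            if has_n_terminal_nkxd(seq, window)}


def bootstrap_columns(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement.

    `alignment` maps sequence names to equal-length aligned strings.
    This mirrors the idea behind bootstrap resampling of an alignment,
    not the actual SEQBOOT implementation.
    """
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}
```

Calling `bootstrap_columns` 100 times with the same `random.Random` instance would give 100 resampled alignments, one per tree to be built before a consensus is taken.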


Structure

Several computational tools were used to characterise the structure of the YlqF protein. The structure was first compared with all known structures using Dali to identify proteins structurally similar to YlqF. For each similar structure, the Dali output includes the PDB identifier, Z-score, RMSD, number of equivalent residues and other measures; the hit list is sorted by Z-score. The Dali results were returned by e-mail. Protein architectural classification was then carried out using the SCOP and CATH databases. The SCOP database provides a description of the structural and evolutionary relationships of proteins of known structure. InterPro was used to determine the protein domains, and Pfam to determine the protein family to which YlqF belongs. Drawing on these results, the structure of YlqF was rendered using PyMOL.
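As an illustration of how a list of structural hits might be ranked by Z-score, and of what an RMSD value measures, here is a minimal Python sketch. It does not parse real Dali output. The `DaliHit` record, the `min_z` threshold and the function names are hypothetical, and the RMSD function assumes the two coordinate sets are already superposed and paired residue by residue.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DaliHit:
    """One structural neighbour (hypothetical record, not Dali's own format)."""
    pdb_id: str
    z_score: float
    rmsd: float
    n_equivalent: int


def rank_hits(hits, min_z=2.0):
    """Drop hits below `min_z` (an assumed cut-off) and sort by Z-score, highest first."""
    kept = [hit for hit in hits if hit.z_score >= min_z]
    return sorted(kept, key=lambda hit: hit.z_score, reverse=True)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) coordinates.

    Assumes the structures are already superposed and that the two lists
    hold equivalent residues in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))
```

A higher Z-score indicates a more significant structural match, whereas a lower RMSD over more equivalent residues indicates a closer superposition, so the two measures are best read together.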


Function

To determine the function of 1pujA (and related YlqF proteins), a variety of data-mining strategies and computational tools were used. First, a literature search was conducted using Google Scholar and PubMed to familiarise ourselves with YlqF proteins and to find any information already available in databases and the published literature. From this we predicted a possible function for the protein and then tested this hypothesis using the ProKnow and ProFunc web servers. ProKnow is a relatively simple database that predicts the likely molecular function and biological process of a queried sequence. ProFunc is a server developed to help identify the likely function of a protein from its three-dimensional structure; it uses both sequence- and structure-based methods to provide evidence for a protein's likely or possible function (REF= http://www.ebi.ac.uk/thornton-srv/databases/ProFunc/). UniProt was used to perform a multiple sequence alignment with the submitted sequence. It was also important to investigate the function of any suspected orthologs or homologs of the protein. Mouse and human sequences were analysed with the SymAtlas server, which reports the expression levels of a sequence in different tissues.