Results 5: Difference between revisions

Revision as of 21:59, 11 June 2007

FUNCTIONAL ANALYSIS

ProFunc

The summary of predicted function in our ProFunc results shows that the most commonly occurring name term is GTPase, followed by GTP binding.

Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.

UniProt

During the ProFunc analysis, a multiple sequence alignment of our protein sequence was performed against UniProt. The top 2 hits (95.6% and 84.6% identity) are both annotated as YlqF proteins. Of the remaining 8 hits, one is another YlqF protein, two are hypothetical proteins, and five are annotated as GTPases or GTP-binding proteins.

Aligned sequences:
           %-tage
Seq. id.  identity    Name
--------  --------    ----
Query      -     - Query sequence 
O31743    95.6   - O31743 YlqF protein. 
Q65JP4    84.6   - Q65JP4 YlqF (GTP-binding domain protein). 
Q845L2    70.1   - Q845L2 Hypothetical protein. 
Q9Z9S1    59.9   - Q9Z9S1 YlqF (BH2476 protein). 
Q4MFY1    65.9   - Q4MFY1 Hypothetical protein. 
Q5WFP0    53.5   - Q5WFP0 GTPase. 
Q5HPU7    50.9   - Q5HPU7 GTP-binding protein, putative. 
Q3XZB6    52.4   - Q3XZB6 GTP-binding protein. 
A0Q0X8    49.3   - A0Q0X8 GTP-binding protein, putative. 
Q03KX8    47.6   - Q03KX8 Predicted GTPase.
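
The hit list above can be tallied programmatically. The following is a minimal sketch, not part of the ProFunc output: the `HITS` list is transcribed by hand from the table, and the `classify` helper and its keyword rules are assumptions made for illustration.

```python
from collections import Counter

# (UniProt accession, % identity, annotation), transcribed from the table above.
HITS = [
    ("O31743", 95.6, "YlqF protein."),
    ("Q65JP4", 84.6, "YlqF (GTP-binding domain protein)."),
    ("Q845L2", 70.1, "Hypothetical protein."),
    ("Q9Z9S1", 59.9, "YlqF (BH2476 protein)."),
    ("Q4MFY1", 65.9, "Hypothetical protein."),
    ("Q5WFP0", 53.5, "GTPase."),
    ("Q5HPU7", 50.9, "GTP-binding protein, putative."),
    ("Q3XZB6", 52.4, "GTP-binding protein."),
    ("A0Q0X8", 49.3, "GTP-binding protein, putative."),
    ("Q03KX8", 47.6, "Predicted GTPase."),
]


def classify(annotation):
    """Hypothetical keyword rules; YlqF is checked first on purpose."""
    lowered = annotation.lower()
    if "ylqf" in lowered:
        return "YlqF"
    if "gtp" in lowered:
        return "GTPase/GTP-binding"
    if "hypothetical" in lowered:
        return "hypothetical"
    return "other"


# Count annotations per category and rank hits by sequence identity.
counts = Counter(classify(annotation) for _, _, annotation in HITS)
ranked = sorted(HITS, key=lambda hit: hit[1], reverse=True)

for category, n in counts.most_common():
    print(f"{category}: {n}")
print("Best hit:", ranked[0][0], ranked[0][1])
```

Checking for "ylqf" before "gtp" keeps Q65JP4, whose annotation mentions both, in the YlqF group.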

SymAtlas

SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.

From the distribution of the human ortholog protein in the human body (http://symatlas.gnf.org/SymAtlas/symquery?q=92170), it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.

HumanSymatlas.png Figure 1:

Though the distribution of the mouse ortholog protein (http://symatlas.gnf.org/SymAtlas/symquery?q=212508) is somewhat more scattered than that of the human ortholog protein, it is still relatively common in all major tissues.


MouseSymatlas.png Figure 2:

ProKnow

The ProKnow analysis predicts that the molecular function of 1pujA is GTP binding (Bayesian score 0.9815) and that its biological process is small GTPase mediated signal transduction (Bayesian score 1.0000).

GO Code   Bayesian Score   Evidence Rank   Number of Clues   Description
-------   --------------   -------------   ---------------   -----------
Molecular Function
0005525   0.9815           2.8             6                 GTP binding
0000166   0.0101           2.9             6                 nucleotide binding
0008168   0.0068           2.9             6                 methyltransferase activity
0003924   0.0016           2.5             6                 GTPase activity
Biological Process
0007264   1.0000           2.8             6                 small GTPase mediated signal transduction


STRUCTURE

Quality of YlqF protein model and overall structure

The asymmetric unit of the Bacillus subtilis YlqF structure contains a polypeptide chain of 282 amino acids (Figures 1 & 2). It also contains two ligands: a magnesium ion and phosphoaminophosphonic acid-guanylate ester (GMPPNP, a non-hydrolysable GTP analogue) (Figure 3). The structure has been refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 1 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% of all residues lie in the allowed regions.


Table 1. Crystal parameters and refinement statistics

Parameter              Value
---------              -----
Resolution [Å]         2.00
R factor, %            21.6
Free R factor, %       25.0
Space group            P 21 21 21

Unit cell length [Å]   Angle [°]
--------------------   ---------
a   36.75              alpha   90.00
b   68.57              beta    90.00
c   105.57             gamma   90.00
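
As a rough plausibility check on the crystal parameters, the unit-cell volume and Matthews coefficient can be estimated from the values in Table 1. This is a sketch under stated assumptions, none of which come from this page: four asymmetric units per cell (Z = 4, the number of general positions in P 21 21 21), one protein chain per asymmetric unit, and the common approximation of solvent fraction as 1 - 1.23/Vm. The molecular weight of 31986 Da is the value quoted elsewhere on this page.

```python
# Sketch only: Z = 4 and one chain per asymmetric unit are assumptions,
# and 1.23 / Vm is the usual approximation for the protein volume fraction.

def orthorhombic_volume(a, b, c):
    """Cell volume in cubic angstroms when all three angles are 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume, z, mass_da):
    """Vm in cubic angstroms per dalton for z copies of a mass_da protein."""
    return cell_volume / (z * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction of the crystal from Vm."""
    return 1.0 - 1.23 / vm


cell_volume = orthorhombic_volume(36.75, 68.57, 105.57)
vm = matthews_coefficient(cell_volume, z=4, mass_da=31986.0)
solvent = solvent_fraction(vm)

print(f"Cell volume: {cell_volume:.0f} A^3")
print(f"Matthews coefficient: {vm:.2f} A^3/Da")
print(f"Estimated solvent fraction: {solvent:.0%}")
```

With these inputs the cell volume is roughly 2.66 x 10^5 Å^3, the Matthews coefficient is close to 2.1 Å^3/Da, and the estimated solvent content is about 41%.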

  1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA
 61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM

Figure 1. Amino acid sequence of YlqF

Cartoon.jpg

Figure 2. Structure of YlqF showing helices, sheets, and loops. The image was generated using the program PyMOL (DeLano, 2002).

Ligand.jpg

Figure 3. Structure of YlqF showing ligands. The image was obtained from PDB.


The secondary structure of YlqF is about 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 4). The YlqF protein consists of two domains (Figure 5). The first domain, residues 1-177, has a Rossmann fold of the α/β class and forms a 3-layer sandwich structure. The second domain, residues 178-282, is described as a conserved hypothetical protein of the mainly α class and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.


Secondary structure.png

Pi helix.gif = pi helix, 310helix.jpg = 310 helix, Sheet.jpg = extended strand, Turn.jpg = turn, Alpha.jpg = alpha helix,

Greyed out residues have no structural information

Figure 4. Sequence and Secondary Structure

Domain1.jpg

Figure 5. Structure of YlqF showing the two domains. The image was generated using the program PyMOL (DeLano, 2002).


Structure Analysis

The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. The Pfam classification matched YlqF to a GTPase of unknown function, the Rhomboid family, and the catalytic domain of alpha amylase.

Table 2. SCOP classification of YlqF

Level         Classification
-----         --------------
Class         Alpha and beta proteins
Fold          P-loop containing nucleoside triphosphate hydrolases
Superfamily   P-loop containing nucleoside triphosphate hydrolases
Family        G proteins
Domain        Probable GTPase YlqF
Species       Bacillus subtilis

Most of the top hits in the Dali search are Ras-family proteins, which are small GTPases. This suggests that the YlqF protein may itself be a GTPase.


 No Chain  raw-score Z-score %id lali rmsd  Description
 1 1pujA     3946.0    42.7 100  261  0.0  CONSERVED HYPOTHETICAL PROTEIN YLQF                                             
 2 1udxA      618.8     6.2  14  141 25.5  THE GTP-BINDING PROTEIN OBG                                                     
 3 1uadA      541.8     6.1   9   99  2.6  RAS-RELATED PROTEIN RAL-A                                                       
 4 1zbdA      553.7     6.1  12   99  2.8  RAB-3A                                                                          
 5 1g17A      526.0     5.9  18   95  2.5  RAS-RELATED PROTEIN SEC4                                                        
 6 1c1yA      515.7     5.8  12   96  2.8  RAS-RELATED PROTEIN RAP-1A                                                      
 7 1kjzA      571.3     5.7  14  114  4.0  EIF2GAMMA                                                                       
 8 1oixA      510.9     5.4  11   94  2.6  RAS-RELATED PROTEIN RAB-11A                                                     
 9 1ukvY      519.2     5.3  10   99  2.8  SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR                                    
10 1ijfA      535.8     5.3  13  103  3.6  ELONGATION FACTOR 1-ALPHA                                                       
11 1gpmA      481.7     5.2   7  117  4.0  GMP SYNTHETASE                                                                  
12 1hrkA      483.7     5.1   5  116  4.2  FERROCHELATASE                                                                  
13 1d5cA      474.4     5.1  10   92  2.6  RAB6 GTPASE                                                                     
14 1ni5A      512.4     5.0   6  115  5.5  PUTATIVE CELL CYCLE PROTEIN MESJ                                                
15 3rapS      484.1     5.0   9   92  2.6  G PROTEIN RAP2A                                                                 
16 1gwnA      475.5     5.0  10   94  2.7  RHO-RELATED GTP-BINDING PROTEIN RHOE                                            
17 1vg8A      479.0     5.0  11  106  3.9  RAS-RELATED PROTEIN RAB-7                                                       
18 1cqxA      438.8     4.9   4  116  4.2  FLAVOHEMOPROTEIN                                                                
19 1mkyA      460.7     4.7  11  100  8.8  PROBABLE GTP-BINDING PROTEIN ENGA                                               
20 1fzqA      463.8     4.7  14   92  2.6  ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3

Table 3. Dali search results: PDB/chain identifiers and structural alignment statistics
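
The observation that small GTPases dominate the Dali hit list can be checked with a simple keyword count. The sketch below is illustrative only: the descriptions are transcribed from the table (the self-hit 1pujA is left out), and the `SMALL_GTPASE_PREFIXES` list, the decision to count ADP-ribosylation factor-like proteins as Ras-superfamily members, and the `is_small_gtpase` helper are all assumptions rather than part of the Dali output.

```python
import re

# (PDB chain, Z-score, description) for hits 2-20 in the table above.
DALI_HITS = [
    ("1udxA", 6.2, "THE GTP-BINDING PROTEIN OBG"),
    ("1uadA", 6.1, "RAS-RELATED PROTEIN RAL-A"),
    ("1zbdA", 6.1, "RAB-3A"),
    ("1g17A", 5.9, "RAS-RELATED PROTEIN SEC4"),
    ("1c1yA", 5.8, "RAS-RELATED PROTEIN RAP-1A"),
    ("1kjzA", 5.7, "EIF2GAMMA"),
    ("1oixA", 5.4, "RAS-RELATED PROTEIN RAB-11A"),
    ("1ukvY", 5.3, "SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR"),
    ("1ijfA", 5.3, "ELONGATION FACTOR 1-ALPHA"),
    ("1gpmA", 5.2, "GMP SYNTHETASE"),
    ("1hrkA", 5.1, "FERROCHELATASE"),
    ("1d5cA", 5.1, "RAB6 GTPASE"),
    ("1ni5A", 5.0, "PUTATIVE CELL CYCLE PROTEIN MESJ"),
    ("3rapS", 5.0, "G PROTEIN RAP2A"),
    ("1gwnA", 5.0, "RHO-RELATED GTP-BINDING PROTEIN RHOE"),
    ("1vg8A", 5.0, "RAS-RELATED PROTEIN RAB-7"),
    ("1cqxA", 4.9, "FLAVOHEMOPROTEIN"),
    ("1mkyA", 4.7, "PROBABLE GTP-BINDING PROTEIN ENGA"),
    ("1fzqA", 4.7, "ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3"),
]

# Hypothetical keyword rules for spotting Ras-superfamily small GTPases.
SMALL_GTPASE_PREFIXES = ("RAS", "RAB", "RAP", "RAL", "RHO")


def is_small_gtpase(description):
    """True if any word starts with a Ras-superfamily prefix, or for ARF-like."""
    if "ADP-RIBOSYLATION FACTOR" in description:
        return True
    words = re.split(r"[^A-Z0-9]+", description)
    return any(word.startswith(SMALL_GTPASE_PREFIXES) for word in words)


small_gtpases = [hit for hit in DALI_HITS if is_small_gtpase(hit[2])]
fraction = len(small_gtpases) / len(DALI_HITS)
print(f"{len(small_gtpases)} of {len(DALI_HITS)} hits ({fraction:.0%}) look like small GTPases")
```

Matching on whole-word prefixes rather than raw substrings avoids accidental hits inside unrelated names. Other GTP-binding proteins in the list, such as Obg, EngA and the translation factors, are deliberately not counted here.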

InterProScan Results

InterPro: IPR005289 MG442. This is a GTP-binding domain. It is found in many different families, including the Ras GTPase superfamily and the HSR1-related GTP-binding proteins, and the members of these families have different functions.

InterPro: IPR002917 MMR_HSR1. HSR1 maps to the human MHC class I region and is highly homologous to MMR1, a putative GTP-binding protein from mouse. Together these proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.

PHYLOGENY

The MSA (Figures ? & ?) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (Bacillus subtilis) is N-terminal to the Walker A motif and is conserved across all three domains of life. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them.

NKxD5.jpg

Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).

WalkerAB3.jpg

Figure ?: From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).
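
The motif positions discussed above can be located in the YlqF sequence with a simple regular-expression scan. This is a minimal sketch, not the method used to produce the alignment figures: the sequence is the 282-residue sequence given earlier on this page, the consensus patterns are those quoted in the caption (with x as any residue), and the `find_motifs` helper is a hypothetical name introduced here.

```python
import re

# YlqF sequence as listed in the amino acid sequence figure on this page.
YLQF_SEQUENCE = (
    "MTIQWFPGHMAKARREVTEKLKLIDIVYELVDARIPMSSRNPMIEDILKNKPRIMLLNKA"
    "DKADAAVTQQWKEHFENQGIRSLSINSVNGQGLNQIVPASKEILQEKFDRMRAKGVKPRA"
    "IRALIIGIPNVGKSTLINRLAKKNIAKTGDRPGITTSQQWVKVGKELELLDTPGILWPKF"
    "EDELVGLRLAVTGAIKDSIINLQDVAVFGLRFLEEHYPERLKERYGLDEIPEDIAELFDA"
    "IGEKRGCLMSGGLINYDKTTEVIIRDIRTEKFGRLSFEQPTM"
)

# Consensus motifs written as regular expressions ("x" becomes ".").
MOTIFS = {
    "NKxD": r"NK.D",
    "Walker A": r"G.{4}GK[ST]",
    "Walker B": r"D.PG",
}


def find_motifs(sequence, motifs):
    """Return {motif name: [(1-based start, matched residues), ...]}."""
    hits = {}
    for name, pattern in motifs.items():
        # The lookahead wrapper lets overlapping matches be reported too.
        scanner = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in scanner.finditer(sequence)]
    return hits


motif_hits = find_motifs(YLQF_SEQUENCE, MOTIFS)
for name, found in motif_hits.items():
    for start, residues in found:
        print(f"{name}: {residues} at residue {start}")
```

On this sequence the scan places the NKxD match upstream of the Walker A match, in line with the ordering described above. The short DxPG pattern matches at more than one position, so the alignment, not this scan, is what shows which site is the conserved Walker B motif.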



The phylogenetic tree (Figure ?) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archeo-eukaryotic clades (Leipe et al., 2002), with no evidence of horizontal gene transfer.

TREEcircle.jpg

Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archeo-eukaryotic group (red). Red stars denote less than 95% confidence in a branch, as determined from 100 bootstraps.