Results 5

STRUCTURE

Quality of YlqF protein model and overall structure

The asymmetric unit of Bacillus subtilis YlqF consists of a single polymer chain of 282 amino acids (Figure 1). The structure was refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 1 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% lie in the allowed regions.


Table 1. Crystal parameters and refinement statistics

Parameter             Value
Resolution [Å]        2.00
R factor [%]          21.6
Free R factor [%]     25.0
Space group           P 21 21 21

Unit cell     Length [Å]    Angle [°]
a / alpha     36.75         90.00
b / beta      68.57         90.00
c / gamma     105.57        90.00
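
As a rough consistency check on Table 1, the unit cell volume and the volume per unit of protein mass (the Matthews coefficient) can be worked out from the cell lengths. The sketch below is not part of the original analysis. It assumes four asymmetric units per cell for space group P 21 21 21, one chain per asymmetric unit, and the molecular weight of 31986 Da quoted in this section. The variable names are illustrative only.

```python
# Unit cell lengths from Table 1, in angstroms. All three angles are
# 90 degrees, so the cell volume is simply the product a * b * c.
a, b, c = 36.75, 68.57, 105.57

# Assumptions (not stated in Table 1):
#   - P 21 21 21 has 4 asymmetric units per unit cell
#   - each asymmetric unit holds one YlqF chain of 31986 Da
ASYMMETRIC_UNITS_PER_CELL = 4
CHAINS_PER_ASYMMETRIC_UNIT = 1
MOLECULAR_WEIGHT_DA = 31986.0

cell_volume = a * b * c  # cubic angstroms
protein_mass_per_cell = (
    ASYMMETRIC_UNITS_PER_CELL * CHAINS_PER_ASYMMETRIC_UNIT * MOLECULAR_WEIGHT_DA
)
matthews_coefficient = cell_volume / protein_mass_per_cell  # Å^3 per Da

print(f"Cell volume: {cell_volume:.0f} Å^3")
print(f"Matthews coefficient: {matthews_coefficient:.2f} Å^3/Da")
```

With these inputs the ratio comes out at roughly 2.1 Å^3/Da. If the assumed number of chains per asymmetric unit were different, the value would scale accordingly.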

  1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA
 61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM

Figure 1. Amino acid sequence of YlqF
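
The numbered layout of Figure 1 is convenient to read but awkward to feed into other tools. The following minimal sketch strips the position labels and spacing to recover a plain one-letter sequence and checks its length against the 282 residues stated above. The function name `parse_numbered_sequence` is hypothetical.

```python
import re

# Sequence block copied from Figure 1 (position label, dash, 10-residue groups).
FIGURE_1 = """
  1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA
 61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM
"""


def parse_numbered_sequence(block: str) -> str:
    """Return the residues of a numbered sequence block as one string."""
    residues = []
    for line in block.strip().splitlines():
        # Everything after the first dash is sequence; the part before it
        # is the position label.
        _label, _dash, groups = line.partition("-")
        # Remove the spaces separating the 10-residue groups.
        residues.append(re.sub(r"\s+", "", groups))
    return "".join(residues)


sequence = parse_numbered_sequence(FIGURE_1)
print(len(sequence), sequence[:10], sequence[-10:])
```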


The secondary structure of YlqF is 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 2). The protein consists of two domains. The first domain comprises residues 1-177, belongs to the α/β class, contains a Rossmann fold and forms a 3-layer sandwich structure. The second domain comprises residues 178-282, is annotated as a conserved hypothetical protein of the mainly α class, and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.
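
The percentages and domain boundaries quoted above can be cross-checked against the residue counts. This is only an arithmetic sketch using the numbers given in the text; the dictionary keys are informal labels, not official domain names.

```python
TOTAL_RESIDUES = 282

# Residue counts per secondary-structure type, as quoted in the text.
secondary_structure = {"helix": 142, "beta strand": 31}

# Domain boundaries (inclusive, 1-based), as quoted in the text.
domains = {"Rossmann-fold domain": (1, 177), "mainly alpha domain": (178, 282)}

fractions = {
    name: 100.0 * count / TOTAL_RESIDUES
    for name, count in secondary_structure.items()
}
domain_lengths = {
    name: end - start + 1 for name, (start, end) in domains.items()
}

for name, percent in fractions.items():
    print(f"{name}: {percent:.1f}% of residues")
for name, length in domain_lengths.items():
    print(f"{name}: {length} residues")
```

The exact ratios can differ slightly from the whole-number percentages quoted in the text, depending on how the source rounded them. The two domain lengths should add up to the full chain.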


Secondary structure.png

Key: separate symbols mark pi helix, 310 helix, extended strand, turn and alpha helix.

Greyed-out residues have no structural information.

Figure 2. Sequence and Secondary Structure


Structure Analysis

The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. The Pfam results describe YlqF as a GTPase of unknown function, Rhomboid family, and catalytic domain of alpha amylase.

Table 2. SCOP classification of YlqF

Class          Alpha and beta proteins
Fold           P-loop containing nucleoside triphosphate hydrolases
Superfamily    P-loop containing nucleoside triphosphate hydrolases
Family         G proteins
Domain         Probable GTPase YlqF
Species        Bacillus subtilis

 No Chain  raw-score Z-score %id lali rmsd  Description
 1 1pujA     3946.0    42.7 100  261  0.0  CONSERVED HYPOTHETICAL PROTEIN YLQF                                             
 2 1udxA      618.8     6.2  14  141 25.5  THE GTP-BINDING PROTEIN OBG                                                     
 3 1uadA      541.8     6.1   9   99  2.6  RAS-RELATED PROTEIN RAL-A                                                       
 4 1zbdA      553.7     6.1  12   99  2.8  RAB-3A                                                                          
 5 1g17A      526.0     5.9  18   95  2.5  RAS-RELATED PROTEIN SEC4                                                        
 6 1c1yA      515.7     5.8  12   96  2.8  RAS-RELATED PROTEIN RAP-1A                                                      
 7 1kjzA      571.3     5.7  14  114  4.0  EIF2GAMMA                                                                       
 8 1oixA      510.9     5.4  11   94  2.6  RAS-RELATED PROTEIN RAB-11A                                                     
 9 1ukvY      519.2     5.3  10   99  2.8  SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR                                    
10 1ijfA      535.8     5.3  13  103  3.6  ELONGATION FACTOR 1-ALPHA                                                       
11 1gpmA      481.7     5.2   7  117  4.0  GMP SYNTHETASE                                                                  
12 1hrkA      483.7     5.1   5  116  4.2  FERROCHELATASE                                                                  
13 1d5cA      474.4     5.1  10   92  2.6  RAB6 GTPASE                                                                     
14 1ni5A      512.4     5.0   6  115  5.5  PUTATIVE CELL CYCLE PROTEIN MESJ                                                
15 3rapS      484.1     5.0   9   92  2.6  G PROTEIN RAP2A                                                                 
16 1gwnA      475.5     5.0  10   94  2.7  RHO-RELATED GTP-BINDING PROTEIN RHOE                                            
17 1vg8A      479.0     5.0  11  106  3.9  RAS-RELATED PROTEIN RAB-7                                                       
18 1cqxA      438.8     4.9   4  116  4.2  FLAVOHEMOPROTEIN                                                                
19 1mkyA      460.7     4.7  11  100  8.8  PROBABLE GTP-BINDING PROTEIN ENGA                                               
20 1fzqA      463.8     4.7  14   92  2.6  ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3

Table . Dali search results of PDB/chain identifiers and structural alignment statistics
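
A table in this layout can be parsed column by column to filter the structural neighbours. The sketch below works on the first six rows only and uses an RMSD cut-off of 3.0 Å, which is an arbitrary illustrative threshold rather than a value used in the original analysis. The function name `parse_dali_line` is hypothetical.

```python
# First six rows of the Dali table above (excerpt, not the full list).
DALI_EXCERPT = """\
 1 1pujA     3946.0    42.7 100  261  0.0  CONSERVED HYPOTHETICAL PROTEIN YLQF
 2 1udxA      618.8     6.2  14  141 25.5  THE GTP-BINDING PROTEIN OBG
 3 1uadA      541.8     6.1   9   99  2.6  RAS-RELATED PROTEIN RAL-A
 4 1zbdA      553.7     6.1  12   99  2.8  RAB-3A
 5 1g17A      526.0     5.9  18   95  2.5  RAS-RELATED PROTEIN SEC4
 6 1c1yA      515.7     5.8  12   96  2.8  RAS-RELATED PROTEIN RAP-1A
"""

MAX_RMSD = 3.0  # Å; illustrative cut-off, not taken from the source


def parse_dali_line(line: str) -> dict:
    """Split one table row into typed fields; the description may contain spaces."""
    no, chain, raw, z, pct_id, lali, rmsd, description = line.split(None, 7)
    return {
        "no": int(no),
        "chain": chain,
        "raw_score": float(raw),
        "z_score": float(z),
        "pct_id": int(pct_id),
        "lali": int(lali),
        "rmsd": float(rmsd),
        "description": description.strip(),
    }


hits = [parse_dali_line(line) for line in DALI_EXCERPT.splitlines() if line.strip()]

# Drop the query matching itself (100% identity), then keep close superpositions.
neighbours = [hit for hit in hits if hit["pct_id"] < 100]
close_neighbours = [hit["chain"] for hit in neighbours if hit["rmsd"] <= MAX_RMSD]

print(close_neighbours)
```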

PHYLOGENY

The multiple sequence alignment (MSA; Figures ? & ?) shows several regions of high conservation among bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (Bacillus subtilis) lies N-terminal to the Walker A motif and is conserved across the three kingdoms. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them.

NKxD5.jpg

Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).

WalkerAB3.jpg

Figure ?: From left to right (N-terminal to C-terminal): the Walker A motif (consensus GxxxxGK[ST]), the conserved threonine between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).
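
The consensus patterns named in the figure captions can be located in the YlqF sequence of Figure 1 with regular expressions. The following is a minimal sketch, not the method used to build the alignment. The translation of each consensus into a regex (x as any single residue) and the helper name `find_motif` are assumptions made for illustration.

```python
import re

# YlqF sequence from Figure 1, with numbering and spacing removed.
SEQUENCE = (
    "MTIQWFPGHMAKARREVTEKLKLIDIVYELVDARIPMSSRNPMIEDILKNKPRIMLLNKA"
    "DKADAAVTQQWKEHFENQGIRSLSINSVNGQGLNQIVPASKEILQEKFDRMRAKGVKPRA"
    "IRALIIGIPNVGKSTLINRLAKKNIAKTGDRPGITTSQQWVKVGKELELLDTPGILWPKF"
    "EDELVGLRLAVTGAIKDSIINLQDVAVFGLRFLEEHYPERLKERYGLDEIPEDIAELFDA"
    "IGEKRGCLMSGGLINYDKTTEVIIRDIRTEKFGRLSFEQPTM"
)

# Consensus motifs from the captions; "x" is written as "." (any residue).
MOTIFS = {
    "NKxD": r"NK.D",
    "Walker A": r"G.{4}GK[ST]",
    "Walker B": r"D.PG",
}


def find_motif(pattern: str, sequence: str) -> list:
    """Return (1-based start, matched residues) for every hit, overlaps included."""
    # A lookahead with a capturing group reports overlapping matches too.
    return [
        (match.start() + 1, match.group(1))
        for match in re.finditer(f"(?=({pattern}))", sequence)
    ]


motif_hits = {name: find_motif(pattern, SEQUENCE) for name, pattern in MOTIFS.items()}

for name, found in motif_hits.items():
    print(name, found)
```

A short pattern such as DxPG can match at more than one position in a sequence of this length, so pattern hits are best read alongside the alignment rather than treated as motif assignments on their own.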



The phylogenetic tree (Figure ?) shows three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archeo-eukaryotic clades (Leipe et al. 2002), with no evidence of horizontal gene transfer.

TREEcircle.jpg

Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archeo-eukaryotic group (red). Red stars denote branches with less than 95% confidence, as determined from 100 bootstraps.

FUNCTIONAL ANALYSIS

ProFunc

The summary of predicted functions in our ProFunc results shows that the most commonly occurring name term is GTPase, followed by GTP binding.

Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.

UniProt

During the ProFunc analysis, UniProt was used to perform a multiple sequence alignment on our protein sequence. The top two results were both highly conserved (95.6% and 84.6% identity) and both are annotated as YlqF proteins. Of the remaining eight proteins listed, one is also annotated as YlqF, two are hypothetical proteins, and five are GTPases or GTP-binding proteins.

Aligned sequences:
           %-tage
Seq. id.  identity    Name
--------  --------    ----
Query      -     - Query sequence 
O31743    95.6   - O31743 YlqF protein. 
Q65JP4    84.6   - Q65JP4 YlqF (GTP-binding domain protein). 
Q845L2    70.1   - Q845L2 Hypothetical protein. 
Q9Z9S1    59.9   - Q9Z9S1 YlqF (BH2476 protein). 
Q4MFY1    65.9   - Q4MFY1 Hypothetical protein. 
Q5WFP0    53.5   - Q5WFP0 GTPase. 
Q5HPU7    50.9   - Q5HPU7 GTP-binding protein, putative. 
Q3XZB6    52.4   - Q3XZB6 GTP-binding protein. 
A0Q0X8    49.3   - A0Q0X8 GTP-binding protein, putative. 
Q03KX8    47.6   - Q03KX8 Predicted GTPase.
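
The annotations in the listing above can be tallied programmatically. The sketch below parses the ten rows and groups them by a simple keyword check on the description. The grouping rules and the function name `classify` are illustrative assumptions, not part of ProFunc or UniProt.

```python
from collections import Counter

# Rows copied from the "Aligned sequences" listing (query row omitted).
UNIPROT_HITS = """\
O31743    95.6   - O31743 YlqF protein.
Q65JP4    84.6   - Q65JP4 YlqF (GTP-binding domain protein).
Q845L2    70.1   - Q845L2 Hypothetical protein.
Q9Z9S1    59.9   - Q9Z9S1 YlqF (BH2476 protein).
Q4MFY1    65.9   - Q4MFY1 Hypothetical protein.
Q5WFP0    53.5   - Q5WFP0 GTPase.
Q5HPU7    50.9   - Q5HPU7 GTP-binding protein, putative.
Q3XZB6    52.4   - Q3XZB6 GTP-binding protein.
A0Q0X8    49.3   - A0Q0X8 GTP-binding protein, putative.
Q03KX8    47.6   - Q03KX8 Predicted GTPase.
"""


def classify(description: str) -> str:
    """Assign a coarse label from keywords; YlqF takes priority over GTP."""
    if "YlqF" in description:
        return "YlqF"
    if "hypothetical" in description.lower():
        return "hypothetical"
    if "GTP" in description:
        return "GTPase / GTP-binding"
    return "other"


records = []
for line in UNIPROT_HITS.splitlines():
    accession, identity, _dash, rest = line.split(None, 3)
    # "rest" repeats the accession before the free-text name; drop it.
    description = rest.split(None, 1)[1].strip()
    records.append((accession, float(identity), classify(description)))

annotation_counts = Counter(label for _acc, _identity, label in records)
print(annotation_counts)
```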

SymAtlas

SymAtlas is a website for researching the expression levels of a sequence in different tissues. Using this database we explored the expression of orthologs of 1pujA in the mouse and human genomes. The human ortholog is widely distributed and is expressed in virtually all human tissues. This result is consistent with a protein that is essential for fundamental biological processes. The results do not provide a greater level of detail than this; however, a GTPase would fall into this category.

HumanSymatlas.png


Although the distribution of the mouse ortholog is somewhat more scattered than that of the human ortholog, it is still relatively common in all major tissues, which is also consistent with our claim that 1pujA is a GTPase.

MouseSymatlas.png

ProKnow

The ProKnow analysis predicts that the molecular function of 1pujA is GTP binding and that its biological process is small GTPase mediated signal transduction.

GO Code    Bayesian Score    Evidence Rank    Number of Clues    Description
Molecular Function
0005525    0.9815            2.8              6                  GTP binding
0000166    0.0101            2.9              6                  nucleotide binding
0008168    0.0068            2.9              6                  methyltransferase activity
0003924    0.0016            2.5              6                  GTPase activity
Biological Process
0007264    1.0000            2.8              6                  small GTPase mediated signal transduction
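
To make the ranking in the ProKnow output explicit, the rows can be grouped by category and sorted by Bayesian score. This is a minimal sketch over the values in the table above; the tuple layout and the `GO:` prefix formatting are choices made for this example.

```python
# (category, GO code, Bayesian score, evidence rank, clues, description),
# copied from the ProKnow table above.
PROKNOW_ROWS = [
    ("Molecular Function", "0005525", 0.9815, 2.8, 6, "GTP binding"),
    ("Molecular Function", "0000166", 0.0101, 2.9, 6, "nucleotide binding"),
    ("Molecular Function", "0008168", 0.0068, 2.9, 6, "methyltransferase activity"),
    ("Molecular Function", "0003924", 0.0016, 2.5, 6, "GTPase activity"),
    ("Biological Process", "0007264", 1.0000, 2.8, 6,
     "small GTPase mediated signal transduction"),
]

# Keep the highest-scoring term in each category.
best_by_category = {}
for category, go_code, score, _rank, _clues, description in PROKNOW_ROWS:
    current = best_by_category.get(category)
    if current is None or score > current[1]:
        best_by_category[category] = (f"GO:{go_code}", score, description)

# Total score assigned within the molecular function category.
molecular_function_total = sum(
    row[2] for row in PROKNOW_ROWS if row[0] == "Molecular Function"
)

for category, (go_id, score, description) in best_by_category.items():
    print(f"{category}: {go_id} {description} (score {score:.4f})")
```

Note that GTP binding carries almost all of the molecular function score, while GTPase activity itself scores low in this table, so the ProKnow result supports nucleotide binding more directly than catalytic activity.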