Results 5
STRUCTURE
Quality of YlqF protein model and overall structure
The asymmetric unit of Bacillus subtilis YlqF protein consists of a polymer containing 282 amino acids (Figure 1). The structure was refined at 2.0 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 1 summarizes the refinement statistics, including protein quality parameters. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% lie in the allowed regions.
Table 1. Crystal parameters and refinement statistics
Parameter | Value
Resolution [Å] | 2.00
R factor [%] | 21.6
Free R factor [%] | 25.0
Space group | P 21 21 21
Unit cell axis | Length [Å] | Unit cell angle | Angle [°]
a | 36.75 | alpha | 90.00
b | 68.57 | beta | 90.00
c | 105.57 | gamma | 90.00
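For readers unfamiliar with the R factor and free R factor quoted in Table 1, the following minimal Python sketch shows the conventional definition, the sum of | |F_obs| - |F_calc| | divided by the sum of |F_obs|. The amplitudes are made-up toy numbers, not YlqF data, and the function name `r_factor` is hypothetical. The free R factor uses the same formula, evaluated only on reflections held out from refinement.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R factor for paired structure-factor amplitudes."""
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have the same length")
    # Sum of absolute differences between observed and calculated amplitudes.
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    # Normalise by the sum of the observed amplitudes.
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Toy amplitudes (hypothetical values, for illustration only).
f_obs = [100.0, 200.0, 300.0]
f_calc = [90.0, 210.0, 330.0]

r = r_factor(f_obs, f_calc)
print(f"R = {100 * r:.1f}%")
```

A lower value means the model reproduces the measured amplitudes more closely. A free R factor a few points above the working R factor, as reported here, is the usual pattern.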
  1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA
 61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM
Figure 1. Amino acid sequence of YlqF
The secondary structure of YlqF is about 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 2). YlqF consists of two domains. The first domain, comprising residues 1-177, has a Rossmann fold of the α/β class and forms a 3-layer sandwich structure. The second domain, comprising residues 178-282, is annotated as a conserved hypothetical protein of the mainly α class and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.
Figure 2. Sequence and secondary structure of YlqF. The legend distinguishes pi helix, 3₁₀ helix, extended strand, turn and alpha helix; greyed-out residues have no structural information.
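The secondary structure percentages and domain boundaries quoted above can be checked with simple arithmetic. The sketch below uses only the residue counts given in the text; the variable and function names are illustrative. The strand fraction comes out nearer 11% than the quoted 10%, so the quoted figure may have been rounded down or computed over a different residue set (an assumption, not something the source states).

```python
TOTAL_RESIDUES = 282   # chain length reported for YlqF
HELIX_RESIDUES = 142   # residues in the 13 helices
STRAND_RESIDUES = 31   # residues in the 6 strands


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


helix_pct = percent(HELIX_RESIDUES, TOTAL_RESIDUES)
strand_pct = percent(STRAND_RESIDUES, TOTAL_RESIDUES)

# Domain lengths from the inclusive residue ranges 1-177 and 178-282.
domain1_len = 177 - 1 + 1
domain2_len = 282 - 178 + 1

print(f"helix: {helix_pct:.1f}%, strand: {strand_pct:.1f}%")
print(f"domain 1: {domain1_len} residues, domain 2: {domain2_len} residues")
```

The two domain lengths add up to the full 282 residues, so the stated ranges cover the chain without a gap or overlap.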
Structure Analysis
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. The Pfam search returned three classifications for YlqF: GTPase of unknown function, Rhomboid family, and catalytic domain of alpha amylase.
Table 2. SCOP classification of YlqF
Level | Classification
Class | Alpha and beta proteins
Fold | P-loop containing nucleoside triphosphate hydrolases
Superfamily | P-loop containing nucleoside triphosphate hydrolases
Family | G proteins
Domain | Probable GTPase YlqF
Species | Bacillus subtilis
No | Chain | Raw score | Z-score | %id | lali | rmsd | Description
1 | 1pujA | 3946.0 | 42.7 | 100 | 261 | 0.0 | CONSERVED HYPOTHETICAL PROTEIN YLQF
2 | 1udxA | 618.8 | 6.2 | 14 | 141 | 25.5 | THE GTP-BINDING PROTEIN OBG
3 | 1uadA | 541.8 | 6.1 | 9 | 99 | 2.6 | RAS-RELATED PROTEIN RAL-A
4 | 1zbdA | 553.7 | 6.1 | 12 | 99 | 2.8 | RAB-3A
5 | 1g17A | 526.0 | 5.9 | 18 | 95 | 2.5 | RAS-RELATED PROTEIN SEC4
6 | 1c1yA | 515.7 | 5.8 | 12 | 96 | 2.8 | RAS-RELATED PROTEIN RAP-1A
7 | 1kjzA | 571.3 | 5.7 | 14 | 114 | 4.0 | EIF2GAMMA
8 | 1oixA | 510.9 | 5.4 | 11 | 94 | 2.6 | RAS-RELATED PROTEIN RAB-11A
9 | 1ukvY | 519.2 | 5.3 | 10 | 99 | 2.8 | SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR
10 | 1ijfA | 535.8 | 5.3 | 13 | 103 | 3.6 | ELONGATION FACTOR 1-ALPHA
11 | 1gpmA | 481.7 | 5.2 | 7 | 117 | 4.0 | GMP SYNTHETASE
12 | 1hrkA | 483.7 | 5.1 | 5 | 116 | 4.2 | FERROCHELATASE
13 | 1d5cA | 474.4 | 5.1 | 10 | 92 | 2.6 | RAB6 GTPASE
14 | 1ni5A | 512.4 | 5.0 | 6 | 115 | 5.5 | PUTATIVE CELL CYCLE PROTEIN MESJ
15 | 3rapS | 484.1 | 5.0 | 9 | 92 | 2.6 | G PROTEIN RAP2A
16 | 1gwnA | 475.5 | 5.0 | 10 | 94 | 2.7 | RHO-RELATED GTP-BINDING PROTEIN RHOE
17 | 1vg8A | 479.0 | 5.0 | 11 | 106 | 3.9 | RAS-RELATED PROTEIN RAB-7
18 | 1cqxA | 438.8 | 4.9 | 4 | 116 | 4.2 | FLAVOHEMOPROTEIN
19 | 1mkyA | 460.7 | 4.7 | 11 | 100 | 8.8 | PROBABLE GTP-BINDING PROTEIN ENGA
20 | 1fzqA | 463.8 | 4.7 | 14 | 92 | 2.6 | ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3
Table . Dali search results of PDB/chain identifiers and structural alignment statistics
PHYLOGENY
The MSA (Figures ? & ?) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (Bacillus subtilis) lies N-terminal to the Walker A motif and is conserved across all three groups. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them.
Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).
Figure ?: From left to right (N-terminal to C-terminal): the Walker A motif (consensus GxxxxGK[ST]), the conserved threonine between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).
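The motif positions discussed above can be located in the Figure 1 sequence with a simple pattern scan. The sketch below is illustrative only: it translates the consensus strings from the captions into regular expressions (with `x` meaning any residue), and the helper name `find_motifs` is hypothetical. It does not reproduce the Clustal X alignment.

```python
import re

# YlqF sequence from Figure 1 (282 residues).
YLQF = (
    "MTIQWFPGHMAKARREVTEKLKLIDIVYELVDARIPMSSRNPMIEDILKNKPRIMLLNKA"
    "DKADAAVTQQWKEHFENQGIRSLSINSVNGQGLNQIVPASKEILQEKFDRMRAKGVKPRA"
    "IRALIIGIPNVGKSTLINRLAKKNIAKTGDRPGITTSQQWVKVGKELELLDTPGILWPKF"
    "EDELVGLRLAVTGAIKDSIINLQDVAVFGLRFLEEHYPERLKERYGLDEIPEDIAELFDA"
    "IGEKRGCLMSGGLINYDKTTEVIIRDIRTEKFGRLSFEQPTM"
)

# Consensus motifs from the text, written as regular expressions.
MOTIFS = {
    "NKxD": r"NK.D",
    "Walker A": r"G.{4}GK[ST]",
    "Walker B": r"D.PG",
}


def find_motifs(sequence, motifs):
    """Return {motif name: [(1-based start, matched residues), ...]}."""
    hits = {}
    for name, pattern in motifs.items():
        hits[name] = [
            (m.start() + 1, m.group()) for m in re.finditer(pattern, sequence)
        ]
    return hits


hits = find_motifs(YLQF, MOTIFS)
for name, found in hits.items():
    print(name, found)
```

On this sequence the scan finds NKAD at residue 58 and the Walker A match GIPNVGKS at residue 127, consistent with the NKxD motif lying N-terminal to Walker A. The short Walker B consensus matches twice (DRPG at 150 and DTPG at 171), so pattern matching alone does not decide which occurrence is the functional motif; the alignment is needed for that.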
The phylogenetic tree (Figure ?) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archaeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archaeo-eukaryotic clades (Leipe et al. 2002), with no evidence of horizontal gene transfer.
Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archaeo-eukaryotic group (red). Red stars denote branches with less than 95% confidence, as determined from 100 bootstrap replicates.
FUNCTIONAL ANALYSIS
ProFunc
The ProFunc summary of predicted function shows that the most commonly occurring name term is GTPase, followed by GTP binding.
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.
UniProt
During the ProFunc analysis, UniProt was used to perform a multiple sequence alignment against our protein sequence. The top two hits were both highly conserved (95.6% and 84.6% identity) and were both YlqF proteins. The remaining eight of the 10 proteins listed were annotated as GTPases or GTP-binding proteins, as a further YlqF protein, or as hypothetical proteins.
Aligned sequences:
Seq. id. | % identity | Name
Query | - | Query sequence
O31743 | 95.6 | YlqF protein.
Q65JP4 | 84.6 | YlqF (GTP-binding domain protein).
Q845L2 | 70.1 | Hypothetical protein.
Q9Z9S1 | 59.9 | YlqF (BH2476 protein).
Q4MFY1 | 65.9 | Hypothetical protein.
Q5WFP0 | 53.5 | GTPase.
Q5HPU7 | 50.9 | GTP-binding protein, putative.
Q3XZB6 | 52.4 | GTP-binding protein.
A0Q0X8 | 49.3 | GTP-binding protein, putative.
Q03KX8 | 47.6 | Predicted GTPase.
SymAtlas
SymAtlas is a web resource for examining the expression levels of a gene in different tissues. Using this database we explored the expression of orthologs of 1pujA in the mouse and human genomes. The human ortholog is widely distributed and expressed in virtually all tissues surveyed. This is consistent with a protein that is essential for fundamental biological processes. The results do not provide a greater level of detail than this; however, a GTPase would fall into this category.
Although the distribution of the mouse ortholog is somewhat more scattered than that of the human ortholog, it is still expressed in all major tissues, which is also consistent with our claim that 1pujA is a GTPase.
ProKnow
The ProKnow analysis predicts that the molecular function of 1pujA is GTP binding and that its biological process is small GTPase mediated signal transduction.
GO code | Bayesian score | Evidence rank | Number of clues | Description
Molecular Function
0005525 | 0.9815 | 2.8 | 6 | GTP binding
0000166 | 0.0101 | 2.9 | 6 | nucleotide binding
0008168 | 0.0068 | 2.9 | 6 | methyltransferase activity
0003924 | 0.0016 | 2.5 | 6 | GTPase activity
Biological Process
0007264 | 1.0000 | 2.8 | 6 | small GTPase mediated signal transduction
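As a quick reading aid for the ProKnow scores, the sketch below picks the top-ranked molecular function term and sums the listed Bayesian scores. The values are copied from the ProKnow output above; the dictionary layout and variable names are illustrative. That the four scores sum to 1 is an observation about this output, not a statement about how ProKnow computes them.

```python
# Molecular function terms and Bayesian scores from the ProKnow output.
molecular_function = {
    "GTP binding": 0.9815,
    "nucleotide binding": 0.0101,
    "methyltransferase activity": 0.0068,
    "GTPase activity": 0.0016,
}

# Term with the highest Bayesian score.
top_term = max(molecular_function, key=molecular_function.get)

# Sum of the listed scores, to see how much weight the top term carries.
total = sum(molecular_function.values())

print(f"top term: {top_term} ({molecular_function[top_term]:.4f})")
print(f"sum of listed scores: {total:.4f}")
```

GTP binding carries almost all of the weight among the listed terms, which is why it is reported as the predicted molecular function even though GTPase activity itself scores low.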