http://compbio.biosci.uq.edu.au/mediawiki/api.php?action=feedcontributions&user=ScottAllen&feedformat=atomMDWiki - User contributions [en]2024-03-29T11:28:56ZUser contributionsMediaWiki 1.39.6http://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Presentation5&diff=5584Presentation52007-06-12T08:39:44Z<p>ScottAllen: </p>
<hr />
<div>[[Media:talk.ppt]]</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Presentation5&diff=5583Presentation52007-06-12T08:34:16Z<p>ScottAllen: </p>
<hr />
<div>[[Media:Example.ogg]]</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Presentation5&diff=5582Presentation52007-06-12T08:33:46Z<p>ScottAllen: </p>
<hr />
<div>[[talk.ppt]]</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=File:Talk.ppt&diff=5581File:Talk.ppt2007-06-12T08:32:59Z<p>ScottAllen: </p>
<hr />
<div></div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Presentation5&diff=5580Presentation52007-06-12T08:26:44Z<p>ScottAllen: </p>
<hr />
<div>[[Media:Example.ogg]]</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Presentation5&diff=5579Presentation52007-06-12T08:25:59Z<p>ScottAllen: </p>
<hr />
<div>[http://www.example.com link title]</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=GTP_binding_protein&diff=5578GTP binding protein2007-06-12T08:25:32Z<p>ScottAllen: </p>
<hr />
<div>[[Library GTP binding protein]]<br />
<br />
[[Nucleotide Sequence]]<br />
<br />
[[Protein Sequence]]<br />
<br />
[[Group 5 Report]]<br />
<br />
[[FUNCTIONAL ANALYSIS]]<br />
<br />
[[EVOLUTION]]<br />
<br />
[[Presentation5]]</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=GTP_binding_protein&diff=5577GTP binding protein2007-06-12T08:25:16Z<p>ScottAllen: </p>
<hr />
<div>[[Library GTP binding protein]]<br />
<br />
[[Nucleotide Sequence]]<br />
<br />
[[Protein Sequence]]<br />
<br />
[[Group 5 Report]]<br />
<br />
[[FUNCTIONAL ANALYSIS]]<br />
<br />
[[EVOLUTION]]<br />
<br />
[[Presentation]]</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Abstract_5&diff=5475Abstract 52007-06-12T01:42:51Z<p>ScottAllen: </p>
<hr />
<div>YlqF proteins are circularly permuted GTPases, as are the related YawG family of proteins (Leipe, D. et al. 2002). Recent studies have associated YlqF proteins with ribosomal assembly (Matsuo, Y. et al. 2006), and the coiled-coil domain of YawG displays close homology with RNase E, a known RNA-binding protein (Anand, B. et al. 2006). Computational biological methods were applied to make inferences about the structure, function and evolution of 1pujA (the ''B. subtilis'' YlqF protein). Structural and functional analyses confirmed the expected GTPase activity of YlqF. A phylogenetic tree displayed the close relationship between YlqF and YawG. Multiple-sequence alignment (MSA) revealed that YlqF lacks the coiled-coil domain of YawG. The current study could not directly demonstrate YlqF's involvement in ribosomal assembly; however, the compiled evidence implies strong conservation of primary and secondary structure between YlqF and YawG, which is supportive of the hypothesis that YlqF may be involved in ribosomal assembly.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Abstract_5&diff=5470Abstract 52007-06-12T01:42:07Z<p>ScottAllen: </p>
<hr />
<div>YlqF proteins are circularly permuted GTPases, as are the related YawG family of proteins. Recent studies have associated YlqF proteins with ribosomal assembly (Matsuo, Y. et al. 2006), and the coiled-coil domain of YawG displays close homology with RNase E, a known RNA-binding protein (Anand, B. et al. 2006). Computational biological methods were applied to make inferences about the structure, function and evolution of 1pujA (the ''B. subtilis'' YlqF protein). Structural and functional analyses confirmed the expected GTPase activity of YlqF. A phylogenetic tree displayed the close relationship between YlqF and YawG. Multiple-sequence alignment (MSA) revealed that YlqF lacks the coiled-coil domain of YawG. The current study could not directly demonstrate YlqF's involvement in ribosomal assembly; however, the compiled evidence implies strong conservation of primary and secondary structure between YlqF and YawG, which is supportive of the hypothesis that YlqF may be involved in ribosomal assembly.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Abstract_5&diff=5467Abstract 52007-06-12T01:41:08Z<p>ScottAllen: </p>
<hr />
<div>YlqF proteins are circularly permuted GTPases, as are the related YawG family of proteins. Recent studies have associated YlqF proteins with ribosomal assembly (Matsuo, Y. et al. 2006), and the coiled-coil domain of YawG displays close homology with RNase E, a known RNA-binding protein. Computational biological methods were applied to make inferences about the structure, function and evolution of 1pujA (the ''B. subtilis'' YlqF protein). Structural and functional analyses confirmed the expected GTPase activity of YlqF. A phylogenetic tree displayed the close relationship between YlqF and YawG. Multiple-sequence alignment (MSA) revealed that YlqF lacks the coiled-coil domain of YawG. The current study could not directly demonstrate YlqF's involvement in ribosomal assembly; however, the compiled evidence implies strong conservation of primary and secondary structure between YlqF and YawG, which is supportive of the hypothesis that YlqF may be involved in ribosomal assembly.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Materials_and_Methods_5&diff=5430Materials and Methods 52007-06-12T01:25:16Z<p>ScottAllen: </p>
<hr />
<div>=Phylogeny=<br />
<br />
The 1pujA FASTA sequence was obtained from NCBI Entrez Protein and used as a BLASTP query against the non-redundant protein database. Sequence matches were selected for multiple-sequence alignment (MSA) on the basis of presence in model organisms representative of the three major kingdoms of life (bacteria, archaea and eukaryotes). MSA was performed using CLUSTAL X (version 1.83); sequences were removed if they did not contain the N-terminal NKxD motif. The phylogenetic tree was constructed using the PROTDIST (Dayhoff PAM distance matrix, 100 datasets, version 3.63), NEIGHBOR (neighbour-joining method of tree construction, 100 datasets) and CONSENSE (default settings) programs of the PHYLIP package. Confidence in tree branches was determined by bootstrapping using SEQBOOT (100 resamplings for the PHYLIP analysis, default settings).<br />
<br />
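The bootstrapping step can be illustrated with a small sketch. This is not the SEQBOOT code itself; it is a minimal Python illustration, assuming that each pseudo-replicate is built by sampling alignment columns with replacement. The function names, the toy alignment and the seed are hypothetical and are not data from this study.<br />

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by sampling alignment columns with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw n_cols column indices; the same column may be drawn more than once.
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates=100, seed=1):
    """Return n_replicates resampled alignments (100 matches the count in the text)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Hypothetical three-sequence toy alignment, not data from this study.
toy_alignment = {
    "seqA": "NKVDLGAPNVGKST",
    "seqB": "NKADLGLPNVGKSS",
    "seqC": "TKVDIGAPNAGKST",
}
replicates = bootstrap_replicates(toy_alignment)
print(len(replicates), len(replicates[0]["seqA"]))
```

Each replicate keeps whole columns intact, so every resampled alignment has the same sequences and the same length as the original. A tree is then built from each replicate and the trees are summarised as a consensus.<br />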
<br />
=Structure=<br />
<br />
Several computational tools were used to characterise the structure of the YlqF protein. The structure was first compared to all known structures. Proteins with structures similar to YlqF were identified using Dali. The Dali output includes the PDB identifiers of similar structures, the Z-score, RMSD, the number of equivalent residues and so on; the list of hits is sorted by Z-score. The Dali results were returned by e-mail. After determining similar structures, protein architectural classification was conducted using the SCOP and CATH databases. The SCOP database provided a description of the structural and evolutionary relationships of proteins of known structure. InterPro was used to determine the protein domains. Pfam was also used to determine the protein family to which the YlqF protein belongs. With the knowledge gained from the above, an image of the structure of YlqF was made using PyMOL. <br />
<br />
<br />
=Function=<br />
<br />
<br />
To determine the function of 1pujA (and related YlqF proteins), a variety of data-mining strategies and computational tools were utilized. First, a literature search was conducted using Google Scholar and PubMed to familiarize ourselves with YlqF proteins and to find any information already available in databases and published literature. From this we predicted a possible function for our protein and then tested this hypothesis using the ProKnow and ProFunc web servers. ProKnow is a relatively simple database which determines the likely molecular function and biological process of a queried sequence. ProFunc is a server which was developed to help identify the likely function of a protein from its three-dimensional structure. It uses both sequence- and structure-based methods to try to provide evidence as to a protein's likely or possible function (REF= http://www.ebi.ac.uk/thornton-srv/databases/ProFunc/). The submitted sequence was also searched against UniProt, with the matches returned as a multiple sequence alignment. It was also important to investigate the function of any suspected orthologs or homologs of our protein. Mouse and human sequences were analysed with a server called SymAtlas, which is useful for determining the expression levels of a sequence in different tissues.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Introduction_5&diff=5425Introduction 52007-06-12T01:22:59Z<p>ScottAllen: </p>
<hr />
<div>==Introduction==<br />
<br />
Previous studies have located GTPases in a diverse array of bacteria and in all eukaryotes. GTPases are characterised by their use of GTP instead of ATP as a substrate. They are known to regulate many fundamental cellular processes such as translation, cell signalling, intracellular trafficking and cytoskeletal re-organisation (Anand, B. et al. 2006; Leipe, D. et al. 2002). The NKxD and Walker B motifs of GTPases specify the utilisation of GTP. 1pujA is a known YlqF GTPase of ''Bacillus subtilis'' (Matsuo, Y. et al. 2006). ''B. subtilis'' is a Gram-positive, catalase-positive bacterium commonly found in soil. ''B. subtilis'' has also been referred to as ''Bacillus globigii'', ''Hay bacillus'' or ''Grass bacillus'' (Wiki 2007). YlqF has previously been associated with the assembly of the 50S ribosomal subunit. ''B. subtilis'' cells in which YlqF activity was inhibited showed slow growth rates and a build-up of mis-folded 50S ribosomal subunits (Matsuo, Y. et al. 2006). A hypothesised circular permutation that places the NKxD motif N-terminal of the Walker A motif (in the primary structure) is characteristic of GTPases of the YlqF, YawG, YqeH and YjeQ families of proteins (Anand, B. et al. 2006; Leipe, D. et al. 2002). Our aim for this study is to determine the overall function of 1pujA and related YlqF proteins. To do this we will investigate the function using strategies such as literature searches, function prediction programs and programs that utilize additional high-throughput functional data. The structure and evolution of the protein will also be investigated, in the hope that this additional information will help us better understand the function of YlqF proteins. Structure will be investigated using web tools specific for structure comparison and structure analysis. The structure of YlqF will be compared with proteins which are similar in sequence and structure. 
Finally, evolution will be investigated using sequence searches, multiple sequence alignment and the building of a phylogenetic tree. Using data gathered from all these sources we hope to create a viable hypothesis for the function of YlqF proteins.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=References_5&diff=5420References 52007-06-12T01:21:00Z<p>ScottAllen: </p>
<hr />
<div>Anand, B., Verma, S., & Prakash, B. (2006). Structural stabilization of GTP-binding domains in circularly permuted GTPases: Implications for RNA binding. ''Nucleic Acids Research''. 34, 2196-2205.<br />
<br />
<br />
DeLano, W.L. (2002). The PyMOL Molecular Graphics System. DeLano Scientific, Palo Alto, CA, USA.<br />
<br />
<br />
PDB. (2007, 05) 1puj. Retrieved 6 1, 2007. Available from: http://www.rcsb.org/pdb/explore/explore.do?structureId=1PUJ<br />
<br />
<br />
Matsuo, Y., Morimoto, T., Kuwano, M., Chin Loh, P., Oshima, T., & Ogasawara, N. (2006). The GTP-binding Protein YlqF Participates in the Late Step of 50S Ribosomal Subunit Assembly in ''Bacillus subtilis''. ''Journal of Biological Chemistry''. 281(12), 8110-8117.<br />
<br />
Leipe, D., Wolf, Y., Koonin, E., & Aravind, L. (2002). Classification and Evolution of P-loop GTPases and Related ATPases. ''Journal of Molecular Biology''. 317, 41-72.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Discussion_5&diff=5406Discussion 52007-06-12T01:14:09Z<p>ScottAllen: </p>
<hr />
<div>ProFunc returned a number of promising results, including InterPro, PDB, SSM, DALI and 3D functional template searches. These results, while not individually significant, did support our hypothesis. The result of the UniProt search, however, was significant. A multiple sequence alignment search returned 10 significant matches, two of which had a percentage identity higher than 80%. Both of these sequences were listed as YlqF proteins. All of the proteins featured in these results were either GTPases or GTP-binding proteins. This sequence search therefore supports our hypothesis that 1pujA is a GTPase.<br />
<br />
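For readers unfamiliar with the percentage identity figure quoted above, the following is a minimal Python sketch of one common way to compute it from two aligned sequences. The definition used here (identical residues divided by the number of columns in which neither sequence has a gap) is an assumption; servers differ in the denominator they use, and the function name and example strings are hypothetical.<br />

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # assumption: gapped columns are excluded from the denominator
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 4 identical residues over 5 ungapped columns.
print(percent_identity("ACDE-F", "ACDQGF"))
```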
The SymAtlas results show the expression levels of orthologs of 1pujA in different tissues of both mice and humans. The results indicated that these proteins are expressed across all tissues of both animals. This is consistent with a protein essential for fundamental biological processes. Though the results do not provide a greater level of detail than this, a GTPase would fall into this category, further supporting our hypothesis.<br />
<br />
ProKnow assigns function by extracting and interpreting protein features from sequence and structure. It uses a metaserver strategy through a knowledgebase of annotation profiles coupled with Bayesian scoring (REF= http://www.doe-mbi.ucla.edu/Services/ProKnow/proknow.png). The results of our ProKnow search indicated that GTP binding was the most likely molecular function, followed to a lesser degree by nucleotide binding, methyltransferase activity and GTPase activity. The predicted biological process was small GTPase-mediated signal transduction, which refers to any series of molecular signals in which a small monomeric GTPase relays one or more of the signals. These ProKnow results also support our hypothesis that 1pujA is a GTPase.<br />
<br />
----<br />
<br />
Structural analysis shows that YlqF is likely to be a GTPase. A significant finding came from the Dali search, which showed that many of the proteins with structures similar to YlqF are GTPases. However, the protein fold classifications given by SCOP, CATH and Dali presented different views of protein fold space, because they use different methods to define and categorize protein folds. Pfam analysis also revealed two domains, MG442 and MMR_HSR1, both of which are associated with GTP-binding proteins. The secondary structure analysis revealed that YlqF is 50% helical (13 helices containing 142 amino acids) and 10% beta sheet (6 strands containing 31 amino acids). <br />
<br />
Other studies have shown that YlqF, the smallest of the cpGTPases in domain composition, possesses a CPG domain and a C-terminal alpha-helical domain (Anand et al., 2006). They also suggest that YlqF lacks an N-terminal domain (Anand et al., 2006). Proteins like YlqF are therefore regarded as the prototypes of cpGTPases (Anand et al., 2006). They are not likely to exist as single GTP-binding domains (Anand et al., 2006). Studies suggest that cpGTPases require the presence of the C-terminal domain in order to bind GTP (Anand et al., 2006). <br />
<br />
<br />
----<br />
<br />
<br />
P-loop NTPases are the most abundant proteins in cellular organisms, constituting 10-18% of all gene products. They are distinguished by the Walker A motif (consensus GxxxxGK[ST]), the Walker B motif (consensus hhhDxxG, where h = hydrophobic residue), and the [NT]KxD motif, which is unique to P-loop NTPases. The Walker B and [NT]KxD motifs indicate specificity towards GTP (Leipe, D. et al. 2002). MSA of known and suspected GTPases displayed high conservation of these characterising motifs (Figures 8 & 9), indicating that 1pujA is likely a GTPase (Anand, B. et al. 2006). Furthermore, all aligned sequences had the [NT]KxD motif circular permutation, indicating that they are all part of the YlqF/YawG family within the P-loop NTPase superfamily (Leipe, D. et al. 2002). It has previously been hypothesised that the Last Universal Common Ancestor (LUCA) of all extant life forms possessed several GTPases. If a particular GTPase family is widely represented in the three primary kingdoms (Archaea, Bacteria and Eukaryota), this is evidence for its presence in LUCA. This is supported if the phylogenetic tree conforms to the “standard model” topology, having bacterial and archaeo-eukaryotic primary clades. Conversely, a different topology, such as a bacteria-eukaryote grouping, could indicate presence in LUCA with the ancestral form in eukaryotes displaced by horizontal gene transfer (HGT) (Leipe, D. et al. 2002). 1pujA homologs/orthologs were found to be widely represented in the three primary kingdoms and conformed to the “standard model” topology (Figure 11), all of which is suggestive of presence in LUCA. YlqF proteins such as 1pujA may have been transferred from bacteria to eukaryotes once during the early stages of eukaryotic evolution (probably from the proto-mitochondrion), and a second time from chloroplasts to plants. It has been postulated that GTPase activity is a result of adaptation. The GTP concentration is more constant within a cell and not subject to the same fluctuations as ATP. 
Hence specificity for GTP as a substrate was recruited for crucial functions such as translation (Leipe, D. et al. 2002). The splitting of eukaryotes into two groups (Figure 11) is due to an additional N-terminal domain (Figure 10). This domain, a suspected coiled-coil region associated with RNA and ribosomal attachment, distinguishes proteins of the YawG family. It has been shown that YlqF and YawG share GTP-related domains (Figures 8 & 9) and C-terminal domains; however, YlqF lacks the N-terminal coiled-coil regions of YawG. Despite this, YlqF proteins are still expected to associate with RNA or ribosomes due to the shared C-terminal domains (Anand, B. et al. 2006). The phylogenetic tree also suggests that YlqF is ancestral to YawG. <br />
<br />
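The motif consensus patterns quoted above translate directly into regular expressions. The following is a minimal Python sketch, not the procedure used in this study (the motifs were inspected in the CLUSTAL X alignment). The set of residues treated as hydrophobic, the function names and the two toy sequences are assumptions made for illustration. Only the first match of each pattern is reported, so a short pattern such as [NT]KxD can match by chance in a real sequence and any hit should be checked against the alignment.<br />

```python
import re

# Assumption: residues counted as hydrophobic ("h") in the Walker B consensus.
HYDROPHOBIC = "AVILMFWY"

MOTIFS = {
    "walker_a": re.compile(r"G.{4}GK[ST]"),                      # GxxxxGK[ST]
    "walker_b": re.compile("[" + HYDROPHOBIC + "]{3}D.{2}G"),    # hhhDxxG
    "nkxd": re.compile(r"[NT]K.D"),                              # [NT]KxD
}


def find_motifs(sequence):
    """Return the 0-based start of the first match of each motif, or None."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        match = pattern.search(seq)
        hits[name] = match.start() if match else None
    return hits


def is_circularly_permuted(sequence):
    """True if the first [NT]KxD match lies N-terminal of the first Walker A match."""
    hits = find_motifs(sequence)
    if hits["nkxd"] is None or hits["walker_a"] is None:
        return False
    return hits["nkxd"] < hits["walker_a"]


# Hypothetical toy sequences, not real YlqF or YawG sequences.
permuted_toy = "MAAANKVDLLLGAPNVGKSTAAAVIVDTPGAAA"   # NKxD before Walker A
canonical_toy = "MGAPNVGKSTAAAVIVDTPGAAANKVDLL"     # NKxD after Walker A

print(find_motifs(permuted_toy), is_circularly_permuted(permuted_toy))
print(find_motifs(canonical_toy), is_circularly_permuted(canonical_toy))
```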
----<br />
<br />
<br />
Structural, functional and evolutionary analyses collectively indicate that 1pujA is a GTPase of the YlqF family. High conservation of YlqF over large spans of evolutionary time (Figure 11) indicates strong stabilising selection and suggests a role in one or several crucial biological processes (Leipe, D. et al. 2002). The functional results, in particular expression in all tissues of humans and mice, indicate that YlqF has a very important function. Although the current study cannot directly show association of 1pujA with ribosomes, it is concluded that YlqF proteins are GTPases likely to be involved in a fundamental biological process such as translation. This study supports the work of Matsuo et al. (2006), in which it was hypothesised that YlqF is needed for correct assembly of the 50S ribosomal subunit.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Discussion_5&diff=5405Discussion 52007-06-12T01:13:48Z<p>ScottAllen: </p>
<hr />
<div>'''A functional analysis of 1pujA indicated that our protein is a GTPase.'''<br />
<br />
ProFunc returned a number of promising results, including InterPro, PDB, SSM, DALI and 3D functional template searches. These results, while not individually significant, did support our hypothesis. The result of the UniProt search, however, was significant. A multiple sequence alignment search returned 10 significant matches, two of which had a percentage identity higher than 80%. Both of these sequences were listed as YlqF proteins. All of the proteins featured in these results were either GTPases or GTP-binding proteins. This sequence search therefore supports our hypothesis that 1pujA is a GTPase.<br />
<br />
The SymAtlas results show the expression levels of orthologs of 1pujA in different tissues of both mice and humans. The results indicated that these proteins are expressed across all tissues of both animals. This is consistent with a protein essential for fundamental biological processes. Though the results do not provide a greater level of detail than this, a GTPase would fall into this category, further supporting our hypothesis.<br />
<br />
ProKnow assigns function by extracting and interpreting protein features from sequence and structure. It uses a metaserver strategy through a knowledgebase of annotation profiles coupled with Bayesian scoring (REF= http://www.doe-mbi.ucla.edu/Services/ProKnow/proknow.png). The results of our ProKnow search indicated that GTP binding was the most likely molecular function, followed to a lesser degree by nucleotide binding, methyltransferase activity and GTPase activity. The predicted biological process was small GTPase-mediated signal transduction, which refers to any series of molecular signals in which a small monomeric GTPase relays one or more of the signals. These ProKnow results also support our hypothesis that 1pujA is a GTPase.<br />
<br />
----<br />
<br />
Structural analysis shows that YlqF is likely to be a GTPase. A significant finding came from the Dali search, which showed that many of the proteins with structures similar to YlqF are GTPases. However, the protein fold classifications given by SCOP, CATH and Dali presented different views of protein fold space, because they use different methods to define and categorize protein folds. Pfam analysis also revealed two domains, MG442 and MMR_HSR1, both of which are associated with GTP-binding proteins. The secondary structure analysis revealed that YlqF is 50% helical (13 helices containing 142 amino acids) and 10% beta sheet (6 strands containing 31 amino acids). <br />
<br />
Other studies have shown that YlqF, the smallest of the cpGTPases in domain composition, possesses a CPG domain and a C-terminal alpha-helical domain (Anand et al., 2006). They also suggest that YlqF lacks an N-terminal domain (Anand et al., 2006). Proteins like YlqF are therefore regarded as the prototypes of cpGTPases (Anand et al., 2006). They are not likely to exist as single GTP-binding domains (Anand et al., 2006). Studies suggest that cpGTPases require the presence of the C-terminal domain in order to bind GTP (Anand et al., 2006). <br />
<br />
<br />
----<br />
<br />
<br />
P-loop NTPases are the most abundant proteins in cellular organisms, constituting 10-18% of all gene products. They are distinguished by the Walker A motif (consensus GxxxxGK[ST]), the Walker B motif (consensus hhhDxxG, where h = hydrophobic residue), and the [NT]KxD motif, which is unique to P-loop NTPases. The Walker B and [NT]KxD motifs indicate specificity towards GTP (Leipe, D. et al. 2002). MSA of known and suspected GTPases displayed high conservation of these characterising motifs (Figures 8 & 9), indicating that 1pujA is likely a GTPase (Anand, B. et al. 2006). Furthermore, all aligned sequences had the [NT]KxD motif circular permutation, indicating that they are all part of the YlqF/YawG family within the P-loop NTPase superfamily (Leipe, D. et al. 2002). It has previously been hypothesised that the Last Universal Common Ancestor (LUCA) of all extant life forms possessed several GTPases. If a particular GTPase family is widely represented in the three primary kingdoms (Archaea, Bacteria and Eukaryota), this is evidence for its presence in LUCA. This is supported if the phylogenetic tree conforms to the “standard model” topology, having bacterial and archaeo-eukaryotic primary clades. Conversely, a different topology, such as a bacteria-eukaryote grouping, could indicate presence in LUCA with the ancestral form in eukaryotes displaced by horizontal gene transfer (HGT) (Leipe, D. et al. 2002). 1pujA homologs/orthologs were found to be widely represented in the three primary kingdoms and conformed to the “standard model” topology (Figure 11), all of which is suggestive of presence in LUCA. YlqF proteins such as 1pujA may have been transferred from bacteria to eukaryotes once during the early stages of eukaryotic evolution (probably from the proto-mitochondrion), and a second time from chloroplasts to plants. It has been postulated that GTPase activity is a result of adaptation. The GTP concentration is more constant within a cell and not subject to the same fluctuations as ATP. 
Hence specificity for GTP as a substrate was recruited for crucial functions such as translation (Leipe, D. et al. 2002). The splitting of eukaryotes into two groups (Figure 11) is due to an additional N-terminal domain (Figure 10). This domain, a suspected coiled-coil region associated with RNA and ribosomal attachment, distinguishes proteins of the YawG family. It has been shown that YlqF and YawG share GTP-related domains (Figures 8 & 9) and C-terminal domains; however, YlqF lacks the N-terminal coiled-coil regions of YawG. Despite this, YlqF proteins are still expected to associate with RNA or ribosomes due to the shared C-terminal domains (Anand, B. et al. 2006). The phylogenetic tree also suggests that YlqF is ancestral to YawG. <br />
<br />
----<br />
<br />
<br />
Structural, functional and evolutionary analyses collectively indicate that 1pujA is a GTPase of the YlqF family. High conservation of YlqF over large spans of evolutionary time (Figure 11) indicates strong stabilising selection and suggests a role in one or several crucial biological processes (Leipe, D. et al. 2002). The functional results, in particular expression in all tissues of humans and mice, indicate that YlqF has a very important function. Although the current study cannot directly show association of 1pujA with ribosomes, it is concluded that YlqF proteins are GTPases likely to be involved in a fundamental biological process such as translation. This study supports the work of Matsuo et al. (2006), in which it was hypothesised that YlqF is needed for correct assembly of the 50S ribosomal subunit.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Introduction_5&diff=5393Introduction 52007-06-12T00:59:09Z<p>ScottAllen: /* Introduction */</p>
<hr />
<div>==Introduction==<br />
<br />
Previous studies have located GTPases in a diverse array of bacteria and in all eukaryotes. GTPases are characterised by their use of GTP instead of ATP as a substrate. They are known to regulate many fundamental cellular processes such as translation, cell signalling, intracellular trafficking and cytoskeletal re-organisation (Anand, B. et al. 2006; Leipe, D. et al. 2002). The NKxD and Walker B motifs of GTPases specify the utilisation of GTP. 1pujA is a known YlqF GTPase of ''Bacillus subtilis'' (Matsuo, Y. et al. 2006). ''B. subtilis'' is a Gram-positive, catalase-positive bacterium commonly found in soil. ''B. subtilis'' has also been referred to as ''Bacillus globigii'', ''Hay bacillus'' or ''Grass bacillus'' (Wiki 2007). YlqF has previously been associated with the assembly of the 50S ribosomal subunit. ''B. subtilis'' cells in which YlqF activity was inhibited showed slow growth rates and a build-up of mis-folded 50S ribosomal subunits (Matsuo, Y. et al. 2006). A hypothesised circular permutation that places the NKxD motif N-terminal of the Walker A motif (in the primary structure) is characteristic of GTPases of the YlqF, YawG, YqeH and YjeQ families of proteins (Anand, B. et al. 2006; Leipe, D. et al. 2002). Our aim for this study is to determine the overall function of 1pujA and related YlqF proteins. To do this we will investigate the function using strategies such as literature searches, function prediction programs and programs that utilize additional high-throughput functional data. The structure and evolution of the protein will also be investigated, in the hope that this additional information will help us better understand the function of YlqF proteins. Structure will be investigated using web tools specific for structure comparison and structure analysis. The structure of YlqF will be compared with proteins which are similar in sequence and structure. 
Finally, evolution will be investigated using sequence searches, multiple sequence alignment and the building of a phylogenetic tree. Using data gathered from all these sources we hope to create a viable hypothesis for the function of YlqF proteins. ''In addition, the prospect of YlqF’s involvement in 50S ribosomal assembly will be discussed.''</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Discussion_5&diff=5391Discussion 52007-06-12T00:56:50Z<p>ScottAllen: </p>
<hr />
<div>'''A functional analysis of 1pujA indicated that our protein is a GTPase.'''<br />
<br />
ProFunc returned a number of promising results, including InterPro, PDB, SSM, DALI and 3D functional template searches. These results, while not individually significant, did support our hypothesis. The result of the UniProt search, however, was significant. A multiple sequence alignment search returned 10 significant matches, two of which had a percentage identity higher than 80%. Both of these sequences were listed as YlqF proteins. All of the proteins featured in these results were either GTPases or GTP-binding proteins. This sequence search therefore supports our hypothesis that 1pujA is a GTPase.<br />
<br />
The SymAtlas results show the expression levels of orthologs of 1pujA in different tissues of both mice and humans. The results indicated that these proteins are expressed across all tissues of both animals. This is consistent with a protein essential for fundamental biological processes. Though the results do not provide a greater level of detail than this, a GTPase would fall into this category, further supporting our hypothesis.<br />
<br />
ProKnow assigns function by extracting and interpreting protein features from sequence and structure. It uses a metaserver strategy through a knowledgebase of annotation profiles coupled with Bayesian scoring (REF= http://www.doe-mbi.ucla.edu/Services/ProKnow/proknow.png). The results of our ProKnow search indicated that GTP binding was the most likely molecular function, followed to a lesser degree by nucleotide binding, methyltransferase activity and GTPase activity. The predicted biological process was small GTPase-mediated signal transduction, which refers to any series of molecular signals in which a small monomeric GTPase relays one or more of the signals. These ProKnow results also support our hypothesis that 1pujA is a GTPase.<br />
<br />
----<br />
<br />
Structural analysis shows that YlqF is likely to be a GTPase. A significant finding came from the Dali search, which showed that many of the proteins with structures similar to YlqF are GTPases. However, the protein fold classifications given by SCOP, CATH and Dali presented different views of protein fold space, because they use different methods to define and categorize protein folds. Pfam analysis also revealed two domains, MG442 and MMR_HSR1, both of which are associated with GTP-binding proteins. The secondary structure analysis revealed that YlqF is 50% helical (13 helices containing 142 amino acids) and 10% beta sheet (6 strands containing 31 amino acids). <br />
<br />
Other studies have shown that YlqF, the smallest of the cpGTPases in domain composition, possesses a CPG domain and a C-terminal alpha-helical domain (Anand et al., 2006). They also suggest that YlqF lacks an N-terminal domain (Anand et al., 2006). Proteins like YlqF are therefore regarded as the prototypes of cpGTPases (Anand et al., 2006). They are not likely to exist as single GTP-binding domains (Anand et al., 2006). Studies suggest that cpGTPases require the presence of the C-terminal domain in order to bind GTP (Anand et al., 2006). <br />
<br />
<br />
----<br />
<br />
<br />
P-loop NTPases are the most abundant proteins in cellular organisms, constituting 10-18% of all gene products. They are distinguished by the Walker A motif (consensus GxxxxGK[ST]), the Walker B motif (consensus hhhDxxG, where h = hydrophobic residue), and the [NT]KxD motif, which is unique to P-loop NTPases. The Walker B and [NT]KxD motifs indicate specificity towards GTP (Leipe et al., 2002). The MSA of known and suspected GTPases displayed high conservation of these characterising motifs (Figures 8 & 9), indicating that 1pujA is likely a GTPase (Anand et al., 2006). Furthermore, all aligned sequences had the circularly permuted [NT]KxD motif, indicating that they are all part of the YlqF/YawG family within the P-loop NTPase superfamily (Leipe et al., 2002). It has previously been hypothesised that the Last Universal Common Ancestor (LUCA) of all extant life forms possessed several GTPases. If a particular GTPase family is widely represented in the three primary kingdoms (Archaea, Bacteria, and Eukaryota), this is evidence for its presence in LUCA. This is supported if the phylogenetic tree conforms to the “standard model” topology, having bacterial and archeo-eukaryotic primary clades. Conversely, a different topology such as a bacteria-eukaryote grouping could indicate presence in LUCA with the ancestral form in eukaryotes displaced by horizontal gene transfer (HGT) (Leipe et al., 2002). 1PUJA homologs/orthologs were found to be widely represented in the three primary kingdoms and conformed to the “standard model” topology (Figure 11), all of which is suggestive of presence in LUCA. YlqF proteins such as 1pujA may have been transferred from bacteria to eukaryotes once during the early stages of eukaryotic evolution (probably from the proto-mitochondrion), and a second time from chloroplasts to plants. It has been postulated that GTPase activity is a result of adaptation. The GTP concentration is more constant within a cell and not subject to the same fluctuations as ATP. 
Hence specificity for GTP as a substrate was recruited for crucial functions such as translation (Leipe et al., 2002). The splitting of eukaryotes into two groups (Figure 11) is due to an additional N-terminal domain (Figure 10). This domain is a suspected coiled coil region associated with RNA and ribosomal attachment. <br />
<br />
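The motif argument above can be checked directly against the YlqF sequence given in the Results. The following is a minimal sketch, not the method used for the MSA: it assumes the consensus strings can be translated literally into regular expressions (GxxxxGK[ST], the DxPG form of Walker B quoted in the Figure 9 caption, and [NT]KxD), and the names <code>find_motifs</code> and <code>nkxd_precedes_walker_a</code> are illustrative only.

```python
import re

# YlqF (1pujA) sequence as listed in the Results, with spacing removed.
YLQF = (
    "MTIQWFPGHMAKARREVTEKLKLIDIVYELVDARIPMSSRNPMIEDILKNKPRIMLLNKA"
    "DKADAAVTQQWKEHFENQGIRSLSINSVNGQGLNQIVPASKEILQEKFDRMRAKGVKPRA"
    "IRALIIGIPNVGKSTLINRLAKKNIAKTGDRPGITTSQQWVKVGKELELLDTPGILWPKF"
    "EDELVGLRLAVTGAIKDSIINLQDVAVFGLRFLEEHYPERLKERYGLDEIPEDIAELFDA"
    "IGEKRGCLMSGGLINYDKTTEVIIRDIRTEKFGRLSFEQPTM"
)

# Assumption: each consensus is read literally, with "x" meaning any residue.
MOTIFS = {
    "Walker A": r"G.{4}GK[ST]",
    "Walker B": r"D.PG",
    "NKxD": r"[NT]K.D",
}


def find_motifs(seq, motifs=MOTIFS):
    """Return {motif name: [(1-based start position, matched residues), ...]}."""
    return {
        name: [(m.start() + 1, m.group()) for m in re.finditer(pattern, seq)]
        for name, pattern in motifs.items()
    }


def nkxd_precedes_walker_a(hits):
    """True if the first NKxD match lies N-terminal to the first Walker A match."""
    return hits["NKxD"][0][0] < hits["Walker A"][0][0]


hits = find_motifs(YLQF)
for name, found in hits.items():
    print(name, found)
print("NKxD before Walker A:", nkxd_precedes_walker_a(hits))
```

A short pattern such as DxPG can match more than once in a sequence, so hits of this kind are only candidates until they are compared with the conserved columns of the alignment.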
----<br />
<br />
<br />
Structural, functional and evolutionary analyses collectively suggest that YlqF is a GTPase. High conservation of YlqF over large spans of evolutionary time (Figure ?) indicates strong stabilising selection and suggests involvement in one or several crucial biological roles (Leipe et al., 2002). The functional results, which show expression in all tissues of humans and mice, imply that YlqF has a very important function. In conclusion, YlqF is a GTPase likely to be involved in a fundamental biological process such as translation. This supports the study by Matsuo et al. (2002), in which it was hypothesised that YlqF is needed for correct assembly of the 50S ribosomal subunit.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5362Results 52007-06-12T00:38:30Z<p>ScottAllen: /* PYLOGENY */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was used to perform a multiple sequence alignment on our protein sequence. The top two hits were both highly similar to the query and annotated as YlqF proteins. The remaining proteins among the 10 listed were annotated as GTPases, GTP binding proteins or hypothetical proteins. <br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
'''Table 1.'''<br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
'''Figure 1.'''<br />
<br />
<br />
Although the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog protein, it is still relatively common in all major tissues. <br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
'''Figure 2.'''<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
'''Table 2.'''<br />
<br />
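To make the reading of the ProKnow output explicit, the following minimal sketch parses the rows shown above and picks the highest-scoring term in each category. The column order is taken from the table header; the function names <code>parse_proknow</code> and <code>best_term</code> are illustrative and are not part of ProKnow.

```python
# Rows copied from the ProKnow output above:
# GO code, Bayesian score, evidence rank, number of clues, description.
PROKNOW_OUTPUT = """\
Molecular Function
0005525 0.9815 2.8 6 GTP binding
0000166 0.0101 2.9 6 nucleotide binding
0008168 0.0068 2.9 6 methyltransferase activity
0003924 0.0016 2.5 6 GTPase activity
Biological Process
0007264 1.0000 2.8 6 small GTPase mediated signal transduction
"""


def parse_proknow(text):
    """Return a list of dicts, one per GO term, tagged with its category."""
    rows, category = [], None
    for line in text.splitlines():
        parts = line.split(None, 4)
        if len(parts) == 5 and parts[0].isdigit():
            rows.append({
                "category": category,
                "go": "GO:" + parts[0],
                "score": float(parts[1]),
                "evidence_rank": float(parts[2]),
                "clues": int(parts[3]),
                "description": parts[4].strip(),
            })
        elif line.strip():
            # Any non-empty line that is not a data row is a category heading.
            category = line.strip()
    return rows


def best_term(rows, category):
    """Return the row with the highest Bayesian score in one category."""
    return max((r for r in rows if r["category"] == category),
               key=lambda r: r["score"])


rows = parse_proknow(PROKNOW_OUTPUT)
top_function = best_term(rows, "Molecular Function")
print(top_function["go"], top_function["description"], top_function["score"])
```

Within the molecular function category the four Bayesian scores sum to 1, so GTP binding carries almost all of the weight while GTPase activity itself scores only 0.0016.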
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of ''Bacillus subtilis'' YlqF protein consists of a polymer containing 282 amino acids (Figures 3 & 4). It also contains two ligands, a magnesium ion and a phosphoaminophosphonic acid-guanylate ester (Figure 5). The structure has been refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 3 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% of all residues lie in the allowed regions.<br />
<br />
<br />
{| border="1"<br />
|-<br />
| '''Parameters''' || Resolution[Å] <br />
2.00 <br />
|R factor, %<br />
21.6<br />
|Free R factor, %<br />
25.0<br />
|Space Group<br />
P 21 21 21 <br />
|-<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! '''Unit Cell'''<br />
|Length[Å] <br />
Angles [°]<br />
|a<br />
alpha<br />
|36.75<br />
90.00<br />
|b<br />
beta<br />
|68.57<br />
90.00 <br />
|c <br />
gamma <br />
|105.57 <br />
90.00 <br />
|}<br />
<br />
'''Table 3.''' Crystal parameters and refinement statistics<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 3.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 4.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
The secondary structure of YlqF consists mainly of helix (50%; 13 helices, 142 residues) and beta sheet (10%; 6 strands, 31 residues) (see Figure 6). YlqF consists of two domains (Figure 7). The first domain comprises residues 1-177, has a Rossmann fold of the α/β class, and forms a 3-layer sandwich structure. The second domain comprises residues 178-282, is annotated as a conserved hypothetical protein of the mainly α class, and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
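The secondary structure percentages can be reproduced from the residue counts quoted above. This is a small worked example, assuming the percentages are taken over all 282 residues of the chain; the helper name <code>percent_of_chain</code> is illustrative.

```python
CHAIN_LENGTH = 282     # residues in the YlqF chain
HELIX_RESIDUES = 142   # residues in the 13 helices
STRAND_RESIDUES = 31   # residues in the 6 beta strands


def percent_of_chain(n_residues, chain_length=CHAIN_LENGTH):
    """Percentage of the chain covered by n_residues."""
    return 100.0 * n_residues / chain_length


helix_pct = percent_of_chain(HELIX_RESIDUES)
strand_pct = percent_of_chain(STRAND_RESIDUES)
other_residues = CHAIN_LENGTH - HELIX_RESIDUES - STRAND_RESIDUES

print(f"helix:  {helix_pct:.1f}%")
print(f"strand: {strand_pct:.1f}%")
print(f"residues in neither helix nor strand: {other_residues}")
```

Under that assumption the helix content agrees with the quoted 50%, while the strand content comes out nearer 11% than 10%; the quoted figure may have been truncated or computed over a different residue count.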
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 6.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 7.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in the table below. The results of Pfam classification described YlqF as GTPase of unknown function, Rhomboid family, and catalytic domain of alpha amylase. <br />
<br />
{| border="1"<br />
|-<br />
|'''Class'''<br />
Alpha and beta proteins <br />
|'''Fold'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases<br />
|'''Superfamily'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases <br />
|'''Family''' <br />
G proteins<br />
|'''Domain''' <br />
Probable GTPase YlqF<br />
|'''Species''' <br />
Bacillus subtilis <br />
|-<br />
|}<br />
<br />
Most of the Dali search hits are Ras-family proteins, which are small GTPases. This suggests that YlqF might be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table 4.''' Dali search results: PDB/chain identifiers and structural alignment statistics<br />
<br />
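The statement that most Dali hits are small GTPases can be checked by tallying the hit descriptions. The sketch below parses the Dali hit list above and counts descriptions that mention GTP or a Ras, Rab, Rap or Rho family name. The keyword list is an assumption made for illustration, not a Dali feature, and it undercounts: for example, ADP-ribosylation factor-like protein 3 is also a small GTPase but does not match any keyword.

```python
import re

# Hit list copied from the Dali output above:
# No, chain, raw score, Z-score, %id, lali, rmsd, description.
DALI_HITS = """\
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3
"""

# Assumption: a description mentioning any of these is GTPase-related.
GTPASE_KEYWORDS = re.compile(r"GTP|RAS|RAB|RAP|RHO")


def parse_dali(text):
    """Return one dict per hit with the chain, Z-score and description."""
    hits = []
    for line in text.splitlines():
        fields = line.split(None, 7)
        if len(fields) == 8:
            hits.append({
                "chain": fields[1],
                "z_score": float(fields[3]),
                "description": fields[7].strip(),
            })
    return hits


# Drop the query structure itself (1pujA) before counting.
hits = [h for h in parse_dali(DALI_HITS) if h["chain"] != "1pujA"]
gtpase_like = [h for h in hits if GTPASE_KEYWORDS.search(h["description"])]
print(f"{len(gtpase_like)} of {len(hits)} hits match a GTPase keyword")
```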
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP binding proteins, whose members perform different functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region. It is highly homologous to MMR1, a putative GTP binding protein from mouse. These proteins represent a new subfamily of GTP binding proteins with prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures 8 & 9) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure 8 shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across the three kingdoms. Figure 9 shows that the Walker A and Walker B motifs are also conserved, along with a conserved threonine between them. The phylogenetic tree (Figure 11) displayed three distinct groupings: a bacterial grouping (blue), an entirely eukaryotic grouping (green) and an archeo-eukaryotic grouping (red). Further analysis of the MSA revealed that the splitting of eukaryotes was due to a suspected additional N-terminal domain. The archeo-eukaryotic grouping (Figure 11) appears to have an N-terminal coiled coil region (Figure 10) (Anand et al., 2006). The tree conforms to the “standard topology” of bacterial and archeo-eukaryotic clades (Leipe et al., 2002) with no evidence of horizontal gene transfer.<br />
<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
'''Figure 8.''' Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
'''Figure 9.''' From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:YawG MSA.jpg]]<br />
<br />
'''Figure 10.''' MSA showing the coiled coil domain of YawG proteins. Pink boxes are conserved regions within the coiled coil domain.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
'''Figure 11.''' Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archeo-eukaryotic group (red). Red stars denote less than 95% confidence in a branch as determined via 100 bootstraps.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=File:YawG_MSA.jpg&diff=5358File:YawG MSA.jpg2007-06-12T00:35:54Z<p>ScottAllen: </p>
<hr />
<div></div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5355Results 52007-06-12T00:35:21Z<p>ScottAllen: /* PYLOGENY */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was used to perform a multiple sequence alignment on our protein sequence. The top two hits were both highly similar to the query and annotated as YlqF proteins. The remaining proteins among the 10 listed were annotated as GTPases, GTP binding proteins or hypothetical proteins. <br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
'''Table 1.'''<br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
'''Figure 1.'''<br />
<br />
<br />
Although the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog protein, it is still relatively common in all major tissues. <br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
'''Figure 2.'''<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
'''Table 2.'''<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of ''Bacillus subtilis'' YlqF protein consists of a polymer containing 282 amino acids (Figures 3 & 4). It also contains two ligands, a magnesium ion and a phosphoaminophosphonic acid-guanylate ester (Figure 5). The structure has been refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 3 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% of all residues lie in the allowed regions.<br />
<br />
<br />
{| border="1"<br />
|-<br />
| '''Parameters''' || Resolution[Å] <br />
2.00 <br />
|R factor, %<br />
21.6<br />
|Free R factor, %<br />
25.0<br />
|Space Group<br />
P 21 21 21 <br />
|-<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! '''Unit Cell'''<br />
|Length[Å] <br />
Angles [°]<br />
|a<br />
alpha<br />
|36.75<br />
90.00<br />
|b<br />
beta<br />
|68.57<br />
90.00 <br />
|c <br />
gamma <br />
|105.57 <br />
90.00 <br />
|}<br />
<br />
'''Table 3.''' Crystal parameters and refinement statistics<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 3.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 4.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
The secondary structure of YlqF consists mainly of helix (50%; 13 helices, 142 residues) and beta sheet (10%; 6 strands, 31 residues) (see Figure 6). YlqF consists of two domains (Figure 7). The first domain comprises residues 1-177, has a Rossmann fold of the α/β class, and forms a 3-layer sandwich structure. The second domain comprises residues 178-282, is annotated as a conserved hypothetical protein of the mainly α class, and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 6.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 7.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in the table below. The results of Pfam classification described YlqF as GTPase of unknown function, Rhomboid family, and catalytic domain of alpha amylase. <br />
<br />
{| border="1"<br />
|-<br />
|'''Class'''<br />
Alpha and beta proteins <br />
|'''Fold'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases<br />
|'''Superfamily'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases <br />
|'''Family''' <br />
G proteins<br />
|'''Domain''' <br />
Probable GTPase YlqF<br />
|'''Species''' <br />
Bacillus subtilis <br />
|-<br />
|}<br />
<br />
Most of the Dali search hits are Ras-family proteins, which are small GTPases. This suggests that YlqF might be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table 4.''' Dali search results: PDB/chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP binding proteins, whose members perform different functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region. It is highly homologous to MMR1, a putative GTP binding protein from mouse. These proteins represent a new subfamily of GTP binding proteins with prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures 8 & 9) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure 8 shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across the three kingdoms. Figure 9 shows that the Walker A and Walker B motifs are also conserved, along with a conserved threonine between them. The phylogenetic tree (Figure 11) displayed three distinct groupings: a bacterial grouping (blue), an entirely eukaryotic grouping (green) and an archeo-eukaryotic grouping (red). Further analysis of the MSA revealed that the splitting of eukaryotes was due to a suspected additional N-terminal domain. The archeo-eukaryotic grouping (Figure 10) appears to have an N-terminal coiled coil region (Anand et al., 2006). The tree conforms to the “standard topology” of bacterial and archeo-eukaryotic clades (Leipe et al., 2002) with no evidence of horizontal gene transfer.<br />
<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
'''Figure 8.''' Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
'''Figure 9.''' From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:YawG MSA.jpg]]<br />
<br />
'''Figure 10.''' MSA showing the coiled coil domain of YawG proteins. Pink boxes are conserved regions within the coiled coil domain.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
'''Figure 11.''' Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archeo-eukaryotic group (red). Red stars denote less than 95% confidence in a branch as determined via 100 bootstraps.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5322Results 52007-06-11T23:58:12Z<p>ScottAllen: /* PYLOGENY */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was used to perform a multiple sequence alignment on our protein sequence. The top two hits were both highly similar to the query and annotated as YlqF proteins. The remaining proteins among the 10 listed were annotated as GTPases, GTP binding proteins or hypothetical proteins. <br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
'''Table 1.'''<br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
'''Figure 1.'''<br />
<br />
<br />
Although the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog protein, it is still relatively common in all major tissues. <br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
'''Figure 2.'''<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
'''Table 2.'''<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of ''Bacillus subtilis'' YlqF protein consists of a polymer containing 282 amino acids (Figures 3 & 4). It also contains two ligands, a magnesium ion and a phosphoaminophosphonic acid-guanylate ester (Figure 5). The structure has been refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 3 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% of all residues lie in the allowed regions.<br />
<br />
<br />
{| border="1"<br />
|-<br />
| '''Parameters''' || Resolution[Å] <br />
2.00 <br />
|R factor, %<br />
21.6<br />
|Free R factor, %<br />
25.0<br />
|Space Group<br />
P 21 21 21 <br />
|-<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! '''Unit Cell'''<br />
|Length[Å] <br />
Angles [°]<br />
|a<br />
alpha<br />
|36.75<br />
90.00<br />
|b<br />
beta<br />
|68.57<br />
90.00 <br />
|c <br />
gamma <br />
|105.57 <br />
90.00 <br />
|}<br />
<br />
'''Table 3.''' Crystal parameters and refinement statistics<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 3.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 4.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
The secondary structure of YlqF consists mainly of helix (50%; 13 helices, 142 residues) and beta sheet (10%; 6 strands, 31 residues) (see Figure 6). YlqF consists of two domains (Figure 7). The first domain comprises residues 1-177, has a Rossmann fold of the α/β class, and forms a 3-layer sandwich structure. The second domain comprises residues 178-282, is annotated as a conserved hypothetical protein of the mainly α class, and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 6.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 7.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. Pfam classification returned matches described as GTPase of unknown function, Rhomboid family, and catalytic domain of alpha amylase. <br />
<br />
{| border="1"<br />
|-<br />
|'''Class'''<br />
Alpha and beta proteins <br />
|'''Fold'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases<br />
|'''Superfamily'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases <br />
|'''Family''' <br />
G proteins<br />
|'''Domain''' <br />
Probable GTPase YlqF<br />
|'''Species''' <br />
Bacillus subtilis <br />
|-<br />
|}<br />
<br />
Many of the structural neighbours returned by the Dali search are Ras-related proteins, which are small GTPases. This suggests that YlqF may itself be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table 3.''' Dali search results of PDB/chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP-binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, whose members have a range of different functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region and is highly homologous to MMR1, a putative GTP-binding protein from mouse. These proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures 8 & 9) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure 8 shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal of the Walker A motif and is conserved across the three kingdoms. Figure 9 shows that the Walker A and Walker B motifs are also conserved, along with a conserved threonine between them. The phylogenetic tree (Figure 10) displays three distinct groupings: a bacterial grouping (blue), an entirely eukaryotic grouping (green) and an archeo-eukaryotic grouping (red). Further analysis of the MSA revealed that the splitting of the eukaryotes was due to a suspected additional N-terminal domain. The archeo-eukaryotic grouping (Figure ?) appears to have an N-terminal coiled-coil region (Anand, B. et al. 2006). The tree conforms to the “standard topology” of bacterial and archeo-eukaryotic clades (Leipe, D. et al. 2002) with no evidence of horizontal gene transfer.<br />
<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
'''Figure 8.''' Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
'''Figure 9.''' From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
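The motif positions described in Figures 8 and 9 can be located in the YlqF sequence with a simple pattern scan. The sketch below is illustrative only and was not part of the original Clustal X analysis. The helper name find_motif is hypothetical, the consensus patterns are the ones quoted in the figure captions, and an x in a consensus is treated as any single residue.

```python
import re

# YlqF sequence as listed on this page (282 residues).
YLQF = (
    "MTIQWFPGHMAKARREVTEKLKLIDIVYELVDARIPMSSRNPMIEDILKNKPRIMLLNKA"
    "DKADAAVTQQWKEHFENQGIRSLSINSVNGQGLNQIVPASKEILQEKFDRMRAKGVKPRA"
    "IRALIIGIPNVGKSTLINRLAKKNIAKTGDRPGITTSQQWVKVGKELELLDTPGILWPKF"
    "EDELVGLRLAVTGAIKDSIINLQDVAVFGLRFLEEHYPERLKERYGLDEIPEDIAELFDA"
    "IGEKRGCLMSGGLINYDKTTEVIIRDIRTEKFGRLSFEQPTM"
)

# Consensus motifs from the figure captions, written as regular expressions.
MOTIFS = {
    "NKxD": r"NK.D",
    "Walker A": r"G.{4}GK[ST]",
    "Walker B": r"D.PG",
}


def find_motif(sequence, pattern):
    """Return (1-based start, matched text) for every hit, overlaps included."""
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]


hits = {name: find_motif(YLQF, pattern) for name, pattern in MOTIFS.items()}
for name, found in hits.items():
    print(name, found)
```

On this sequence the NKxD pattern matches once, at residue 58, upstream of the single Walker A match at residue 127, which is consistent with the N-terminal placement of NKxD described above. The short DxPG pattern matches twice (residues 150 and 171), so a pattern scan alone does not identify which match is the functional Walker B motif.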
<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
'''Figure 10.''' Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archeo-eukaryotic group (red). Red stars denote less than 95% confidence in branch as determined via 100 bootstraps.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5309Results 52007-06-11T23:51:01Z<p>ScottAllen: /* PYLOGENY */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis, [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was used to perform a multiple sequence alignment on our protein sequence. The top 2 results were both highly conserved and both annotated as YlqF proteins. Of the remaining eight proteins listed, one is annotated as YlqF, five as GTPases or GTP-binding proteins, and two as hypothetical proteins. <br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
'''Table 1.'''<br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
'''Figure 1.'''<br />
<br />
<br />
Though the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog, the protein is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
'''Figure 2.'''<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
'''Table 2.'''<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of the ''Bacillus subtilis'' YlqF crystal structure consists of a polymer containing 282 amino acids (Figures 1 & 2). It also contains two ligands: a magnesium ion and a phosphoaminophosphonic acid-guanylate ester (Figure 3). The structure has been refined at a resolution of 2.00 Å to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 1 summarizes the refinement statistics including protein quality parameters. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% of all residues lie in the allowed regions.<br />
<br />
<br />
{| border="1"<br />
|-<br />
| '''Parameters''' || Resolution[Å] <br />
2.00 <br />
|R factor, %<br />
21.6<br />
|Free R factor, %<br />
25.0<br />
|Space Group<br />
P 21 21 21 <br />
|-<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! '''Unit Cell'''<br />
|Length[Å] <br />
Angles [°]<br />
|a<br />
alpha<br />
|36.75<br />
90.00<br />
|b<br />
beta<br />
|68.57<br />
90.00 <br />
|c <br />
gamma <br />
|105.57 <br />
90.00 <br />
|}<br />
<br />
'''Table 3.''' Crystal parameters and refinement statistics<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 3.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 4.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
The secondary structure of YlqF consists of approximately 50% helix (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 3). The protein has two domains (Figure 4). The first, comprising residues 1-177, belongs to the α/β class, contains a Rossmann fold and forms a 3-layer sandwich structure. The second, comprising residues 178-282, is mainly α, is annotated as a conserved hypothetical protein and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 6.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 7.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. Pfam classification returned matches described as GTPase of unknown function, Rhomboid family, and catalytic domain of alpha amylase. <br />
<br />
{| border="1"<br />
|-<br />
|'''Class'''<br />
Alpha and beta proteins <br />
|'''Fold'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases<br />
|'''Superfamily'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases <br />
|'''Family''' <br />
G proteins<br />
|'''Domain''' <br />
Probable GTPase YlqF<br />
|'''Species''' <br />
Bacillus subtilis <br />
|-<br />
|}<br />
<br />
Many of the structural neighbours returned by the Dali search are Ras-related proteins, which are small GTPases. This suggests that YlqF may itself be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table 3.''' Dali search results of PDB/chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP-binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, whose members have a range of different functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region and is highly homologous to MMR1, a putative GTP-binding protein from mouse. These proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures 8 & 9) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure 8 shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal of the Walker A motif and is conserved across the three kingdoms. Figure 9 shows that the Walker A and Walker B motifs are also conserved, along with a conserved threonine between them. The phylogenetic tree (Figure 10) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archeo-eukaryotic clades (Leipe, D. et al. 2002) with no evidence of horizontal gene transfer.<br />
<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
'''Figure 8.''' Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
'''Figure 9.''' From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
'''Figure 10.''' Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archeo-eukaryotic group (red). Red stars denote less than 95% confidence in branch as determined via 100 bootstraps.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Discussion_5&diff=5263Discussion 52007-06-11T23:16:21Z<p>ScottAllen: </p>
<hr />
<div>'''A functional analysis of 1pujA indicated that our protein is a GTPase.'''<br />
<br />
ProFunc returned a number of promising results, including InterPro, PDB, SSM, DALI and 3D functional template searches. These results, while not significant, did support our hypothesis. The result of the UniProt search, however, was significant. A multiple sequence alignment returned 10 significant matches, 2 of which had a percentage identity higher than 80%. Both of these sequences were listed as YlqF proteins. Most of the remaining proteins in these results were annotated as GTPases or GTP-binding proteins. This sequence search therefore supports our hypothesis that 1pujA is a GTPase.<br />
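The 80% identity cut-off mentioned above can be applied directly to the aligned sequences. This is a minimal sketch: the accession numbers and percentage identities are copied from Table 1 of the results, and the threshold variable is only an illustration of the cut-off used in the text.

```python
# Percentage identity to the query, keyed by UniProt accession (from Table 1).
identities = {
    "O31743": 95.6, "Q65JP4": 84.6, "Q845L2": 70.1, "Q9Z9S1": 59.9,
    "Q4MFY1": 65.9, "Q5WFP0": 53.5, "Q5HPU7": 50.9, "Q3XZB6": 52.4,
    "A0Q0X8": 49.3, "Q03KX8": 47.6,
}
threshold = 80.0  # percent identity cut-off quoted in the text

# Keep matches above the cut-off, highest identity first.
above = sorted(
    (acc for acc, pct in identities.items() if pct > threshold),
    key=identities.get,
    reverse=True,
)
print(above)
```

Only the two entries annotated as YlqF proteins pass the cut-off.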
<br />
The SymAtlas results show the expression levels of orthologs of 1pujA in different tissues of both mice and humans. The results indicated that these proteins are widely distributed in all tissues of both animals. This is consistent with a protein essential for fundamental biological processes. Though the results do not provide a greater level of detail than this, a GTPase would fall into this category, further supporting our hypothesis.<br />
<br />
ProKnow assigns function by extracting and interpreting protein features from sequence and structure. It uses a metaserver strategy, combining a knowledgebase of annotation profiles with Bayesian scoring. (REF= http://www.doe-mbi.ucla.edu/Services/ProKnow/proknow.png) The results of our ProKnow search indicated that the molecular function was GTP binding and, to a much lesser degree, nucleotide binding, methyltransferase activity and GTPase activity. The biological process was small GTPase mediated signal transduction, which refers to any series of molecular signals in which a small monomeric GTPase relays one or more of the signals. These ProKnow results also support our hypothesis that 1pujA is a GTPase.<br />
<br />
----<br />
<br />
Structural analysis shows that YlqF is likely to be a GTPase. The most significant finding came from the Dali search, which showed that many of the proteins with structures similar to YlqF are GTPases. However, the protein fold classifications determined by SCOP, CATH and Dali gave different views of protein fold space, because they use different methods to define and categorize protein folds. The secondary structure analysis revealed that YlqF is 50% helical (13 helices containing 142 amino acids) and 10% beta sheet (6 strands containing 31 amino acids). <br />
<br />
----<br />
<br />
<br />
P-loop NTPases are the most abundant proteins in cellular organisms, constituting 10-18% of all gene products. They are distinguished by the Walker A motif (consensus GxxxxGK[ST]), Walker B motif (consensus hhhDxxG, where h = hydrophobic residue), and the [NT]KxD motif which is unique to P-loop NTPases. The Walker B and [NT]KxD motifs indicate specificity towards GTP (Leipe, D. et al. 2002). MSA of known and suspected GTPases displayed high conservation of these characterising motifs (Figures 8 & 9), indicating that 1pujA is likely a GTPase (Anand, B. 2006). Furthermore, all sequences aligned had the [NT]KxD motif circular permutation, indicating that they are all part of the YlqF family of the P-loop NTPase superfamily (Leipe, D. et al. 2002). It has previously been hypothesised that the Last Universal Common Ancestor (LUCA) of all extant life forms possessed several GTPases. If a particular GTPase family is widely represented in the three primary kingdoms (Archaea, Bacteria, and Eukaryota), this is evidence for its presence in LUCA. This is supported if the phylogenetic tree conforms to the “standard model” topology, having bacterial and archeo-eukaryotic primary clades. Conversely, a different topology such as a bacteria-eukaryote grouping could indicate presence in LUCA with the ancestral form in eukaryotes displaced by horizontal gene transfer (HGT) (Leipe, D. et al. 2002). 1pujA was found to be widely represented in the three primary kingdoms and conformed to the “standard model” topology, all of which suggests its presence in LUCA. YlqF proteins such as 1pujA may have been transferred from bacteria to eukaryotes once during the early stages of eukaryotic evolution (probably from the proto-mitochondrion), and a second time from chloroplasts to plants. It has been proposed that GTPase activity is a result of adaptation. GTP is more constant within a cell and not subject to the same fluctuations as ATP. 
Hence, specificity for GTP as a substrate may have been recruited for crucial functions such as translation (Leipe, D. et al. 2002). <br />
<br />
----<br />
<br />
<br />
Structural, functional and evolutionary analyses collectively suggest that YlqF is a GTPase. High conservation of YlqF over large spans of evolutionary time (Figure ?) indicates strong stabilising selection and suggests a role in one or several crucial biological processes (Leipe, D. et al. 2002). The functional results, in particular expression in all tissues of humans and mice, indicate that YlqF has a very important function. In conclusion, YlqF is a GTPase likely to be involved in a fundamental biological process such as translation. This supports the study by Matsuo et al. (2006), in which it was hypothesised that YlqF is needed for correct assembly of the 50S ribosomal subunit.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Introduction_5&diff=5251Introduction 52007-06-11T22:43:12Z<p>ScottAllen: </p>
<hr />
<div>==Introduction==<br />
<br />
Previous studies have located GTPases in a diverse array of bacteria and in all eukaryotes. GTPases are characterised by their use of GTP instead of ATP as a substrate. They are known to regulate many fundamental cellular processes such as translation, cell-signalling, intracellular trafficking and cytoskeletal re-organisation (Anand, B. et al. 2006; Leipe, D. et al. 2002). The NKxD and Walker B motifs of GTPases specify the utilisation of GTP. 1pujA is a known YlqF GTPase of ''Bacillus subtilis'' (Matsuo, Y. et al. 2006). ''B. subtilis'' is a gram positive, catalase positive bacterium commonly found in soil. ''B. subtilis'' has also been referred to as ''Bacillus globigii'', ''Hay bacillus'' or ''Grass bacillus'' (Wiki 2007). YlqF has previously been associated with the assembly of the 50S ribosomal subunit. ''B. subtilis'' cells in which YlqF activity was inhibited showed slow growth rates and a build-up of mis-folded 50S ribosomal subunits (Matsuo, Y. et al. 2006). A hypothesised circular permutation that places the NKxD motif N-terminal of the Walker A motif (primary structure) is characteristic of GTPases of the YlqF family (Leipe, D. et al. 2002). Our aim for this study is to determine the overall function of 1pujA and related YlqF proteins. To do this we will investigate the function using strategies such as literature searches, function prediction programs and programs that utilize additional high-throughput functional data. The structure and evolution of the protein will also be investigated, in the hope that this additional information will help us better understand the function of YlqF proteins. Structure will be investigated using web tools specific for structure comparison and structure analysis. Finally, evolution will be investigated using sequence searches, multiple sequence alignment and the building of a phylogenetic tree. Using data gathered from all these sources we hope to create a viable hypothesis for the function of YlqF proteins. 
''In addition the prospect of YlqF’s involvement in 50S ribosomal assembly will be discussed.''<br />
<br />
<br />
structure...</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Discussion_5&diff=5237Discussion 52007-06-11T22:35:21Z<p>ScottAllen: </p>
<hr />
<div>'''A functional analysis of 1pujA indicated that our protein is a GTPase.'''<br />
<br />
ProFunc returned a number of promising results, including InterPro, PDB, SSM, DALI and 3D functional template searches. These results, while not significant, did support our hypothesis. The result of the UniProt search, however, was significant. A multiple sequence alignment returned 10 significant matches, 2 of which had a percentage identity higher than 80%. Both of these sequences were listed as YlqF proteins. Most of the remaining proteins in these results were annotated as GTPases or GTP-binding proteins. This sequence search therefore supports our hypothesis that 1pujA is a GTPase.<br />
<br />
The SymAtlas results show the expression levels of orthologs of 1pujA in different tissues of both mice and humans. The results indicated that these proteins are widely distributed in all tissues of both animals. This is consistent with a protein essential for fundamental biological processes. Though the results do not provide a greater level of detail than this, a GTPase would fall into this category, further supporting our hypothesis.<br />
<br />
ProKnow assigns function by extracting and interpreting protein features from sequence and structure. It uses a metaserver strategy, combining a knowledgebase of annotation profiles with Bayesian scoring. (REF= http://www.doe-mbi.ucla.edu/Services/ProKnow/proknow.png) The results of our ProKnow search indicated that the molecular function was GTP binding and, to a much lesser degree, nucleotide binding, methyltransferase activity and GTPase activity. The biological process was small GTPase mediated signal transduction, which refers to any series of molecular signals in which a small monomeric GTPase relays one or more of the signals. These ProKnow results also support our hypothesis that 1pujA is a GTPase.<br />
<br />
----<br />
<br />
Structural analysis shows that YlqF is likely to be a GTPase. The most significant finding came from the Dali search, which showed that many of the proteins with structures similar to YlqF are GTPases. However, the protein fold classifications determined by SCOP, CATH and Dali gave different views of protein fold space, because they use different methods to define and categorize protein folds. The secondary structure analysis revealed that YlqF is 50% helical (13 helices containing 142 amino acids) and 10% beta sheet (6 strands containing 31 amino acids). <br />
<br />
----<br />
<br />
<br />
P-loop NTPases are the most abundant proteins in cellular organisms, constituting 10-18% of all gene products. They are distinguished by the Walker A motif (consensus GxxxxGK[ST]), Walker B motif (consensus hhhDxxG, where h = hydrophobic residue), and the [NT]KxD motif which is unique to P-loop NTPases. The Walker B and [NT]KxD motifs indicate specificity towards GTP (Leipe, D. et al. 2002). MSA of known and suspected GTPases displayed high conservation of these characterising motifs (Figures 8 & 9), indicating that 1pujA is likely a GTPase. Furthermore, all sequences aligned had the [NT]KxD motif circular permutation, indicating that they are all part of the YlqF family of the P-loop NTPase superfamily (Leipe, D. et al. 2002). It has previously been hypothesised that the Last Universal Common Ancestor (LUCA) of all extant life forms possessed several GTPases. If a particular GTPase family is widely represented in the three primary kingdoms (Archaea, Bacteria, and Eukaryota), this is evidence for its presence in LUCA. This is supported if the phylogenetic tree conforms to the “standard model” topology, having bacterial and archeo-eukaryotic primary clades. Conversely, a different topology such as a bacteria-eukaryote grouping could indicate presence in LUCA with the ancestral form in eukaryotes displaced by horizontal gene transfer (HGT) (Leipe, D. et al. 2002). 1pujA was found to be widely represented in the three primary kingdoms and conformed to the “standard model” topology, all of which suggests its presence in LUCA. YlqF proteins such as 1pujA may have been transferred from bacteria to eukaryotes once during the early stages of eukaryotic evolution (probably from the proto-mitochondrion), and a second time from chloroplasts to plants. It has been proposed that GTPase activity is a result of adaptation. GTP is more constant within a cell and not subject to the same fluctuations as ATP. 
Hence, specificity for GTP as a substrate may have been recruited for crucial functions such as translation (Leipe, D. et al. 2002). <br />
<br />
----<br />
<br />
<br />
Structural, functional and evolutionary analyses collectively suggest that YlqF is a GTPase. High conservation of YlqF over large spans of evolutionary time (Figure ?) indicates strong stabilising selection and suggests a role in one or several crucial biological processes (Leipe, D. et al. 2002). The functional results, in particular expression in all tissues of humans and mice, indicate that YlqF has a very important function. In conclusion, YlqF is a GTPase likely to be involved in a fundamental biological process such as translation. This supports the study by Matsuo et al. (2006), in which it was hypothesised that YlqF is needed for correct assembly of the 50S ribosomal subunit.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5224Results 52007-06-11T22:07:12Z<p>ScottAllen: </p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis, [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was used to perform a multiple sequence alignment on our protein sequence. The top 2 results were both highly conserved and both annotated as YlqF proteins. Of the remaining eight proteins listed, one is annotated as YlqF, five as GTPases or GTP-binding proteins, and two as hypothetical proteins. <br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
'''Table 1.'''<br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
'''Figure 1.'''<br />
<br />
<br />
Though the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog, the protein is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
'''Figure 2.'''<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
'''Table 2.'''<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of the ''Bacillus subtilis'' YlqF protein consists of a single polymer chain of 282 amino acids (Figures 3 & 4). It also contains two ligands: a magnesium ion and phosphoaminophosphonic acid-guanylate ester (Figure 5). The structure was refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 3 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% lie in the allowed regions.<br />
<br />
<br />
{| border="1"<br />
|-<br />
! Parameter !! Resolution [Å] !! R factor [%] !! Free R factor [%] !! Space group<br />
|-<br />
| Value || 2.00 || 21.6 || 25.0 || P 2<sub>1</sub>2<sub>1</sub>2<sub>1</sub><br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! Unit cell !! a / α !! b / β !! c / γ<br />
|-<br />
| Length [Å] || 36.75 || 68.57 || 105.57<br />
|-<br />
| Angle [°] || 90.00 || 90.00 || 90.00<br />
|}<br />
<br />
'''Table 3.''' Crystal parameters and refinement statistics<br />
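
For readers unfamiliar with the statistics quoted here, the crystallographic R factor is conventionally defined as below. This is the standard textbook definition rather than anything taken from the YlqF deposition, and the symbols (observed and calculated structure-factor amplitudes, summed over reflections hkl) are introduced here only for illustration.

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

The free R factor uses the same expression, but the sums run only over a small test set of reflections that was excluded from refinement, which is why it is normally somewhat higher than R (here 25.0% against 21.6%).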
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 3.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 4.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
The secondary structure of YlqF is approximately 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 6). The YlqF protein consists of two domains (Figure 7). The first domain has a Rossmann fold of the α/β class; it comprises residues 1-177 and forms a 3-layer sandwich structure. The second domain is annotated as a conserved hypothetical protein of the mainly α class; it comprises residues 178-282 and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
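
The percentages and domain boundaries quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the residue counts are copied from the paragraph, and the names (`percent`, `domain_length`, `DOMAINS`) are made up for this example.

```python
# Residue counts quoted in the text for YlqF (PDB chain 1pujA).
TOTAL_RESIDUES = 282
HELIX_RESIDUES = 142   # 13 helices
STRAND_RESIDUES = 31   # 6 strands

# Domain boundaries quoted in the text (inclusive residue numbers).
DOMAINS = {"domain 1": (1, 177), "domain 2": (178, 282)}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def domain_length(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


helix_pct = percent(HELIX_RESIDUES, TOTAL_RESIDUES)
strand_pct = percent(STRAND_RESIDUES, TOTAL_RESIDUES)
lengths = {name: domain_length(*bounds) for name, bounds in DOMAINS.items()}

print(f"helix:  {helix_pct:.1f}%")
print(f"strand: {strand_pct:.1f}%")
for name, length in lengths.items():
    print(f"{name}: {length} residues")
```

With the counts as given, this comes to about 50.4% helix and 11.0% strand, and the two domains (177 and 105 residues) together account for all 282 residues.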
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 6.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 7.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in the table below. Pfam classification returned matches to a GTPase of unknown function, the Rhomboid family, and the catalytic domain of alpha amylase. <br />
<br />
{| border="1"<br />
|-<br />
| '''Class''' || Alpha and beta proteins<br />
|-<br />
| '''Fold''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Superfamily''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Family''' || G proteins<br />
|-<br />
| '''Domain''' || Probable GTPase YlqF<br />
|-<br />
| '''Species''' || ''Bacillus subtilis''<br />
|}<br />
<br />
Many of the structural neighbours returned by the Dali search are Ras-related proteins, which are small GTPases. This suggests that YlqF might also be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table 4.''' Dali search results of PDB/chain identifiers and structural alignment statistics<br />
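
One way to summarise a hit list like the one above is to parse the rows and tally how many descriptions carry a Ras-family name. The sketch below is illustrative only: the rows are copied from the table, while the keyword list (`RAS`, `RAB`, `RAP`, `RHO`, `RAL`) and the helper names are assumptions made for this example. Keyword matching on free-text descriptions is crude, so related proteins whose description lacks such a keyword are not counted.

```python
import re

# Rows copied from the Dali table: No, Chain, raw-score, Z-score, %id, lali, rmsd, Description.
DALI_ROWS = """\
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3
"""

# Hypothetical keyword list: a word starting with one of these marks a Ras-family name.
RAS_LIKE = re.compile(r"\b(RAS|RAB|RAP|RHO|RAL)")


def parse_rows(text):
    """Split each row into its seven leading columns plus the free-text description."""
    hits = []
    for line in text.splitlines():
        no, chain, raw, z, pct_id, lali, rmsd, desc = line.split(None, 7)
        hits.append({"chain": chain, "z": float(z), "rmsd": float(rmsd),
                     "description": desc.strip()})
    return hits


hits = [h for h in parse_rows(DALI_ROWS) if h["chain"] != "1pujA"]  # drop the query itself
ras_like = [h["chain"] for h in hits if RAS_LIKE.search(h["description"])]

print(f"{len(ras_like)} of {len(hits)} neighbours have a Ras-family keyword:")
print(", ".join(ras_like))
```

The same parsed list can be filtered on Z-score or rmsd if a stricter cut-off is wanted.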
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP-binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, which have diverse functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 is located in the human MHC class I region. It is highly homologous to MMR1, a putative GTP-binding protein from mouse. Together these proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures 8 & 9) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure 8 shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across all three domains of life. Figure 9 shows that the Walker A and Walker B motifs are also conserved, along with a conserved threonine between them. <br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
'''Figure 8.''' Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
'''Figure 9.''' From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
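
The consensus patterns named in the captions above can be located in the YlqF sequence with simple regular expressions. The sketch below is illustrative only: the sequence is the 282-residue sequence listed in the STRUCTURE section, the regular expressions are direct translations of the consensus strings with `x` read as any residue, and the names `MOTIFS` and `find_motif` are made up for this example.

```python
import re

# YlqF amino acid sequence (282 residues) as listed in the sequence figure.
YLQF = (
    "MTIQWFPGHMAKARREVTEKLKLIDIVYELVDARIPMSSRNPMIEDILKNKPRIMLLNKA"
    "DKADAAVTQQWKEHFENQGIRSLSINSVNGQGLNQIVPASKEILQEKFDRMRAKGVKPRA"
    "IRALIIGIPNVGKSTLINRLAKKNIAKTGDRPGITTSQQWVKVGKELELLDTPGILWPKF"
    "EDELVGLRLAVTGAIKDSIINLQDVAVFGLRFLEEHYPERLKERYGLDEIPEDIAELFDA"
    "IGEKRGCLMSGGLINYDKTTEVIIRDIRTEKFGRLSFEQPTM"
)

# Consensus strings from the captions, with "x" translated to "." (any residue).
MOTIFS = {
    "NKxD": r"NK.D",
    "Walker A (GxxxxGK[ST])": r"G.{4}GK[ST]",
    "Walker B (DxPG)": r"D.PG",
}


def find_motif(sequence, pattern):
    """Return (1-based start position, matched text) for every non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, sequence)]


results = {name: find_motif(YLQF, pattern) for name, pattern in MOTIFS.items()}

for name, matches in results.items():
    for start, text in matches:
        print(f"{name}: {text} at residue {start}")
```

A short consensus such as DxPG can match more than one site in a sequence, so regular-expression hits are best treated as candidates to be checked against the alignment.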
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure 10) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archaeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archaeo-eukaryotic clades (Leipe et al., 2002), with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
'''Figure 10.''' Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archaeo-eukaryotic group (red). Red stars denote branches with less than 95% confidence, as determined from 100 bootstrap replicates.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5223Results 52007-06-11T22:05:26Z<p>ScottAllen: /* PYLOGENY */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis, [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was used to perform a multiple sequence alignment on our protein sequence. The top two hits were both highly similar to the query (95.6% and 84.6% identity) and are annotated as YlqF proteins. Most of the remaining proteins in the list of 10 are annotated as GTPases or GTP-binding proteins. <br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
Table 1:<br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
Figure 1:<br />
<br />
<br />
Though the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog protein, it is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
Figure 2:<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
Table 2:<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of the ''Bacillus subtilis'' YlqF protein consists of a single polymer chain of 282 amino acids (Figures 3 & 4). It also contains two ligands: a magnesium ion and phosphoaminophosphonic acid-guanylate ester (Figure 5). The structure was refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 3 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% lie in the allowed regions.<br />
<br />
<br />
{| border="1"<br />
|-<br />
! Parameter !! Resolution [Å] !! R factor [%] !! Free R factor [%] !! Space group<br />
|-<br />
| Value || 2.00 || 21.6 || 25.0 || P 2<sub>1</sub>2<sub>1</sub>2<sub>1</sub><br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! Unit cell !! a / α !! b / β !! c / γ<br />
|-<br />
| Length [Å] || 36.75 || 68.57 || 105.57<br />
|-<br />
| Angle [°] || 90.00 || 90.00 || 90.00<br />
|}<br />
<br />
'''Table 3:''' Crystal parameters and refinement statistics<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 3.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 4.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
The secondary structure of YlqF is approximately 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 6). The YlqF protein consists of two domains (Figure 7). The first domain has a Rossmann fold of the α/β class; it comprises residues 1-177 and forms a 3-layer sandwich structure. The second domain is annotated as a conserved hypothetical protein of the mainly α class; it comprises residues 178-282 and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 6.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 7.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in the table below. Pfam classification returned matches to a GTPase of unknown function, the Rhomboid family, and the catalytic domain of alpha amylase. <br />
<br />
{| border="1"<br />
|-<br />
| '''Class''' || Alpha and beta proteins<br />
|-<br />
| '''Fold''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Superfamily''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Family''' || G proteins<br />
|-<br />
| '''Domain''' || Probable GTPase YlqF<br />
|-<br />
| '''Species''' || ''Bacillus subtilis''<br />
|}<br />
<br />
Many of the structural neighbours returned by the Dali search are Ras-related proteins, which are small GTPases. This suggests that YlqF might also be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table 4.''' Dali search results of PDB/chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP-binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, which have diverse functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 is located in the human MHC class I region. It is highly homologous to MMR1, a putative GTP-binding protein from mouse. Together these proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures 8 & 9) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure 8 shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across all three domains of life. Figure 9 shows that the Walker A and Walker B motifs are also conserved, along with a conserved threonine between them. <br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
Figure 8: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
Figure 9: From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure 10) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archaeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archaeo-eukaryotic clades (Leipe et al., 2002), with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
Figure 10: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archaeo-eukaryotic group (red). Red stars denote branches with less than 95% confidence, as determined from 100 bootstrap replicates.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5222Results 52007-06-11T22:03:53Z<p>ScottAllen: /* STRUCTURE */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis, [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was used to perform a multiple sequence alignment on our protein sequence. The top two hits were both highly similar to the query (95.6% and 84.6% identity) and are annotated as YlqF proteins. Most of the remaining proteins in the list of 10 are annotated as GTPases or GTP-binding proteins. <br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
Table 1:<br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
Figure 1:<br />
<br />
<br />
Though the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog protein, it is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
Figure 2:<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
Table 2:<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of the ''Bacillus subtilis'' YlqF protein consists of a single polymer chain of 282 amino acids (Figures 3 & 4). It also contains two ligands: a magnesium ion and phosphoaminophosphonic acid-guanylate ester (Figure 5). The structure was refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 3 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% lie in the allowed regions.<br />
<br />
<br />
{| border="1"<br />
|-<br />
! Parameter !! Resolution [Å] !! R factor [%] !! Free R factor [%] !! Space group<br />
|-<br />
| Value || 2.00 || 21.6 || 25.0 || P 2<sub>1</sub>2<sub>1</sub>2<sub>1</sub><br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! Unit cell !! a / α !! b / β !! c / γ<br />
|-<br />
| Length [Å] || 36.75 || 68.57 || 105.57<br />
|-<br />
| Angle [°] || 90.00 || 90.00 || 90.00<br />
|}<br />
<br />
'''Table 3:''' Crystal parameters and refinement statistics<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 3.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 4.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
The secondary structure of YlqF is approximately 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 6). The YlqF protein consists of two domains (Figure 7). The first domain has a Rossmann fold of the α/β class; it comprises residues 1-177 and forms a 3-layer sandwich structure. The second domain is annotated as a conserved hypothetical protein of the mainly α class; it comprises residues 178-282 and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 6.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 7.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in the table below. Pfam classification returned matches to a GTPase of unknown function, the Rhomboid family, and the catalytic domain of alpha amylase. <br />
<br />
{| border="1"<br />
|-<br />
| '''Class''' || Alpha and beta proteins<br />
|-<br />
| '''Fold''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Superfamily''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Family''' || G proteins<br />
|-<br />
| '''Domain''' || Probable GTPase YlqF<br />
|-<br />
| '''Species''' || ''Bacillus subtilis''<br />
|}<br />
<br />
Many of the structural neighbours returned by the Dali search are Ras-related proteins, which are small GTPases. This suggests that YlqF might also be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table 4.''' Dali search results of PDB/chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP-binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, which have diverse functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 is located in the human MHC class I region. It is highly homologous to MMR1, a putative GTP-binding protein from mouse. Together these proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures 8 & 9) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure 8 shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across all three domains of life. Figure 9 shows that the Walker A and Walker B motifs are also conserved, along with a conserved threonine between them. <br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
Figure 8: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
Figure 9: From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved threonine between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure 10) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archaeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archaeo-eukaryotic clades (Leipe et al., 2002), with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
Figure 10: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archaeo-eukaryotic group (red). Red stars denote branches with less than 95% confidence, as determined from 100 bootstrap replicates.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5221Results 52007-06-11T22:02:54Z<p>ScottAllen: /* STRUCTURE */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis, [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was used to perform a multiple sequence alignment on our protein sequence. The top two hits were both highly similar to the query (95.6% and 84.6% identity) and are annotated as YlqF proteins. Most of the remaining proteins in the list of 10 are annotated as GTPases or GTP-binding proteins. <br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
Table 1:<br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
Figure 1:<br />
<br />
<br />
Although the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog protein, it is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
Figure 2:<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis predicts that the molecular function of 1pujA is GTP binding (Bayesian score 0.9815) and that its biological process is small GTPase mediated signal transduction (Bayesian score 1.0000).<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
Table 2:<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of the ''Bacillus subtilis'' YlqF protein contains a polypeptide chain of 282 amino acids (Figures 1 & 2). It also contains two ligands: a magnesium ion and phosphoaminophosphonic acid-guanylate ester (Figure 3). The structure was refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 3 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in favoured regions and 98.8% lie in allowed regions.<br />
<br />
<br />
{| border="1"<br />
|-<br />
| '''Parameters''' || Resolution[Å] <br />
2.00 <br />
|R factor, %<br />
21.6<br />
|Free R factor, %<br />
25.0<br />
|Space Group<br />
P 21 21 21 <br />
|-<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! '''Unit Cell'''<br />
|Length[Å] <br />
Angles [°]<br />
|a<br />
alpha<br />
|36.75<br />
90.00<br />
|b<br />
beta<br />
|68.57<br />
90.00 <br />
|c <br />
gamma <br />
|105.57 <br />
90.00 <br />
|}<br />
<br />
'''Table 3:''' Crystal parameters and refinement statistics<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 1.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 2.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 3.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
YlqF is approximately 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 4). The protein consists of two domains (Figure 5). The first domain, comprising residues 1-177, has a Rossmann fold of the α/β class and forms a 3-layer sandwich structure. The second domain, comprising residues 178-282, is described as a conserved hypothetical protein of the mainly α class and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
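<br />
The secondary-structure percentages quoted above follow directly from the residue counts. A minimal sketch of the arithmetic, using only the numbers given in the text (the variable and function names are illustrative):<br />

```python
# Residue counts quoted in the text for YlqF.
TOTAL_RESIDUES = 282
HELIX_RESIDUES = 142   # 13 helices
STRAND_RESIDUES = 31   # 6 strands


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


helix_pct = percent(HELIX_RESIDUES, TOTAL_RESIDUES)
strand_pct = percent(STRAND_RESIDUES, TOTAL_RESIDUES)

if __name__ == "__main__":
    print(f"helix:  {helix_pct}%")
    print(f"strand: {strand_pct}%")
```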
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 4.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. The Pfam classification returned matches to a GTPase of unknown function, the Rhomboid family, and the catalytic domain of alpha amylase.<br />
<br />
{| border="1"<br />
|-<br />
|'''Class'''<br />
Alpha and beta proteins <br />
|'''Fold'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases<br />
|'''Superfamily'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases <br />
|'''Family''' <br />
G proteins<br />
|'''Domain''' <br />
Probable GTPase YlqF<br />
|'''Species''' <br />
Bacillus subtilis <br />
|-<br />
|}<br />
<br />
Most of the structural neighbours returned by the Dali search are Ras-related proteins, which are small GTPases. This suggests that YlqF may also be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table .''' Dali search results of PDB/chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP-binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, whose members perform a variety of functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region. It is highly homologous to MMR1, a putative GTP-binding protein from mouse. Together these proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures ? & ?) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across all three domains of life. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them.<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
Figure ?: From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
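<br />
The consensus motifs named above can be located in the YlqF sequence (Figure 1 of the structure section) with regular expressions. The following is a minimal sketch rather than the method used to produce the alignment: the sequence is copied from this page, each "x" in a consensus is written as the regex wildcard ".", and the <code>find_motifs</code> helper is a name introduced here for illustration.<br />

```python
import re

# YlqF (1PUJ chain A) sequence as listed on this page, spaces removed.
YLQF = (
    "MTIQWFPGHMAKARREVTEKLKLIDIVYELVDARIPMSSRNPMIEDILKNKPRIMLLNKA"
    "DKADAAVTQQWKEHFENQGIRSLSINSVNGQGLNQIVPASKEILQEKFDRMRAKGVKPRA"
    "IRALIIGIPNVGKSTLINRLAKKNIAKTGDRPGITTSQQWVKVGKELELLDTPGILWPKF"
    "EDELVGLRLAVTGAIKDSIINLQDVAVFGLRFLEEHYPERLKERYGLDEIPEDIAELFDA"
    "IGEKRGCLMSGGLINYDKTTEVIIRDIRTEKFGRLSFEQPTM"
)

# Consensus motifs from the text; "x" (any residue) becomes ".".
MOTIFS = {
    "NKxD": r"NK.D",
    "Walker A": r"G....GK[ST]",
    "Walker B": r"D.PG",
}


def find_motifs(sequence, motifs=MOTIFS):
    """Return {motif name: [(1-based start, matched residues), ...]}."""
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead is used so that overlapping matches are reported too.
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]
    return hits


if __name__ == "__main__":
    for name, found in find_motifs(YLQF).items():
        print(name, found)
```

A consensus as short as DxPG can match at more than one position, so hits from a pattern search like this still need to be checked against the alignment before being assigned to a motif.<br />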
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure ?) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archeo-eukaryotic clades (Leipe et al., 2002) with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archeo-eukaryotic group (red). Red stars denote less than 95% confidence in a branch, as determined via 100 bootstraps.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5220Results 52007-06-11T22:02:26Z<p>ScottAllen: /* '''Quality of YlqF protein model and overall structure''' */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis, [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] sequences were used to build a multiple sequence alignment with our protein sequence. The top two hits share the highest sequence identity with the query (95.6% and 84.6%) and are both annotated as YlqF proteins. Of the remaining eight hits, one is another YlqF protein, two are hypothetical proteins, and five are annotated as GTPases or GTP-binding proteins.<br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
Table 1:<br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
Figure 1:<br />
<br />
<br />
Although the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog protein, it is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
Figure 2:<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis predicts that the molecular function of 1pujA is GTP binding (Bayesian score 0.9815) and that its biological process is small GTPase mediated signal transduction (Bayesian score 1.0000).<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
Table 2:<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of the ''Bacillus subtilis'' YlqF protein contains a polypeptide chain of 282 amino acids (Figures 1 & 2). It also contains two ligands: a magnesium ion and phosphoaminophosphonic acid-guanylate ester (Figure 3). The structure was refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 1 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in favoured regions and 98.8% lie in allowed regions.<br />
<br />
<br />
{| border="1"<br />
|-<br />
| '''Parameters''' || Resolution[Å] <br />
2.00 <br />
|R factor, %<br />
21.6<br />
|Free R factor, %<br />
25.0<br />
|Space Group<br />
P 21 21 21 <br />
|-<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! '''Unit Cell'''<br />
|Length[Å] <br />
Angles [°]<br />
|a<br />
alpha<br />
|36.75<br />
90.00<br />
|b<br />
beta<br />
|68.57<br />
90.00 <br />
|c <br />
gamma <br />
|105.57 <br />
90.00 <br />
|}<br />
<br />
'''Table 1.''' Crystal parameters and refinement statistics<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 1.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 2.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 3.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
YlqF is approximately 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 4). The protein consists of two domains (Figure 5). The first domain, comprising residues 1-177, has a Rossmann fold of the α/β class and forms a 3-layer sandwich structure. The second domain, comprising residues 178-282, is described as a conserved hypothetical protein of the mainly α class and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 4.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. The Pfam classification returned matches to a GTPase of unknown function, the Rhomboid family, and the catalytic domain of alpha amylase.<br />
<br />
{| border="1"<br />
|-<br />
|'''Class'''<br />
Alpha and beta proteins <br />
|'''Fold'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases<br />
|'''Superfamily'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases <br />
|'''Family''' <br />
G proteins<br />
|'''Domain''' <br />
Probable GTPase YlqF<br />
|'''Species''' <br />
Bacillus subtilis <br />
|-<br />
|}<br />
<br />
Most of the structural neighbours returned by the Dali search are Ras-related proteins, which are small GTPases. This suggests that YlqF may also be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table .''' Dali search results of PDB/chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP-binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, whose members perform a variety of functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region. It is highly homologous to MMR1, a putative GTP-binding protein from mouse. Together these proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures ? & ?) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across all three domains of life. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them.<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
Figure ?: From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure ?) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archeo-eukaryotic clades (Leipe et al., 2002) with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archeo-eukaryotic group (red). Red stars denote less than 95% confidence in a branch, as determined via 100 bootstraps.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5219Results 52007-06-11T22:01:06Z<p>ScottAllen: /* ProKnow */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis, [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] sequences were used to build a multiple sequence alignment with our protein sequence. The top two hits share the highest sequence identity with the query (95.6% and 84.6%) and are both annotated as YlqF proteins. Of the remaining eight hits, one is another YlqF protein, two are hypothetical proteins, and five are annotated as GTPases or GTP-binding proteins.<br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
Table 1:<br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
Figure 1:<br />
<br />
<br />
Although the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog protein, it is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
Figure 2:<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis predicts that the molecular function of 1pujA is GTP binding (Bayesian score 0.9815) and that its biological process is small GTPase mediated signal transduction (Bayesian score 1.0000).<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
Table 2:<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of the ''Bacillus subtilis'' YlqF protein contains a polypeptide chain of 282 amino acids (Figures 1 & 2). It also contains two ligands: a magnesium ion and phosphoaminophosphonic acid-guanylate ester (Figure 3). The structure was refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 1 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in favoured regions and 98.8% lie in allowed regions.<br />
<br />
<br />
<br />
'''Table 1.''' Crystal parameters and refinement statistics<br />
<br />
{| border="1"<br />
|-<br />
| '''Parameters''' || Resolution[Å] <br />
2.00 <br />
|R factor, %<br />
21.6<br />
|Free R factor, %<br />
25.0<br />
|Space Group<br />
P 21 21 21 <br />
|-<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! '''Unit Cell'''<br />
|Length[Å] <br />
Angles [°]<br />
|a<br />
alpha<br />
|36.75<br />
90.00<br />
|b<br />
beta<br />
|68.57<br />
90.00 <br />
|c <br />
gamma <br />
|105.57 <br />
90.00 <br />
|}<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 1.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 2.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 3.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
YlqF is approximately 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 4). The protein consists of two domains (Figure 5). The first domain, comprising residues 1-177, has a Rossmann fold of the α/β class and forms a 3-layer sandwich structure. The second domain, comprising residues 178-282, is described as a conserved hypothetical protein of the mainly α class and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 4.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. The Pfam classification returned matches to a GTPase of unknown function, the Rhomboid family, and the catalytic domain of alpha amylase.<br />
<br />
{| border="1"<br />
|-<br />
|'''Class'''<br />
Alpha and beta proteins <br />
|'''Fold'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases<br />
|'''Superfamily'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases <br />
|'''Family''' <br />
G proteins<br />
|'''Domain''' <br />
Probable GTPase YlqF<br />
|'''Species''' <br />
Bacillus subtilis <br />
|-<br />
|}<br />
<br />
Most of the structural neighbours returned by the Dali search are Ras-related proteins, which are small GTPases. This suggests that YlqF may also be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table .''' Dali search results of PDB/chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP-binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, whose members perform a variety of functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region. It is highly homologous to MMR1, a putative GTP-binding protein from mouse. Together these proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures ? & ?) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across all three domains of life. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them.<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
Figure ?: From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure ?) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archeo-eukaryotic clades (Leipe et al., 2002) with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archeo-eukaryotic group (red). Red stars denote less than 95% confidence in a branch, as determined via 100 bootstraps.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5218Results 52007-06-11T22:00:38Z<p>ScottAllen: /* UniProt */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis, [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] sequences were used to build a multiple sequence alignment with our protein sequence. The top two hits share the highest sequence identity with the query (95.6% and 84.6%) and are both annotated as YlqF proteins. Of the remaining eight hits, one is another YlqF protein, two are hypothetical proteins, and five are annotated as GTPases or GTP-binding proteins.<br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
Table 1:<br />
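The listing above is not strictly ordered by identity (Q9Z9S1 at 59.9% appears before Q4MFY1 at 65.9%). As a minimal sketch, the rows can be re-ranked by percent identity; the values are copied from the listing, while the names `HITS` and `rank_by_identity` are only illustrative.

```python
# Rows copied from the alignment summary above: (accession, % identity, name).
HITS = [
    ("O31743", 95.6, "YlqF protein"),
    ("Q65JP4", 84.6, "YlqF (GTP-binding domain protein)"),
    ("Q845L2", 70.1, "Hypothetical protein"),
    ("Q9Z9S1", 59.9, "YlqF (BH2476 protein)"),
    ("Q4MFY1", 65.9, "Hypothetical protein"),
    ("Q5WFP0", 53.5, "GTPase"),
    ("Q5HPU7", 50.9, "GTP-binding protein, putative"),
    ("Q3XZB6", 52.4, "GTP-binding protein"),
    ("A0Q0X8", 49.3, "GTP-binding protein, putative"),
    ("Q03KX8", 47.6, "Predicted GTPase"),
]


def rank_by_identity(hits):
    """Return the hits sorted from highest to lowest percent identity."""
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


ranked = rank_by_identity(HITS)
for accession, identity, name in ranked:
    print(f"{accession}  {identity:5.1f}  {name}")
```

Sorting this way makes it easier to see where the annotated YlqF sequences fall relative to the generic GTP-binding annotations.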
<br />
==SymAtlas==<br />
<br />
SymAtlas is a web resource for examining the expression levels of a gene in different tissues. Using this database we have explored the expression of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
Figure 1:<br />
<br />
<br />
Although the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog, the protein is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
Figure 2:<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
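The ProKnow scores above can be checked with a few lines of Python. This is a sketch only: the scores are copied from the listing, and reading them as a probability distribution over candidate terms is an assumption suggested by the fact that the four molecular-function scores sum to 1.

```python
# Molecular-function rows from the ProKnow listing: GO code -> (Bayesian score, description).
MOLECULAR_FUNCTION = {
    "0005525": (0.9815, "GTP binding"),
    "0000166": (0.0101, "nucleotide binding"),
    "0008168": (0.0068, "methyltransferase activity"),
    "0003924": (0.0016, "GTPase activity"),
}

# Sum of all molecular-function scores (expected to be close to 1).
total = sum(score for score, _ in MOLECULAR_FUNCTION.values())

# Term with the highest Bayesian score.
best_go, (best_score, best_name) = max(
    MOLECULAR_FUNCTION.items(), key=lambda item: item[1][0]
)

print(f"sum of scores = {total:.4f}")
print(f"top term: GO:{best_go} {best_name} (score {best_score})")
```

Note that the listing supports GTP binding strongly but gives GTPase activity itself a score of only 0.0016, so ProKnow on its own argues for nucleotide binding rather than hydrolysis.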
<br />
<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of ''Bacillus subtilis'' YlqF consists of a polypeptide chain of 282 amino acids (Figures 1 & 2). It also contains two ligands: a magnesium ion and phosphoaminophosphonic acid-guanylate ester, a non-hydrolysable GTP analogue (Figure 3). The structure has been refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 1 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% lie in the allowed regions.<br />
<br />
<br />
<br />
'''Table 1.''' Crystal parameters and refinement statistics<br />
<br />
{| border="1"<br />
|-<br />
! Parameter !! Value<br />
|-<br />
| Resolution [Å] || 2.00<br />
|-<br />
| R factor [%] || 21.6<br />
|-<br />
| Free R factor [%] || 25.0<br />
|-<br />
| Space group || P 21 21 21<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! Unit cell axis !! Length [Å] !! Unit cell angle !! Angle [°]<br />
|-<br />
| a || 36.75 || alpha || 90.00<br />
|-<br />
| b || 68.57 || beta || 90.00<br />
|-<br />
| c || 105.57 || gamma || 90.00<br />
|}<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 1.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 2.''' Structure of YlqF showing helices, sheets and loops. Image generated with PyMOL (DeLano, 2002).<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 3.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
About 50% of YlqF is helical (13 helices; 142 residues) and about 10% is beta sheet (6 strands; 31 residues) (see Figure 4). The protein consists of two domains (Figure 5). The first domain, comprising residues 1-177, has an α/β-class Rossmann fold and forms a 3-layer sandwich structure. The second domain, comprising residues 178-282, is annotated as a conserved hypothetical protein; it is mainly α class and forms an orthogonal bundle. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
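The percentages and domain sizes quoted above follow from simple arithmetic on the residue counts. The sketch below uses only the numbers given in the text (282 residues in total, 142 helical, 31 in strands, domain boundaries 1-177 and 178-282); the variable and domain names are illustrative.

```python
TOTAL_RESIDUES = 282   # chain length reported for YlqF
HELIX_RESIDUES = 142   # residues in the 13 helices
STRAND_RESIDUES = 31   # residues in the 6 beta strands

# Domain boundaries (first residue, last residue), inclusive.
DOMAINS = {
    "Rossmann-fold domain": (1, 177),
    "alpha domain": (178, 282),
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


helix_pct = percent(HELIX_RESIDUES, TOTAL_RESIDUES)
strand_pct = percent(STRAND_RESIDUES, TOTAL_RESIDUES)
domain_lengths = {name: end - start + 1 for name, (start, end) in DOMAINS.items()}

print(f"helix:  {helix_pct:.1f}%")
print(f"strand: {strand_pct:.1f}%")
for name, length in domain_lengths.items():
    print(f"{name}: {length} residues")
```

On these counts the strand content is slightly nearer 11% than 10%, so the quoted figure is presumably rounded down or computed on a slightly different residue total.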
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 3-10 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix.<br />
<br />
Greyed-out residues have no structural information.<br />
<br />
'''Figure 4.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing the two domains. Image generated with PyMOL (DeLano, 2002).<br />
<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. Pfam classification returned three matches: GTPase of unknown function, Rhomboid family, and the catalytic domain of alpha amylase.<br />
<br />
{| border="1"<br />
|-<br />
! Level !! Classification<br />
|-<br />
| '''Class''' || Alpha and beta proteins<br />
|-<br />
| '''Fold''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Superfamily''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Family''' || G proteins<br />
|-<br />
| '''Domain''' || Probable GTPase YlqF<br />
|-<br />
| '''Species''' || ''Bacillus subtilis''<br />
|}<br />
<br />
'''Table 2.''' SCOP classification of YlqF<br />
<br />
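Apart from the self-match, about half of the Dali hits are Ras-superfamily small GTPases (Ral, Rab, Rap, Rho and Arf-like proteins), and several others are GTP-binding proteins (Obg, eIF2γ, elongation factor 1-α and EngA). This suggests that YlqF may itself be a GTPase.<br />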
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table 3.''' Dali search results: PDB chain identifiers and structural alignment statistics<br />
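The share of Ras-like hits in the Dali listing can be tallied with a short script. This is a rough sketch: the rows are copied from the listing, but the keyword list used to flag a description as Ras-superfamily is our own assumption and not part of the Dali output.

```python
# (PDB chain, Z-score, description) for hits 2-20 of the Dali listing;
# hit 1 is the query structure matching itself and is left out.
DALI_HITS = [
    ("1udxA", 6.2, "THE GTP-BINDING PROTEIN OBG"),
    ("1uadA", 6.1, "RAS-RELATED PROTEIN RAL-A"),
    ("1zbdA", 6.1, "RAB-3A"),
    ("1g17A", 5.9, "RAS-RELATED PROTEIN SEC4"),
    ("1c1yA", 5.8, "RAS-RELATED PROTEIN RAP-1A"),
    ("1kjzA", 5.7, "EIF2GAMMA"),
    ("1oixA", 5.4, "RAS-RELATED PROTEIN RAB-11A"),
    ("1ukvY", 5.3, "SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR"),
    ("1ijfA", 5.3, "ELONGATION FACTOR 1-ALPHA"),
    ("1gpmA", 5.2, "GMP SYNTHETASE"),
    ("1hrkA", 5.1, "FERROCHELATASE"),
    ("1d5cA", 5.1, "RAB6 GTPASE"),
    ("1ni5A", 5.0, "PUTATIVE CELL CYCLE PROTEIN MESJ"),
    ("3rapS", 5.0, "G PROTEIN RAP2A"),
    ("1gwnA", 5.0, "RHO-RELATED GTP-BINDING PROTEIN RHOE"),
    ("1vg8A", 5.0, "RAS-RELATED PROTEIN RAB-7"),
    ("1cqxA", 4.9, "FLAVOHEMOPROTEIN"),
    ("1mkyA", 4.7, "PROBABLE GTP-BINDING PROTEIN ENGA"),
    ("1fzqA", 4.7, "ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3"),
]

# Assumed keywords marking a description as a Ras-superfamily small GTPase.
RAS_LIKE_KEYWORDS = ("RAS-RELATED", "RAB", "RAP", "RHO-RELATED", "ADP-RIBOSYLATION FACTOR")


def is_ras_like(description):
    """Return True if the description contains any Ras-superfamily keyword."""
    return any(keyword in description for keyword in RAS_LIKE_KEYWORDS)


ras_like = [chain for chain, _, description in DALI_HITS if is_ras_like(description)]
print(f"{len(ras_like)} of {len(DALI_HITS)} non-self hits flagged as Ras-like")
```

Keyword matching on free-text descriptions is crude, so the count should be read as an estimate rather than a formal classification.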
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This GTP-binding domain is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, whose members have diverse functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region and is highly homologous to MMR1, a putative GTP-binding protein from mouse. Together these proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures ? & ?) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across all three domains of life. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them.<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
Figure ?: From left to right (N-terminal to C-terminal): the Walker A motif (consensus GxxxxGK[ST]), the conserved threonine between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
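The motif consensuses quoted in the captions can be located in the YlqF sequence (Figure 1 of the Structure section) with regular expressions. This is a minimal sketch: the sequence and consensus strings come from this report, while the regex translations (`x` becomes `.`) and the helper name `find_motif` are our own.

```python
import re

# YlqF sequence as listed in Figure 1 of the Structure section (282 residues).
SEQ = (
    "MTIQWFPGHMAKARREVTEKLKLIDIVYELVDARIPMSSRNPMIEDILKNKPRIMLLNKA"
    "DKADAAVTQQWKEHFENQGIRSLSINSVNGQGLNQIVPASKEILQEKFDRMRAKGVKPRA"
    "IRALIIGIPNVGKSTLINRLAKKNIAKTGDRPGITTSQQWVKVGKELELLDTPGILWPKF"
    "EDELVGLRLAVTGAIKDSIINLQDVAVFGLRFLEEHYPERLKERYGLDEIPEDIAELFDA"
    "IGEKRGCLMSGGLINYDKTTEVIIRDIRTEKFGRLSFEQPTM"
)

# Consensus patterns from the figure captions, written as regular expressions.
MOTIFS = {
    "NKxD": r"NK.D",
    "Walker A": r"G.{4}GK[ST]",
    "Walker B": r"D.PG",
}


def find_motif(pattern, sequence):
    """Return (1-based start position, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, sequence)]


hits = {name: find_motif(pattern, SEQ) for name, pattern in MOTIFS.items()}
for name, matches in hits.items():
    print(name, matches)
```

In this sequence the NKxD match (position 58) lies N-terminal to the Walker A match (position 127), in line with the text above. The short DxPG consensus matches at two positions (150 and 171), so a regular expression alone does not say which one is the Walker B motif shown in the alignment figure.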
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure ?) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archaeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archaeo-eukaryotic clades (Leipe et al., 2002) with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archaeo-eukaryotic group (red). Red stars denote branches with less than 95% confidence, as determined from 100 bootstrap replicates.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5217Results 52007-06-11T22:00:17Z<p>ScottAllen: /* SymAtlas */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis, [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was searched to build a multiple sequence alignment against our protein sequence. The top two hits (95.6% and 84.6% identity) are both annotated as YlqF proteins. Of the remaining eight sequences, one is a further YlqF protein, two are hypothetical proteins, and five are annotated as GTPases or GTP-binding proteins.<br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a web resource for examining the expression levels of a gene in different tissues. Using this database we have explored the expression of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
Figure 1:<br />
<br />
<br />
Although the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog, the protein is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
Figure 2:<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of ''Bacillus subtilis'' YlqF consists of a polypeptide chain of 282 amino acids (Figures 1 & 2). It also contains two ligands: a magnesium ion and phosphoaminophosphonic acid-guanylate ester, a non-hydrolysable GTP analogue (Figure 3). The structure has been refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 1 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% lie in the allowed regions.<br />
<br />
<br />
<br />
'''Table 1.''' Crystal parameters and refinement statistics<br />
<br />
{| border="1"<br />
|-<br />
! Parameter !! Value<br />
|-<br />
| Resolution [Å] || 2.00<br />
|-<br />
| R factor [%] || 21.6<br />
|-<br />
| Free R factor [%] || 25.0<br />
|-<br />
| Space group || P 21 21 21<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! Unit cell axis !! Length [Å] !! Unit cell angle !! Angle [°]<br />
|-<br />
| a || 36.75 || alpha || 90.00<br />
|-<br />
| b || 68.57 || beta || 90.00<br />
|-<br />
| c || 105.57 || gamma || 90.00<br />
|}<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 1.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 2.''' Structure of YlqF showing helices, sheets and loops. Image generated with PyMOL (DeLano, 2002).<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 3.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
About 50% of YlqF is helical (13 helices; 142 residues) and about 10% is beta sheet (6 strands; 31 residues) (see Figure 4). The protein consists of two domains (Figure 5). The first domain, comprising residues 1-177, has an α/β-class Rossmann fold and forms a 3-layer sandwich structure. The second domain, comprising residues 178-282, is annotated as a conserved hypothetical protein; it is mainly α class and forms an orthogonal bundle. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 3-10 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix.<br />
<br />
Greyed-out residues have no structural information.<br />
<br />
'''Figure 4.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing the two domains. Image generated with PyMOL (DeLano, 2002).<br />
<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. Pfam classification returned three matches: GTPase of unknown function, Rhomboid family, and the catalytic domain of alpha amylase.<br />
<br />
{| border="1"<br />
|-<br />
! Level !! Classification<br />
|-<br />
| '''Class''' || Alpha and beta proteins<br />
|-<br />
| '''Fold''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Superfamily''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Family''' || G proteins<br />
|-<br />
| '''Domain''' || Probable GTPase YlqF<br />
|-<br />
| '''Species''' || ''Bacillus subtilis''<br />
|}<br />
<br />
'''Table 2.''' SCOP classification of YlqF<br />
<br />
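Apart from the self-match, about half of the Dali hits are Ras-superfamily small GTPases (Ral, Rab, Rap, Rho and Arf-like proteins), and several others are GTP-binding proteins (Obg, eIF2γ, elongation factor 1-α and EngA). This suggests that YlqF may itself be a GTPase.<br />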
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table 3.''' Dali search results: PDB chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This GTP-binding domain is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, whose members have diverse functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region and is highly homologous to MMR1, a putative GTP-binding protein from mouse. Together these proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures ? & ?) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across all three domains of life. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them.<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
Figure ?: From left to right (N-terminal to C-terminal): the Walker A motif (consensus GxxxxGK[ST]), the conserved threonine between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure ?) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archaeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archaeo-eukaryotic clades (Leipe et al., 2002) with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archaeo-eukaryotic group (red). Red stars denote branches with less than 95% confidence, as determined from 100 bootstrap replicates.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5216Results 52007-06-11T21:59:57Z<p>ScottAllen: /* SymAtlas */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis, [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was searched to build a multiple sequence alignment against our protein sequence. The top two hits (95.6% and 84.6% identity) are both annotated as YlqF proteins. Of the remaining eight sequences, one is a further YlqF protein, two are hypothetical proteins, and five are annotated as GTPases or GTP-binding proteins.<br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
==SymAtlas==<br />
<br />
SymAtlas is a web resource for examining the expression levels of a gene in different tissues. Using this database we have explored the expression of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
Figure 1:<br />
<br />
Although the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog, the protein is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
Figure 2:<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of ''Bacillus subtilis'' YlqF consists of a polypeptide chain of 282 amino acids (Figures 1 & 2). It also contains two ligands: a magnesium ion and phosphoaminophosphonic acid-guanylate ester, a non-hydrolysable GTP analogue (Figure 3). The structure has been refined at 2.00 Å resolution to a crystallographic R factor of 21.6% and a free R factor of 25.0%. Table 1 summarizes the crystal parameters and refinement statistics. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% lie in the allowed regions.<br />
<br />
<br />
<br />
'''Table 1.''' Crystal parameters and refinement statistics<br />
<br />
{| border="1"<br />
|-<br />
! Parameter !! Value<br />
|-<br />
| Resolution [Å] || 2.00<br />
|-<br />
| R factor [%] || 21.6<br />
|-<br />
| Free R factor [%] || 25.0<br />
|-<br />
| Space group || P 21 21 21<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! Unit cell axis !! Length [Å] !! Unit cell angle !! Angle [°]<br />
|-<br />
| a || 36.75 || alpha || 90.00<br />
|-<br />
| b || 68.57 || beta || 90.00<br />
|-<br />
| c || 105.57 || gamma || 90.00<br />
|}<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 1.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 2.''' Structure of YlqF showing helices, sheets and loops. Image generated with PyMOL (DeLano, 2002).<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 3.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
About 50% of YlqF is helical (13 helices; 142 residues) and about 10% is beta sheet (6 strands; 31 residues) (see Figure 4). The protein consists of two domains (Figure 5). The first domain, comprising residues 1-177, has an α/β-class Rossmann fold and forms a 3-layer sandwich structure. The second domain, comprising residues 178-282, is annotated as a conserved hypothetical protein; it is mainly α class and forms an orthogonal bundle. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 3-10 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix.<br />
<br />
Greyed-out residues have no structural information.<br />
<br />
'''Figure 4.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing the two domains. Image generated with PyMOL (DeLano, 2002).<br />
<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. Pfam classification returned three matches: GTPase of unknown function, Rhomboid family, and the catalytic domain of alpha amylase.<br />
<br />
{| border="1"<br />
|-<br />
! Level !! Classification<br />
|-<br />
| '''Class''' || Alpha and beta proteins<br />
|-<br />
| '''Fold''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Superfamily''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Family''' || G proteins<br />
|-<br />
| '''Domain''' || Probable GTPase YlqF<br />
|-<br />
| '''Species''' || ''Bacillus subtilis''<br />
|}<br />
<br />
'''Table 2.''' SCOP classification of YlqF<br />
<br />
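Apart from the self-match, about half of the Dali hits are Ras-superfamily small GTPases (Ral, Rab, Rap, Rho and Arf-like proteins), and several others are GTP-binding proteins (Obg, eIF2γ, elongation factor 1-α and EngA). This suggests that YlqF may itself be a GTPase.<br />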
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table 3.''' Dali search results: PDB chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This GTP-binding domain is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, whose members have diverse functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region and is highly homologous to MMR1, a putative GTP-binding protein from mouse. Together these proteins represent a new subfamily of GTP-binding proteins with both prokaryotic and eukaryotic members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures ? & ?) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across all three domains of life. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them.<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
Figure ?: From left to right (N-terminal to C-terminal): the Walker A motif (consensus GxxxxGK[ST]), the conserved threonine between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure ?) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archaeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archaeo-eukaryotic clades (Leipe et al., 2002) with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archaeo-eukaryotic group (red). Red stars denote branches with less than 95% confidence, as determined from 100 bootstrap replicates.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5215Results 52007-06-11T21:59:41Z<p>ScottAllen: /* SymAtlas */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis, [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was searched to build a multiple sequence alignment against our protein sequence. The top two hits (95.6% and 84.6% identity) are both annotated as YlqF proteins. Of the remaining eight sequences, one is a further YlqF protein, two are hypothetical proteins, and five are annotated as GTPases or GTP-binding proteins.<br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
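The identity column in the listing above can be illustrated with a small calculation. The following Python sketch computes percent identity between two rows of an alignment. The function name `percent_identity` and the decision to skip any column containing a gap are assumptions made for illustration; ProFunc may define identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither row has a gap.

    Assumption: gaps are written as '-' and gapped columns are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns with a residue in both rows
    matches = 0   # compared columns with identical residues
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy rows, not taken from the ProFunc alignment.
    print(round(percent_identity("MTIQ-WF", "MTLQAWF"), 1))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values, so identities from different tools are not always directly comparable.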
<br />
==SymAtlas==<br />
<br />
SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
Figure 1:<br />
<br />
Though the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog, the protein is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
Figure 2:<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of Bacillus subtilis YlqF protein consists of a polymer containing 282 amino acids (Figures 1 & 2). It also contains ligands which are a magnesium ion and a Phosphoaminophosphonic acid-guanylate ester (Figure 3). The protein has been refined at 2 angstroms to a crystallographic R factor of 21.6% and free R factor of 25%. Table 1 summarizes the refinement statistics including protein quality parameters. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% of all residues lie in the allowed regions.<br />
<br />
<br />
<br />
'''Table 1.''' Crystal parameters and refinement statistics<br />
<br />
{| border="1"<br />
|-<br />
| '''Parameters''' || Resolution[Å] <br />
2.00 <br />
|R factor, %<br />
21.6<br />
|Free R factor, %<br />
25.0<br />
|Space Group<br />
P 21 21 21 <br />
|-<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! '''Unit Cell'''<br />
|Length[Å] <br />
Angles [°]<br />
|a<br />
alpha<br />
|36.75<br />
90.00<br />
|b<br />
beta<br />
|68.57<br />
90.00 <br />
|c <br />
gamma <br />
|105.57 <br />
90.00 <br />
|}<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 1.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 2.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 3.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
The secondary structure of YlqF is approximately 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 4). YlqF consists of two domains (Figure 5). The first domain, residues 1-177, has a Rossmann fold of the α/β class and forms a 3-layer sandwich structure. The second domain, residues 178-282, is annotated as a conserved hypothetical protein of the mainly α class and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
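The secondary structure percentages can be checked from the residue counts quoted above. This is a minimal Python sketch that assumes the percentages are taken relative to the full 282-residue chain; the constant and function names are illustrative only, and the tool that produced the original figures may have rounded differently or used another denominator.

```python
CHAIN_LENGTH = 282      # residues in the YlqF chain
HELIX_RESIDUES = 142    # residues in the 13 helices
STRAND_RESIDUES = 31    # residues in the 6 beta strands


def percent_of_chain(residues: int, chain_length: int = CHAIN_LENGTH) -> float:
    """Return the share of the chain covered by `residues`, in percent."""
    return 100.0 * residues / chain_length


if __name__ == "__main__":
    print(f"helix:  {percent_of_chain(HELIX_RESIDUES):.1f}%")
    print(f"strand: {percent_of_chain(STRAND_RESIDUES):.1f}%")
```

Under this assumption the helical share is about 50% and the strand share about 11%, close to the values reported in the text.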
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 4.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classified YlqF as shown in Table 2. The Pfam classification described YlqF as a GTPase of unknown function, Rhomboid family, and catalytic domain of alpha amylase.<br />
<br />
{| border="1"<br />
|-<br />
|'''Class'''<br />
Alpha and beta proteins <br />
|'''Fold'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases<br />
|'''Superfamily'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases <br />
|'''Family''' <br />
G proteins<br />
|'''Domain''' <br />
Probable GTPase YlqF<br />
|'''Species''' <br />
Bacillus subtilis <br />
|-<br />
|}<br />
<br />
Most of the top Dali hits are Ras-related proteins, which are small GTPases. This suggests that YlqF might be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table .''' Dali search results of PDB/chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP-binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, which have a range of different functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region. It is highly homologous to a putative GTP-binding protein, MMR1, from mouse. These proteins represent a new subfamily of GTP-binding proteins with both prokaryote and eukaryote members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures ? & ?) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal of the Walker A motif and is conserved across the three kingdoms. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them.<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
Figure ?: From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
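The motif positions discussed above can be located directly in the YlqF sequence given in Figure 1. The following Python sketch translates the consensus patterns quoted in the captions (NKxD, GxxxxGK[ST], DxPG) into regular expressions. The names `YLQF`, `MOTIFS` and `find_motifs` are introduced here for illustration, and a simple pattern search is only a rough stand-in for inspecting the alignment itself.

```python
import re

# YlqF (1pujA) sequence as listed in Figure 1, spaces and numbering removed.
YLQF = (
    "MTIQWFPGHMAKARREVTEKLKLIDIVYELVDARIPMSSRNPMIEDILKNKPRIMLLNKA"
    "DKADAAVTQQWKEHFENQGIRSLSINSVNGQGLNQIVPASKEILQEKFDRMRAKGVKPRA"
    "IRALIIGIPNVGKSTLINRLAKKNIAKTGDRPGITTSQQWVKVGKELELLDTPGILWPKF"
    "EDELVGLRLAVTGAIKDSIINLQDVAVFGLRFLEEHYPERLKERYGLDEIPEDIAELFDA"
    "IGEKRGCLMSGGLINYDKTTEVIIRDIRTEKFGRLSFEQPTM"
)

# Consensus patterns from the figure captions; 'x' becomes '.' (any residue).
MOTIFS = {
    "NKxD": r"NK.D",
    "Walker A": r"G.{4}GK[ST]",
    "Walker B": r"D.PG",
}


def find_motifs(sequence: str, pattern: str) -> list:
    """Return (1-based start position, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, sequence)]


if __name__ == "__main__":
    for name, pattern in MOTIFS.items():
        print(name, find_motifs(YLQF, pattern))
```

In this sequence the NKxD match starts at residue 58 and the Walker A match at residue 127, consistent with NKxD lying N-terminal of Walker A. The short DxPG pattern matches at residues 150 and 171, so pattern matching alone does not say which site is the Walker B motif.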
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure ?) displayed three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archaeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archaeo-eukaryotic clades (Leipe, D. et al. 2002), with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archaeo-eukaryotic group (red). Red stars denote less than 95% confidence in the branch, as determined from 100 bootstrap replicates.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5214Results 52007-06-11T21:57:09Z<p>ScottAllen: /* ProFunc */</p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was used to perform a multiple sequence alignment on our protein sequence. The top 2 hits (95.6% and 84.6% identity) were both annotated as YlqF proteins. Of the remaining 8 proteins listed, most were annotated as GTPases or GTP-binding proteins; the others were a further YlqF protein and two hypothetical proteins.<br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
==SymAtlas==<br />
<br />
'''SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.'''<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
<br />
Though the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog, the protein is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of Bacillus subtilis YlqF protein consists of a polymer containing 282 amino acids (Figures 1 & 2). It also contains ligands which are a magnesium ion and a Phosphoaminophosphonic acid-guanylate ester (Figure 3). The protein has been refined at 2 angstroms to a crystallographic R factor of 21.6% and free R factor of 25%. Table 1 summarizes the refinement statistics including protein quality parameters. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% of all residues lie in the allowed regions.<br />
<br />
<br />
<br />
'''Table 1.''' Crystal parameters and refinement statistics<br />
<br />
{| border="1"<br />
|-<br />
| '''Parameters''' || Resolution[Å] <br />
2.00 <br />
|R factor, %<br />
21.6<br />
|Free R factor, %<br />
25.0<br />
|Space Group<br />
P 21 21 21 <br />
|-<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! '''Unit Cell'''<br />
|Length[Å] <br />
Angles [°]<br />
|a<br />
alpha<br />
|36.75<br />
90.00<br />
|b<br />
beta<br />
|68.57<br />
90.00 <br />
|c <br />
gamma <br />
|105.57 <br />
90.00 <br />
|}<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 1.''' Amino acid sequence of YlqF<br />
<br />
[[Image:cartoon.jpg]]<br />
<br />
'''Figure 2.''' Structure of YlqF showing helix, sheet, and loop. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
[[Image:ligand.jpg]]<br />
<br />
'''Figure 3.''' Structure of YlqF showing ligands. The image was obtained from PDB.<br />
<br />
<br />
The secondary structure of YlqF is approximately 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 4). YlqF consists of two domains (Figure 5). The first domain, residues 1-177, has a Rossmann fold of the α/β class and forms a 3-layer sandwich structure. The second domain, residues 178-282, is annotated as a conserved hypothetical protein of the mainly α class and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 4.''' Sequence and Secondary Structure<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
'''Figure 5.''' Structure of YlqF showing two domains. Image was generated using the program PyMol (DeLano, 2002)<br />
<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classified YlqF as shown in Table 2. The Pfam classification described YlqF as a GTPase of unknown function, Rhomboid family, and catalytic domain of alpha amylase.<br />
<br />
{| border="1"<br />
|-<br />
|'''Class'''<br />
Alpha and beta proteins <br />
|'''Fold'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases<br />
|'''Superfamily'''<br />
P-loop containing <br />
<br />
nucleoside triphosphate hydrolases <br />
|'''Family''' <br />
G proteins<br />
|'''Domain''' <br />
Probable GTPase YlqF<br />
|'''Species''' <br />
Bacillus subtilis <br />
|-<br />
|}<br />
<br />
Most of the top Dali hits are Ras-related proteins, which are small GTPases. This suggests that YlqF might be a GTPase.<br />
<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table .''' Dali search results of PDB/chain identifiers and structural alignment statistics<br />
<br />
====InterProScan Results====<br />
<br />
'''InterPro: IPR005289'''<br />
MG442<br />
This is a GTP-binding domain. It is found in many different families, including the Ras GTPase superfamily and HSR1-related GTP-binding proteins, which have a range of different functions.<br />
<br />
'''InterPro: IPR002917'''<br />
MMR_HSR1<br />
HSR1 maps to the human MHC class I region. It is highly homologous to a putative GTP-binding protein, MMR1, from mouse. These proteins represent a new subfamily of GTP-binding proteins with both prokaryote and eukaryote members.<br />
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures ? & ?) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal of the Walker A motif and is conserved across the three kingdoms. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them.<br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
Figure ?: From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure ?) displayed three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archaeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archaeo-eukaryotic clades (Leipe, D. et al. 2002), with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archaeo-eukaryotic group (red). Red stars denote less than 95% confidence in the branch, as determined from 100 bootstrap replicates.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Introduction_5&diff=5213Introduction 52007-06-11T21:56:02Z<p>ScottAllen: </p>
<hr />
<div>==Introduction==<br />
<br />
Previous studies have located GTPases in a diverse array of bacteria and in all eukaryotes. GTPases are characterised by their use of GTP instead of ATP as a substrate. They are known to regulate many fundamental cellular processes such as translation, cell-signalling, intracellular trafficking and cytoskeletal re-organisation. The NKxD and Walker B motifs of GTPases specify the utilisation of GTP. 1pujA is a known YlqF GTPase of ''Bacillus subtilis'' (Matsuo, Y. et al. 2006). ''B. subtilis'' is a Gram-positive, catalase-positive bacterium commonly found in soil. ''B. subtilis'' has also been referred to as ''Bacillus globigii'', ''Hay bacillus'' or ''Grass bacillus'' (Wiki 2007). YlqF has previously been associated with the assembly of the 50S ribosomal subunit. ''B. subtilis'' cells in which YlqF activity was inhibited showed slow growth rates and a build-up of misfolded 50S ribosomal subunits (Matsuo, Y. et al. 2006). A hypothesised circular permutation that places the NKxD motif N-terminal of the Walker A motif in the primary structure is characteristic of GTPases of the YlqF family (Leipe, D. et al. 2002). Our aim for this study is to determine the overall function of 1pujA and related YlqF proteins. To do this we will investigate the function using strategies such as literature searches, function prediction programs and programs that utilize additional high-throughput functional data. The structure and evolution of the protein will also be investigated, in the hope that this additional information will help us better understand the function of YlqF proteins. Structure will be investigated using web tools specific for structure comparison and structure analysis. Finally, evolution will be investigated using sequence searches, multiple sequence alignment and the building of a phylogenetic tree. Using data gathered from all these sources we hope to create a viable hypothesis for the function of YlqF proteins. 
''In addition the prospect of YlqF’s involvement in 50S ribosomal assembly will be discussed.''<br />
<br />
<br />
structure...</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Materials_and_Methods_5&diff=5083Materials and Methods 52007-06-11T13:35:59Z<p>ScottAllen: </p>
<hr />
<div>=Phylogeny=<br />
<br />
The 1PUJA FASTA sequence was obtained from NCBI Entrez Protein. This was used as a query for a BLASTP search of the non-redundant protein database. Sequence matches were selected for multiple-sequence alignment (MSA) on the basis of presence in model organisms representative of the three major kingdoms of life (bacteria, archaea and eukaryotes) (Hedges, S. 2002). MSA was performed using CLUSTAL X (version 1.83); sequences were removed if they did not contain the N-terminal NKxD motif. The phylogenetic tree was constructed using the PROTDIST (Dayhoff PAM distance matrix, 100 datasets, version 3.63), NEIGHBOR (neighbour-joining method of tree construction, 100 datasets), and CONSENSE (default settings) programs of the PHYLIP package. Confidence in tree branches was determined by bootstrapping using SEQBOOT (100 resamplings, default settings).<br />
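The bootstrapping step can be sketched conceptually as resampling alignment columns with replacement. The Python sketch below is an illustration of that idea only, not the SEQBOOT implementation; the function names, the dictionary representation of the alignment and the fixed random seed are all assumptions made for the example.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement, keeping rows aligned.

    `alignment` maps sequence names to aligned strings of equal length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()

    # Draw `length` column indices; the same draw is applied to every row.
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment: dict, n: int = 100, seed: int = 0) -> list:
    """Return `n` resampled alignments (100 mirrors the count used above)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n)]


if __name__ == "__main__":
    toy = {"seqA": "NKADGK", "seqB": "NKSDGK", "seqC": "NRADGR"}  # toy data
    replicates = bootstrap_replicates(toy, n=100)
    print(len(replicates), replicates[0])
```

In a workflow like the one described, a distance matrix and neighbour-joining tree would be built from each replicate, and the fraction of replicate trees containing a given branch would be reported as its bootstrap support.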
<br />
<br />
=Structure=<br />
<br />
<br />
=Function=<br />
<br />
<br />
To determine the function of 1pujA (and related YlqF proteins) a variety of data-mining strategies and computational tools were utilized. First, a literature search was conducted using Google Scholar and PubMed to familiarize ourselves with YlqF proteins and to find any information already available in databases and published literature. From this we predicted a possible function for our protein and then proceeded to test this hypothesis using the ProKnow and ProFunc web servers. ProKnow is a relatively simple database which determines the likely molecular function and biological process of a queried sequence. ProFunc is a server which was developed to help identify the likely function of a protein from its three-dimensional structure. It uses both sequence- and structure-based methods to try to provide evidence as to a protein's likely or possible function (REF= http://www.ebi.ac.uk/thornton-srv/databases/ProFunc/). As part of the ProFunc analysis, the submitted sequence was searched against the UniProt protein sequence database and a multiple sequence alignment of the top hits was produced. It was also important to investigate the function of any suspected orthologs or homologs of our protein. Mouse and human sequences were analysed with a server called SymAtlas, useful for determining the expression levels of a sequence in different tissues.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Materials_and_Methods_5&diff=5079Materials and Methods 52007-06-11T13:32:41Z<p>ScottAllen: </p>
<hr />
<div>=Phylogeny=<br />
<br />
The 1PUJA FASTA sequence was obtained from NCBI Entrez Protein. This was used as a query for a BLASTP search of the non-redundant protein database. Sequence matches were selected for multiple-sequence alignment (MSA) on the basis of presence in model organisms representative of the three major kingdoms of life (bacteria, archaea and eukaryotes) (Hedges, S. 2002). MSA was performed using CLUSTAL X (version 1.83); sequences were removed if they did not contain the N-terminal NKxD motif. The phylogenetic tree was constructed using the PROTDIST (Dayhoff PAM distance matrix, 100 datasets, version 3.63), NEIGHBOR (neighbour-joining method of tree construction, 100 datasets), and CONSENSE programs of the PHYLIP package. Confidence in tree branches was determined by bootstrapping using SEQBOOT (100 resamplings, default settings).<br />
<br />
<br />
=Structure=<br />
<br />
<br />
=Function=<br />
<br />
<br />
To determine the function of 1pujA (and related YlqF proteins) a variety of data-mining strategies and computational tools were utilized. First, a literature search was conducted using Google Scholar and PubMed to familiarize ourselves with YlqF proteins and to find any information already available in databases and published literature. From this we predicted a possible function for our protein and then proceeded to test this hypothesis using the ProKnow and ProFunc web servers. ProKnow is a relatively simple database which determines the likely molecular function and biological process of a queried sequence. ProFunc is a server which was developed to help identify the likely function of a protein from its three-dimensional structure. It uses both sequence- and structure-based methods to try to provide evidence as to a protein's likely or possible function (REF= http://www.ebi.ac.uk/thornton-srv/databases/ProFunc/). As part of the ProFunc analysis, the submitted sequence was searched against the UniProt protein sequence database and a multiple sequence alignment of the top hits was produced. It was also important to investigate the function of any suspected orthologs or homologs of our protein. Mouse and human sequences were analysed with a server called SymAtlas, useful for determining the expression levels of a sequence in different tissues.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Results_5&diff=5009Results 52007-06-11T12:36:39Z<p>ScottAllen: </p>
<hr />
<div>=FUNCTIONAL ANALYSIS=<br />
<br />
<br />
==ProFunc==<br />
<br />
From a summary of predicted function from our [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay99&code=053949 ProFunc results] we can see that the most commonly occurring name term is GTPase, followed by GTP binding.<br />
<br />
Of the results returned by ProFunc, the InterPro, PDB, SSM, DALI and 3D functional template searches were all significant; however, the most promising results came from UniProt.<br />
<br />
==UniProt==<br />
<br />
During the ProFunc analysis [http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/runprog.pl?prog=funchtml&html_name=seq001_aln.html&pdbcode=ay99&user_id=&pdb_type=PROFUNC&code=053949 UniProt] was used to perform a multiple sequence alignment on our protein sequence. The top 2 hits (95.6% and 84.6% identity) were both annotated as YlqF proteins. Of the remaining 8 proteins listed, most were annotated as GTPases or GTP-binding proteins; the others were a further YlqF protein and two hypothetical proteins.<br />
<br />
<pre>Aligned sequences:<br />
%-tage<br />
Seq. id. identity Name<br />
-------- -------- ----<br />
Query - - Query sequence <br />
O31743 95.6 - O31743 YlqF protein. <br />
Q65JP4 84.6 - Q65JP4 YlqF (GTP-binding domain protein). <br />
Q845L2 70.1 - Q845L2 Hypothetical protein. <br />
Q9Z9S1 59.9 - Q9Z9S1 YlqF (BH2476 protein). <br />
Q4MFY1 65.9 - Q4MFY1 Hypothetical protein. <br />
Q5WFP0 53.5 - Q5WFP0 GTPase. <br />
Q5HPU7 50.9 - Q5HPU7 GTP-binding protein, putative. <br />
Q3XZB6 52.4 - Q3XZB6 GTP-binding protein. <br />
A0Q0X8 49.3 - A0Q0X8 GTP-binding protein, putative. <br />
Q03KX8 47.6 - Q03KX8 Predicted GTPase.</pre><br />
<br />
==SymAtlas==<br />
<br />
'''SymAtlas is a website useful for researching the expression levels of a sequence in different tissues. Using this database we have explored the levels of orthologs of 1pujA in both mice and humans.'''<br />
<br />
From the [http://symatlas.gnf.org/SymAtlas/symquery?q=92170 distribution of the human ortholog protein in the human body] it is evident that this protein is widely expressed in virtually all human tissues. The results do not provide a greater level of detail than this.<br />
<br />
[[Image:HumanSymatlas.png]]<br />
<br />
<br />
Though the [http://symatlas.gnf.org/SymAtlas/symquery?q=212508 distribution of the mouse ortholog protein] is somewhat more scattered than that of the human ortholog, the protein is still relatively common in all major tissues.<br />
<br />
<br />
[[Image:MouseSymatlas.png]]<br />
<br />
==ProKnow==<br />
<br />
The ProKnow analysis shows that 1pujA's molecular function is GTP binding and its biological process is small GTPase mediated signal transduction.<br />
<br />
<pre><br />
GO Code.Bayesian Score.Evidence Rank.Number of Clues.Description <br />
Molecular Function<br />
0005525 0.9815 2.8 6 GTP binding<br />
0000166 0.0101 2.9 6 nucleotide binding<br />
0008168 0.0068 2.9 6 methyltransferase activity<br />
0003924 0.0016 2.5 6 GTPase activity<br />
Biological Process<br />
0007264 1.0000 2.8 6 small GTPase mediated signal transduction</pre><br />
<br />
<br />
<br />
=STRUCTURE=<br />
==='''Quality of YlqF protein model and overall structure'''===<br />
<br />
The asymmetric unit of Bacillus subtilis YlqF protein consists of a polymer containing 282 amino acids (Figure 1). The protein has been refined at 2 angstroms to a crystallographic R factor of 21.6% and free R factor of 25%. Table 1 summarizes the refinement statistics including protein quality parameters. MolProbity Ramachandran analysis of YlqF shows that 96.5% of all residues lie in the favoured regions and 98.8% of all residues lie in the allowed regions.<br />
<br />
<br />
<br />
'''Table 1.''' Crystal parameters and refinement statistics<br />
<br />
{| border="1"<br />
|-<br />
| '''Parameters''' || Resolution[Å] <br />
2.00 <br />
|R factor, %<br />
21.6<br />
|Free R factor, %<br />
25.0<br />
|Space Group<br />
P 21 21 21 <br />
|-<br />
|}<br />
<br />
{| border="1"<br />
|-<br />
! '''Unit Cell'''<br />
|Length[Å] <br />
Angles [°]<br />
|a<br />
alpha<br />
|36.75<br />
90.00<br />
|b<br />
beta<br />
|68.57<br />
90.00 <br />
|c <br />
gamma <br />
|105.57 <br />
90.00 <br />
|}<br />
<br />
1 - MTIQWFPGHM AKARREVTEK LKLIDIVYEL VDARIPMSSR NPMIEDILKN KPRIMLLNKA<br />
61 - DKADAAVTQQ WKEHFENQGI RSLSINSVNG QGLNQIVPAS KEILQEKFDR MRAKGVKPRA<br />
121 - IRALIIGIPN VGKSTLINRL AKKNIAKTGD RPGITTSQQW VKVGKELELL DTPGILWPKF<br />
181 - EDELVGLRLA VTGAIKDSII NLQDVAVFGL RFLEEHYPER LKERYGLDEI PEDIAELFDA<br />
241 - IGEKRGCLMS GGLINYDKTT EVIIRDIRTE KFGRLSFEQP TM<br />
<br />
'''Figure 1.''' Amino acid sequence of YlqF<br />
<br />
[[Image:domain1.jpg]]<br />
<br />
<br />
The secondary structure of YlqF is approximately 50% helical (13 helices; 142 residues) and 10% beta sheet (6 strands; 31 residues) (see Figure 2). The YlqF protein consists of two domains. The first domain, spanning residues 1-177, belongs to the α/β class, contains a Rossmann fold and forms a 3-layer sandwich structure. The second, spanning residues 178-282, is annotated as a conserved hypothetical protein; it is mainly α class and forms an orthogonal bundle structure. YlqF is also classified as a signalling protein. The molecular weight of the protein is 31986 Da.<br />
<br />
<br />
[[Image:Secondary_structure.png]]<br />
<br />
[[Image:pi helix.gif]] = pi helix, [[Image:310helix.jpg]] = 310 helix, [[Image:sheet.jpg]] = extended strand, [[Image:turn.jpg]] = turn, [[Image:alpha.jpg]] = alpha helix, <br />
<br />
Greyed out residues have no structural information<br />
<br />
'''Figure 2.''' Sequence and Secondary Structure<br />
<br />
<br />
===Structure Analysis===<br />
The Structural Classification of Proteins (SCOP) database classifies YlqF as shown in Table 2. Pfam classification matched YlqF to a GTPase of unknown function, the Rhomboid family, and the catalytic domain of alpha amylase. <br />
<br />
{| border="1"<br />
|+ '''Table 2.''' SCOP classification of YlqF<br />
|-<br />
! Level !! Classification<br />
|-<br />
| '''Class''' || Alpha and beta proteins<br />
|-<br />
| '''Fold''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Superfamily''' || P-loop containing nucleoside triphosphate hydrolases<br />
|-<br />
| '''Family''' || G proteins<br />
|-<br />
| '''Domain''' || Probable GTPase YlqF<br />
|-<br />
| '''Species''' || ''Bacillus subtilis''<br />
|}<br />
<br />
No Chain raw-score Z-score %id lali rmsd Description<br />
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF <br />
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG <br />
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A <br />
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A <br />
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4 <br />
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A <br />
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA <br />
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A <br />
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR <br />
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA <br />
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE <br />
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE <br />
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE <br />
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ <br />
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A <br />
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE <br />
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7 <br />
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN <br />
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA <br />
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3<br />
<br />
'''Table 3.''' Dali search results: PDB/chain identifiers and structural alignment statistics<br />
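The Dali rows above can be filtered programmatically, for example to pick out the hits that superimpose closely on YlqF. This is a minimal sketch, assuming the rows are whitespace-separated as shown; <code>DaliHit</code>, <code>parse_dali</code> and the 3.0 Å RMSD cutoff are assumptions made for illustration and are not part of Dali.<br />

```python
from typing import NamedTuple

# Rows copied from the Dali table above (header line omitted).
DALI_ROWS = """\
1 1pujA 3946.0 42.7 100 261 0.0 CONSERVED HYPOTHETICAL PROTEIN YLQF
2 1udxA 618.8 6.2 14 141 25.5 THE GTP-BINDING PROTEIN OBG
3 1uadA 541.8 6.1 9 99 2.6 RAS-RELATED PROTEIN RAL-A
4 1zbdA 553.7 6.1 12 99 2.8 RAB-3A
5 1g17A 526.0 5.9 18 95 2.5 RAS-RELATED PROTEIN SEC4
6 1c1yA 515.7 5.8 12 96 2.8 RAS-RELATED PROTEIN RAP-1A
7 1kjzA 571.3 5.7 14 114 4.0 EIF2GAMMA
8 1oixA 510.9 5.4 11 94 2.6 RAS-RELATED PROTEIN RAB-11A
9 1ukvY 519.2 5.3 10 99 2.8 SECRETORY PATHWAY GDP DISSOCIATION INHIBITOR
10 1ijfA 535.8 5.3 13 103 3.6 ELONGATION FACTOR 1-ALPHA
11 1gpmA 481.7 5.2 7 117 4.0 GMP SYNTHETASE
12 1hrkA 483.7 5.1 5 116 4.2 FERROCHELATASE
13 1d5cA 474.4 5.1 10 92 2.6 RAB6 GTPASE
14 1ni5A 512.4 5.0 6 115 5.5 PUTATIVE CELL CYCLE PROTEIN MESJ
15 3rapS 484.1 5.0 9 92 2.6 G PROTEIN RAP2A
16 1gwnA 475.5 5.0 10 94 2.7 RHO-RELATED GTP-BINDING PROTEIN RHOE
17 1vg8A 479.0 5.0 11 106 3.9 RAS-RELATED PROTEIN RAB-7
18 1cqxA 438.8 4.9 4 116 4.2 FLAVOHEMOPROTEIN
19 1mkyA 460.7 4.7 11 100 8.8 PROBABLE GTP-BINDING PROTEIN ENGA
20 1fzqA 463.8 4.7 14 92 2.6 ADP-RIBOSYLATION FACTOR-LIKE PROTEIN 3
"""


class DaliHit(NamedTuple):
    """One row of the Dali table (hypothetical container for this sketch)."""
    rank: int
    chain: str
    raw_score: float
    z_score: float
    percent_identity: int
    aligned_length: int
    rmsd: float
    description: str


def parse_dali(text):
    """Parse whitespace-separated Dali rows; the description may contain spaces."""
    hits = []
    for line in text.splitlines():
        fields = line.split(maxsplit=7)
        if len(fields) != 8:
            continue  # skip blank or malformed lines
        hits.append(DaliHit(
            int(fields[0]), fields[1], float(fields[2]), float(fields[3]),
            int(fields[4]), int(fields[5]), float(fields[6]), fields[7].strip(),
        ))
    return hits


RMSD_CUTOFF = 3.0  # illustrative threshold in angstroms, an assumption

hits = parse_dali(DALI_ROWS)
# Exclude the query itself (100% identity) and keep closely superimposed hits.
close_hits = [h for h in hits if h.percent_identity < 100 and h.rmsd < RMSD_CUTOFF]
for hit in close_hits:
    print(hit.chain, hit.z_score, hit.rmsd, hit.description)
```

The cutoff is arbitrary; in practice the Z-score and aligned length would be considered alongside the RMSD.<br />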
<br />
=PHYLOGENY=<br />
<br />
The MSA (Figures ? & ?) displays several regions of high conservation amongst bacteria, archaea and eukaryotes. Figure ? shows that the NKxD motif of 1PUJA (''Bacillus subtilis'') is N-terminal to the Walker A motif and is conserved across the three kingdoms. Figure ? also shows that the Walker A and Walker B motifs are conserved, along with a conserved threonine between them. <br />
<br />
[[Image:NKxD5.jpg]]<br />
<br />
Figure ?: Clustal X MSA showing conservation of N-terminal NKxD motif. Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
<br />
[[Image:WalkerAB3.jpg]]<br />
<br />
Figure ?: From left to right (N-terminal to C-terminal), Walker A motif (consensus GxxxxGK[ST]), conserved Threonine in-between Walker A and Walker B, and the Walker B motif (consensus DxPG). Note the conservation across bacteria (blue), archaea (red) and eukaryotes (green).<br />
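The motif consensus patterns quoted in the captions can be checked directly against the YlqF sequence of Figure 1. The following is a minimal sketch using regular expressions; <code>MOTIFS</code> and <code>find_motifs</code> are hypothetical names for this illustration, and the translation of each consensus into a regular expression (x as any single residue) is an assumption. A short pattern such as DxPG may match more than one site, so which match corresponds to the motif shown in the alignment still has to be judged from the MSA.<br />

```python
import re

# YlqF sequence from Figure 1, with spacing and position labels removed.
YLQF_SEQUENCE = (
    "MTIQWFPGHMAKARREVTEKLKLIDIVYELVDARIPMSSRNPMIEDILKNKPRIMLLNKA"
    "DKADAAVTQQWKEHFENQGIRSLSINSVNGQGLNQIVPASKEILQEKFDRMRAKGVKPRA"
    "IRALIIGIPNVGKSTLINRLAKKNIAKTGDRPGITTSQQWVKVGKELELLDTPGILWPKF"
    "EDELVGLRLAVTGAIKDSIINLQDVAVFGLRFLEEHYPERLKERYGLDEIPEDIAELFDA"
    "IGEKRGCLMSGGLINYDKTTEVIIRDIRTEKFGRLSFEQPTM"
)

# Consensus patterns from the text, written as regular expressions.
MOTIFS = {
    "NKxD": "NK.D",
    "Walker A": "G.{4}GK[ST]",  # GxxxxGK[ST]
    "Walker B": "D.PG",         # DxPG
}


def find_motifs(sequence, motifs):
    """Return {motif name: [(1-based start position, matched residues), ...]}."""
    hits = {}
    for name, pattern in motifs.items():
        hits[name] = [
            (match.start() + 1, match.group())
            for match in re.finditer(pattern, sequence)
        ]
    return hits


hits = find_motifs(YLQF_SEQUENCE, MOTIFS)
for name, found in hits.items():
    print(name, found)
```

Positions are reported 1-based so that they can be compared with the residue numbering of Figure 1.<br />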
<br />
<br />
<br />
<br />
The phylogenetic tree (Figure ?) displays three distinct groupings: a bacterial grouping, an entirely eukaryotic grouping and an archaeo-eukaryotic grouping. The tree conforms to the “standard topology” of bacterial and archaeo-eukaryotic clades (Leipe, D. et al. 2002), with no evidence of horizontal gene transfer.<br />
<br />
[[Image:TREEcircle.jpg]]<br />
<br />
Figure ?: Phylogenetic tree of putative and known GTPases. Bacterial group (blue), eukaryotic group (green) and archaeo-eukaryotic group (red). Red stars denote less than 95% confidence in a branch, as determined via 100 bootstraps.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Materials_and_Methods_5&diff=4991Materials and Methods 52007-06-11T12:24:23Z<p>ScottAllen: </p>
<hr />
<div>=Phylogeny=<br />
<br />
The 1PUJA FASTA sequence was obtained from NCBI Entrez Protein and used as a BLASTP query against the non-redundant protein database. Sequence matches were selected for multiple-sequence alignment (MSA) on the basis of presence in model organisms representative of the three major kingdoms of life (bacteria, archaea and eukaryotes) (Hedges, S. 2002). MSA was performed using CLUSTAL X (version 1.83); sequences were removed if they did not contain the N-terminal NKxD motif. The phylogenetic tree was constructed using the PROTDIST (Dayhoff PAM distance matrix, 100 datasets, version 3.63), NEIGHBOR (neighbour-joining method of tree construction) and CONSENSE programs of the PHYLIP package. Confidence in tree branches was determined by bootstrapping using SEQBOOT (100 resamplings of the PHYLIP analysis, default settings).<br />
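The bootstrapping step can be illustrated conceptually: each replicate dataset is built by resampling alignment columns with replacement, and a tree is then inferred from every replicate. The sketch below shows only the resampling idea in Python; it is not PHYLIP's SEQBOOT code, the function names are hypothetical, and the three short aligned sequences are invented toy data rather than the sequences used in this study.<br />

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    # The same sampled column indices are applied to every sequence,
    # so each resampled column is an intact column of the original alignment.
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(sequence[i] for i in columns)
        for name, sequence in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Build n_replicates resampled alignments using a seeded generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment (hypothetical sequences of equal length, for illustration only).
toy_alignment = {
    "seq_bacteria": "LLNKADKADA",
    "seq_archaea": "LVNKIDLVPE",
    "seq_eukaryote": "VLNKCDLLPA",
}

replicates = bootstrap_replicates(toy_alignment, n_replicates=100)
print(len(replicates), "replicates of length", len(replicates[0]["seq_bacteria"]))
```

In the actual analysis each replicate would be passed on to distance calculation and tree building, and the consensus of the resulting trees gives the branch support values.<br />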
<br />
<br />
=Structure=<br />
<br />
<br />
=Function=<br />
<br />
<br />
To determine the function of 1pujA (and related YlqF proteins), a variety of data-mining strategies and computational tools were utilized. First, a literature search was conducted using Google Scholar and PubMed to familiarize ourselves with YlqF proteins and to find any information already available in databases and published literature. From this we predicted a possible function for our protein and then proceeded to test this hypothesis using the ProKnow and ProFunc web servers. ProKnow is a relatively simple database which determines the likely molecular function and biological process of a queried sequence. ProFunc is a server which was developed to help identify the likely function of a protein from its three-dimensional structure. It uses both sequence- and structure-based methods to try to provide evidence as to a protein's likely or possible function (REF= http://www.ebi.ac.uk/thornton-srv/databases/ProFunc/). The UniProt search was used to perform a multiple sequence alignment on the submitted sequence. It was also important to investigate the function of any suspected orthologs or homologs of our protein. Mouse and human sequences were analysed with a server called SymAtlas, which is useful for determining the expression levels of a sequence in different tissues.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Materials_and_Methods_5&diff=4985Materials and Methods 52007-06-11T12:16:59Z<p>ScottAllen: </p>
<hr />
<div>=Phylogeny=<br />
<br />
The 1PUJA FASTA sequence was obtained from NCBI Entrez Protein and used as a BLASTP query against the non-redundant protein database. Sequence matches were selected for multiple-sequence alignment (MSA) on the basis of presence in model organisms representative of the three major kingdoms of life (bacteria, archaea and eukaryotes) (Hedges, S. 2002). MSA was performed using CLUSTAL X (version 1.83); sequences were removed if they did not contain the N-terminal NKxD motif. The phylogenetic tree was constructed using the PROTDIST (Dayhoff PAM distance matrix, 100 datasets), NEIGHBOR (neighbour-joining method of tree construction) and CONSENSE programs of the PHYLIP package. Confidence in tree branches was determined by bootstrapping using SEQBOOT (100 resamplings of the PHYLIP analysis, default settings).<br />
<br />
<br />
=Structure=<br />
<br />
<br />
=Function=<br />
<br />
<br />
To determine the function of 1pujA (and related YlqF proteins), a variety of data-mining strategies and computational tools were utilized. First, a literature search was conducted using Google Scholar and PubMed to familiarize ourselves with YlqF proteins and to find any information already available in databases and published literature. From this we predicted a possible function for our protein and then proceeded to test this hypothesis using the ProKnow and ProFunc web servers. ProKnow is a relatively simple database which determines the likely molecular function and biological process of a queried sequence. ProFunc is a server which was developed to help identify the likely function of a protein from its three-dimensional structure. It uses both sequence- and structure-based methods to try to provide evidence as to a protein's likely or possible function (REF= http://www.ebi.ac.uk/thornton-srv/databases/ProFunc/). The UniProt search was used to perform a multiple sequence alignment on the submitted sequence. It was also important to investigate the function of any suspected orthologs or homologs of our protein. Mouse and human sequences were analysed with a server called SymAtlas, which is useful for determining the expression levels of a sequence in different tissues.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Materials_and_Methods_5&diff=4982Materials and Methods 52007-06-11T12:15:00Z<p>ScottAllen: </p>
<hr />
<div>=Phylogeny=<br />
<br />
The 1PUJA FASTA sequence was obtained from NCBI Entrez Protein and used as a BLASTP query against the non-redundant protein database. Sequence matches were selected for multiple-sequence alignment (MSA) on the basis of presence in model organisms representative of the three major kingdoms of life (bacteria, archaea and eukaryotes) (Hedges, S. 2002). MSA was performed using CLUSTAL X (version 1.83); sequences were removed if they did not contain the N-terminal NKxD motif. The phylogenetic tree was constructed using the PROTDIST (distance matrix construction), NEIGHBOR (neighbour-joining method of tree construction) and CONSENSE programs of the PHYLIP package. Confidence in tree branches was determined by bootstrapping using SEQBOOT (100 resamplings of the PHYLIP analysis, default settings).<br />
<br />
<br />
=Structure=<br />
<br />
<br />
=Function=<br />
<br />
<br />
To determine the function of 1pujA (and related YlqF proteins), a variety of data-mining strategies and computational tools were utilized. First, a literature search was conducted using Google Scholar and PubMed to familiarize ourselves with YlqF proteins and to find any information already available in databases and published literature. From this we predicted a possible function for our protein and then proceeded to test this hypothesis using the ProKnow and ProFunc web servers. ProKnow is a relatively simple database which determines the likely molecular function and biological process of a queried sequence. ProFunc is a server which was developed to help identify the likely function of a protein from its three-dimensional structure. It uses both sequence- and structure-based methods to try to provide evidence as to a protein's likely or possible function (REF= http://www.ebi.ac.uk/thornton-srv/databases/ProFunc/). The UniProt search was used to perform a multiple sequence alignment on the submitted sequence. It was also important to investigate the function of any suspected orthologs or homologs of our protein. Mouse and human sequences were analysed with a server called SymAtlas, which is useful for determining the expression levels of a sequence in different tissues.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Introduction_5&diff=4893Introduction 52007-06-11T10:11:21Z<p>ScottAllen: </p>
<hr />
<div>==Introduction==<br />
<br />
Previous studies have located GTPases in a diverse array of bacteria and in all eukaryotes. GTPases are characterised by their use of GTP instead of ATP as a substrate. They are known to regulate many fundamental cellular processes such as translation, cell signalling, intracellular trafficking and cytoskeletal re-organisation. The NKxD and Walker B motifs of GTPases specify the utilisation of GTP. 1pujA is a known YlqF GTPase of ''Bacillus subtilis'' (Matsuo, Y. et al. 2006). ''B. subtilis'' is a Gram-positive, catalase-positive bacterium commonly found in soil. ''B. subtilis'' has also been referred to as ''Bacillus globigii'', ''Hay bacillus'' or ''Grass bacillus'' (Wiki 2007). YlqF has previously been associated with the assembly of the 50S ribosomal subunit. ''B. subtilis'' cells in which YlqF activity was inhibited showed slow growth rates and a build-up of mis-folded 50S ribosomal subunits (Matsuo, Y. et al. 2006). A hypothesised circular permutation that places the NKxD motif N-terminal to the Walker A motif in the primary structure is characteristic of GTPases of the YlqF family (Leipe, D. et al. 2002). Our aim for this study is to determine the overall function of 1pujA and related YlqF proteins. To do this we will investigate the function using strategies such as literature searches, function prediction programs and programs that utilize additional high-throughput functional data. The structure and evolution of the protein will also be investigated, in the hope that this additional information will help us better understand the function of YlqF proteins. Structure will be investigated using web tools specific for structure comparison and structure analysis. Finally, evolution will be investigated using sequence searches, multiple sequence alignment and the building of a phylogenetic tree. Using data gathered from all these sources we hope to create a viable hypothesis for the overall function of YlqF proteins. 
In addition, the prospect of YlqF’s involvement in 50S ribosomal assembly will be discussed.<br />
<br />
<br />
<br />
structure...</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Materials_and_Methods_5&diff=4889Materials and Methods 52007-06-11T10:00:25Z<p>ScottAllen: </p>
<hr />
<div>=Phylogeny=<br />
<br />
The 1PUJA FASTA sequence was obtained from NCBI Entrez Protein and used as a BLASTP query against the non-redundant protein database. Sequence matches were selected for multiple-sequence alignment (MSA) on the basis of presence in model organisms representative of the three major kingdoms of life (bacteria, archaea and eukaryotes) (Hedges, S. 2002). MSA was performed using CLUSTAL X (version 1.83); sequences were removed if they did not contain the N-terminal NKxD motif. The N-terminal and C-terminal ends of sequences were trimmed so that only regions of conservation were aligned. The phylogenetic tree was constructed using the PROTDIST (distance matrix construction), NEIGHBOR (neighbour-joining method of tree construction) and CONSENSE programs of the PHYLIP package. Confidence in tree branches was determined by bootstrapping (100 resamplings of the PHYLIP analysis).<br />
<br />
<br />
=Structure=<br />
<br />
<br />
=Function=<br />
<br />
To determine the function of 1pujA, a variety of data-mining strategies and tools were utilized. First, a literature search was conducted using Google Scholar and PubMed to familiarize ourselves with our protein and to find any information already available in databases and published literature. From this we predicted a possible function for our protein and then proceeded to test this hypothesis using the ProKnow and ProFunc web servers. ProKnow is a relatively simple database which determines the likely molecular function and biological process of a sequence. ProFunc is a server which was developed to help identify the likely function of a protein from its three-dimensional structure. It uses both sequence- and structure-based methods to try to provide clues as to the protein's likely or possible function (REF= http://www.ebi.ac.uk/thornton-srv/databases/ProFunc/). Of the ProFunc searches, the InterPro, PDB, SSM, DALI and 3D functional template searches all returned useful data; however, the most promising results came from UniProt. The UniProt search was used to perform a multiple sequence alignment on the submitted sequence. It was also important to investigate the function of any orthologs of our protein. Mouse and human sequences were analysed with a server called SymAtlas, which is useful for determining the expression levels of a sequence in different tissues.<br />
<br />
<br />
To determine the function of 1pujA (and related YlqF proteins), a variety of data-mining strategies and computational tools were utilized. First, a literature search was conducted using Google Scholar and PubMed to familiarize ourselves with YlqF proteins and to find any information already available in databases and published literature. From this we predicted a possible function for our protein and then proceeded to test this hypothesis using the ProKnow and ProFunc web servers. ProKnow is a relatively simple database which determines the likely molecular function and biological process of a queried sequence. ProFunc is a server which was developed to help identify the likely function of a protein from its three-dimensional structure. It uses both sequence- and structure-based methods to try to provide evidence as to a protein's likely or possible function (REF= http://www.ebi.ac.uk/thornton-srv/databases/ProFunc/). The UniProt search was used to perform a multiple sequence alignment on the submitted sequence. It was also important to investigate the function of any suspected orthologs or homologs of our protein. Mouse and human sequences were analysed with a server called SymAtlas, which is useful for determining the expression levels of a sequence in different tissues.</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Introduction_5&diff=4882Introduction 52007-06-11T09:44:42Z<p>ScottAllen: </p>
<hr />
<div>==Introduction==<br />
<br />
Previous studies have located GTPases in a diverse array of bacteria and in all eukaryotes. GTPases are characterised by their use of GTP instead of ATP as a substrate. They are known to regulate many fundamental cellular processes such as translation, cell signalling, intracellular trafficking and cytoskeletal re-organisation. The NKxD and Walker B motifs of GTPases specify the utilisation of GTP. 1pujA is a known YlqF GTPase of ''Bacillus subtilis'' (Matsuo, Y. et al. 2006). ''B. subtilis'' is a Gram-positive, catalase-positive bacterium commonly found in soil. ''B. subtilis'' has also been referred to as ''Bacillus globigii'', ''Hay bacillus'' or ''Grass bacillus'' (Wiki 2007). YlqF has previously been associated with the assembly of the 50S ribosomal subunit. ''B. subtilis'' cells in which YlqF activity was inhibited showed slow growth rates and a build-up of mis-folded 50S ribosomal subunits (Matsuo, Y. et al. 2006). A hypothesised circular permutation that places the NKxD motif N-terminal to the Walker A motif in the primary structure is characteristic of GTPases of the YlqF family (Leipe, D. et al. 2002). Our aim for this study is to determine the overall function of YlqF proteins. To do this we will investigate the function using strategies such as literature searches, function prediction programs and programs that utilize additional high-throughput functional data. The structure and evolution of the protein will also be investigated, in the hope that this additional information will help us better understand the function of YlqF proteins. Structure will be investigated using web tools specific for structure comparison and structure analysis. Finally, evolution will be investigated using sequence searches, multiple sequence alignment and the building of a phylogenetic tree. Using data gathered from all these sources we hope to create a viable hypothesis for the overall function of YlqF proteins. 
In addition, the prospect of YlqF’s involvement in 50S ribosomal assembly will be discussed.<br />
<br />
<br />
<br />
structure...</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Introduction_5&diff=4878Introduction 52007-06-11T09:32:39Z<p>ScottAllen: </p>
<hr />
<div>==Introduction==<br />
<br />
Previous studies have located GTPases in a diverse array of bacteria and in all eukaryotes. GTPases are characterised by their use of GTP instead of ATP as a substrate. They are known to regulate many fundamental cellular processes such as translation, cell signalling, intracellular trafficking and cytoskeletal re-organisation. The NKxD and Walker B motifs of GTPases specify the utilisation of GTP. 1pujA (YlqF) is a known GTPase of ''Bacillus subtilis'' (Matsuo, Y. et al. 2006). ''B. subtilis'' is a Gram-positive, catalase-positive bacterium commonly found in soil. ''B. subtilis'' has also been referred to as ''Bacillus globigii'', ''Hay bacillus'' or ''Grass bacillus'' (Wiki 2007). YlqF has previously been associated with the assembly of the 50S ribosomal subunit. ''B. subtilis'' cells in which YlqF activity was inhibited showed slow growth rates and a build-up of mis-folded 50S ribosomal subunits (Matsuo, Y. et al. 2006). A hypothesised circular permutation that places the NKxD motif N-terminal to the Walker A motif in the primary structure is characteristic of GTPases of the YlqF/YawG family (Leipe, D. et al. 2002). Our aim for this study is to determine the overall function of YlqF. To do this we will directly investigate the function using strategies such as literature searches, function prediction programs and programs that utilize additional high-throughput functional data. The structure and evolution of the protein will also be investigated, in the hope that this additional information will help us better understand the function of YlqF. Structure will be investigated using web tools specific for structure comparison and structure analysis. Finally, evolution will be investigated using sequence searches, multiple sequence alignment and the building of a phylogenetic tree. Using data gathered from all these searches we hope to create a viable hypothesis for the overall function of YlqF. 
In addition, the prospect of YlqF’s involvement in 50S ribosomal assembly will be discussed.<br />
<br />
<br />
<br />
structure...</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Introduction_5&diff=4869Introduction 52007-06-11T09:19:19Z<p>ScottAllen: </p>
<hr />
<div>==Introduction==<br />
<br />
Previous studies have located GTPases in a diverse array of bacteria and in all eukaryotes. GTPases are characterised by their use of GTP instead of ATP as a substrate. They are known to regulate many fundamental cellular processes such as translation, cell signalling, intracellular trafficking and cytoskeletal re-organisation. The NKxD and Walker B motifs of GTPases specify the utilisation of GTP. 1pujA (YlqF) is a known GTPase of ''Bacillus subtilis'' (Matsuo, Y. et al. 2006). ''B. subtilis'' is a Gram-positive, catalase-positive bacterium commonly found in soil. ''B. subtilis'' has also been referred to as ''Bacillus globigii'', ''Hay bacillus'' or ''Grass bacillus'' (REFERENCE). YlqF has previously been associated with the assembly of the 50S ribosomal subunit. ''B. subtilis'' cells in which YlqF activity was inhibited showed slow growth rates and a build-up of mis-folded 50S ribosomal subunits (Matsuo, Y. et al. 2006). A hypothesised circular permutation that places the NKxD motif N-terminal to the Walker A motif in the primary structure is characteristic of GTPases of the YlqF/YawG family (Leipe, D. et al. 2002). Our aim for this study is to determine the overall function of 1pujA. To do this we will directly investigate the function using strategies such as literature searches, function prediction programs and programs that utilize additional high-throughput functional data. The structure and evolution of the protein will also be investigated, in the hope that this additional information will help us better understand the function of 1pujA. Structure will be investigated using web tools specific for structure comparison and structure analysis. Finally, evolution will be investigated using sequence searches, multiple sequence alignment and the building of a phylogenetic tree. Using data gathered from all these searches we hope to create a viable hypothesis for the overall function of 1pujA. 
In addition, the prospect of YlqF’s involvement in 50S ribosomal assembly will be discussed.<br />
<br />
<br />
<br />
structure...</div>ScottAllenhttp://compbio.biosci.uq.edu.au/mediawiki/index.php?title=Introduction_5&diff=4859Introduction 52007-06-11T09:12:48Z<p>ScottAllen: </p>
<hr />
<div>==Introduction==<br />
<br />
Previous studies have located GTPases in a diverse array of bacteria and in all eukaryotes. GTPases are characterised by their use of GTP instead of ATP as a substrate. They are known to regulate many fundamental cellular processes such as translation, cell signalling, intracellular trafficking and cytoskeletal re-organisation. The NKxD and Walker B motifs of GTPases specify the utilisation of GTP. 1pujA (YlqF) is a known GTPase of ''Bacillus subtilis'' (Matsuo, Y. et al. 2006). ''B. subtilis'' is a Gram-positive, catalase-positive bacterium commonly found in soil. ''B. subtilis'' has also been referred to as ''Bacillus globigii'', ''Hay bacillus'' or ''Grass bacillus'' (REFERENCE). YlqF has previously been associated with the assembly of the 50S ribosomal subunit. ''B. subtilis'' cells in which YlqF was inhibited showed slow growth and a build-up of mis-folded 50S ribosomal subunits (Matsuo, Y. et al. 2006). A hypothesised circular permutation that places the NKxD motif N-terminal to the Walker A motif in the primary structure is characteristic of GTPases of the YlqF/YawG family (Leipe, D. et al. 2002). Our aim for this study is to determine the overall function of 1pujA. To do this we will directly investigate the function using strategies such as literature searches, function prediction programs and programs that utilize additional high-throughput functional data. The structure and evolution of the protein will also be investigated, in the hope that this additional information will help us better understand the function of 1pujA. Structure will be investigated using web tools specific for structure comparison and structure analysis. Finally, evolution will be investigated using sequence searches, multiple sequence alignment and the building of a phylogenetic tree. Using data gathered from all these searches we hope to create a viable hypothesis for the overall function of 1pujA. In addition, the prospect of YlqF’s involvement in 50S ribosomal assembly will be discussed.<br />
<br />
<br />
<br />
structure...</div>ScottAllen