ATP binding domain 4 Functions
ATP pyrophosphatases commonly share a widespread motif, [A/S]-[F/Y]-S-G-G-[L/V]-D-T-[S/T], which is conserved in many groups of enzymes including GMP synthetases, argininosuccinate synthetases and ATP sulfurylases (Lemke et al. 2001). If this PP-motif is also present in our uncharacterised protein, it may give us a clue to its function.
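As a minimal sketch of how such a motif search could be done, the PP-motif can be written as a regular expression and scanned along a sequence. The function name and the toy sequence are hypothetical and are not taken from 1RU8; only the motif itself comes from the text above.

```python
import re

# PP-motif from the text, [A/S]-[F/Y]-S-G-G-[L/V]-D-T-[S/T], written as a regex.
PP_MOTIF = re.compile(r"[AS][FY]SGG[LV]DT[ST]")


def find_pp_motif(sequence):
    """Return (1-based start position, matched residues) for every motif hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group()) for m in PP_MOTIF.finditer(seq)]


# Hypothetical toy sequence for illustration only (NOT the real query sequence).
toy_sequence = "MKLVAFSGGVDTSLLAK"
print(find_pp_motif(toy_sequence))
```

This pattern is strict: a sequence carrying a variant such as the SGGKD motif reported by Pfam would not match it, so a looser pattern may be needed for more divergent family members.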
The consensus of the alignment between our sequence and other ATP pyrophosphatase members suggests that this protein belongs to an anabolic pathway and has ATP pyrophosphatase activity, meaning that it participates in hydrolysing the alpha-beta phosphate bond of ATP. However, the overall reaction schemes differ among family members (see figure 1.0).
For example, according to the table below, in the reaction scheme of argininosuccinate synthetase (Assy), the cleavage of ATP provides the energy for converting L-citrulline and L-Asp into L-argininosuccinate, whereas in ATP sulfurylase, ATP bond cleavage drives the transfer of sulfate to AMP. Therefore, although our protein may be involved in cleaving the high-energy bond of ATP, we are not sure which subunits or substrates it couples to this energy-requiring function, since the reaction schemes of the different proteins differ.
Structure-based methods: Matching folds by secondary structure comparison (SSM)
The ProFunc server is used to identify structural similarities that may indicate conserved functions. The SSM (Secondary Structure Matching) program compares the fold of our query protein (putative N-type ATP pyrophosphatase) against the folds in the PDB structural database. The secondary structure elements (SSEs) of two proteins are compared, and matches are ranked by Z-score, which is computed from the matching value obtained from structural superposition.
The fold match for our query protein (1ru8 chain A) returned 10 structural matches, as shown in figure 2.0.
The top match (1ru8A) is our query structure itself, and so is the second match. The third hit (2d13A) is a hypothetical dimeric protein from a related species, Pyrococcus horikoshii, expressed in E. coli. It has a high count of 15 matched secondary structure elements (see the No.SSE column), so the two proteins are highly similar in their helices and strands. This result is consistent with the Dali search. However, the function of this protein is also unidentified, so the match is of little use to us.
The hits beyond the third, colour-coded orange and blue, are weak and unlikely matches, which agrees with the Dali search in the structural section (Krissinel & Henrick 2004).
ProFunc nest analysis for structural motifs
Nests are structural motifs that are often important for a protein's function. They are usually hidden deep within clefts, away from the hydrophilic environment. The main-chain NH groups of these motifs usually take part in hydrogen bonds and may form charge-charge interactions with anionic groups or ligands. Nests are identified computationally from the alternating enantiomeric main-chain dihedral angles of successive residues, drawn from the alpha and gamma regions of the Ramachandran plot (Pal et al. 2002). The results are shown as follows.
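The idea of detecting nests from alternating main-chain conformations can be sketched roughly as follows. This is a simplified illustration, not the ProFunc implementation: the phi/psi windows used for the right-handed (R) and left-handed (L) regions are assumed values, and the residue labels and angles are hypothetical.

```python
def region(phi, psi):
    """Classify a (phi, psi) pair as right-handed 'R', left-handed 'L' or None.

    The angle windows (in degrees) are assumed for illustration; they roughly
    cover the alpha/gamma regions on each side of the Ramachandran plot.
    """
    if -140 <= phi <= -20 and -90 <= psi <= 40:
        return "R"
    if 20 <= phi <= 140 and -40 <= psi <= 90:
        return "L"
    return None


def find_nests(dihedrals):
    """Return (residue_i, residue_i+1, 'RL' or 'LR') for successive residues
    whose conformations fall in opposite-handed regions."""
    hits = []
    for (a, phi_a, psi_a), (b, phi_b, psi_b) in zip(dihedrals, dihedrals[1:]):
        ra, rb = region(phi_a, psi_a), region(phi_b, psi_b)
        if ra and rb and ra != rb:
            hits.append((a, b, ra + rb))
    return hits


# Hypothetical residues and angles, NOT measured from 1RU8.
toy = [("res1", -120, 130), ("res2", -90, 0), ("res3", 80, 10), ("res4", -60, -40)]
print(find_nests(toy))
```

A real analysis would compute phi and psi from the deposited coordinates and would also score each candidate by accessibility, conservation and cleft depth, as ProFunc does.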
The high scores of nests 1-3 in figure 3.0 reflect that these three sites are accessible to solvent, have high residue conservation scores, or are associated with the larger surface clefts. All three scores are above 2.0, which is suggestive of the nests being functionally significant.
Solvent accessibility is reported as the percentage solvent-accessible surface area of each residue's main-chain nitrogen atom. Because the values for the first three residue ranges are non-zero, these atoms are accessible to solvent and capable of interacting with a bound anion; Leu4(A) in particular is highly accessible.
The cleft depth reported for nest 1 indicates how deep in the cleft the NH atoms lie (Watson & Milner-White 2002). As Ser12(A) and Gly14(A) lie particularly deep in the largest cleft (cleft 1), this is a strong indicator that they are functionally important. Residue conservation in nests 1 and 2 was determined from a multiple sequence alignment of the protein's sequence against BLAST hits from the UniProt sequence database. The conservation score of 1.0 for nests 1 and 2 indicates that they are perfectly conserved.
The following table lists the gaps in order of decreasing volume. Clefts and cavities in our protein are colour-coded according to the table below. Gap region 1, the biggest cleft, contains 42 conserved residues (shown in red letters in the ‘Residue conservation’ column). A TRS molecule, a common crystallization solvent ligand, was found bound in gap 1.
According to the ProFunc cleft analysis in figure 4.2, the largest gap has a volume of 9009.98 in the dimeric form (Krissinel & Henrick 2004). This is more than twice the value previously determined by CASTp (3385.7) in the structural section. We suspect that the cleft expands when the protein dimerizes. Identifying the residue types may help us understand how ATP binding domain 4 interacts with its substrate and deduce the reaction mechanism. The majority of the residues are aliphatic (43) and neutral (34); fewer are negatively charged (23) and positively charged (20).
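The comparison above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text; the variable names are our own, and the fractions are taken over the four listed residue classes only, on the assumption that those classes do not overlap.

```python
# Cleft volumes quoted in the text (units as reported by each tool).
profunc_volume = 9009.98   # largest gap, dimeric form (ProFunc)
castp_volume = 3385.7      # largest pocket (CASTp)

volume_ratio = profunc_volume / castp_volume

# Residue-type counts quoted for the largest gap.
counts = {"aliphatic": 43, "neutral": 34, "negative": 23, "positive": 20}
total = sum(counts.values())
fractions = {name: n / total for name, n in counts.items()}

print(f"ProFunc/CASTp volume ratio: {volume_ratio:.2f}")
for name, frac in fractions.items():
    print(f"{name:>9}: {counts[name]:3d} ({frac:.1%} of the four listed classes)")
```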
The adenine nucleotide alpha hydrolase-like superfamily, AANH_like (cl00292), includes N-type ATP PPases, ATP sulphurylases, Universal Stress Response proteins and electron transfer flavoproteins (ETF). The domain forms an alpha/beta/alpha fold that binds adenosine nucleotides. Our sequence of interest is a putative N-type ATP pyrophosphatase that falls under the AANH-like superfamily.
Pfam from Sanger suggests that N-type ATP pyrophosphatase belongs to a family named ATP-binding 4 (PF01902), a domain about 200 amino acids long that contains a strongly conserved SGGKD motif near the N-terminus.
The family ATP-binding 4 is a member of the PP-loop clan (CL0039), which comprises 9 members (figure 5.0): Argininosuccinate synthase, Asn synthase, ATP binding 3, Exs B, NAD synthase, PAPS reductase, Thiamine biosynthesis protein, tRNA methyl transferase and finally the family to which our sequence belongs, ATP binding 4. We therefore investigated related family members to see which one is a potential candidate for functional modelling. In our structural studies (see structural section), Dali returned two family members with the greatest motif similarity, namely argininosuccinate synthetase (AS) and queuosine biosynthesis protein. However, only AS showed close alignment with ATP binding domain 4. The Dali alignment is shown in figure 6.0 below.
Dali analysis of Domain Alignment
Dali reveals a similar domain alignment between our sequence and argininosuccinate synthase (2nz2-A), with a Z-score of 11.0 (indicating significant domain similarity). We therefore used PyMOL to align the conserved residues of 1RU8 and 2NZ2 (align 1RU8 & i. 11-16, 2NZ2 & i. 11-15). Close structural resemblance is a strong indication of functional resemblance.
String Functional Partners Predictions
Green lines represent neighbourhood association and blue lines represent co-occurrence across species. All hits showed only weak association. Our query protein (PF1758), indicated by the red arrows, is an immediate neighbour (within 300 bp) of PF1760, PF1759 and glmS. This association indicates putative fusion events, meaning that functionally related genes are usually inherited together. Unfortunately, PF1760 and PF1759 are putative uncharacterised proteins. glmS is a glucosamine-fructose-6-phosphate aminotransferase, but its association with our query protein is found in only one proteobacterium, so the association is not significant. Even though the STRING result is of little use to us, it suggests that N-type ATP pyrophosphatases are present in both bacteria and archaea (Gough et al. 2001). This finding assists us in studying the evolutionary history of the protein.
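The neighbourhood criterion (genes lying within 300 bp of the query) can be illustrated with a small sketch. This is not the STRING algorithm; the gene names and coordinates below are hypothetical placeholders, since the real genomic positions of PF1758 and its neighbours are not given here.

```python
def gap_bp(a, b):
    """Intergenic gap in bp between two (start, end) intervals; 0 if they overlap."""
    (s1, e1), (s2, e2) = sorted([a, b])
    return max(0, s2 - e1 - 1)


def neighbours(query, genes, max_gap=300):
    """Names of genes whose gap to the query gene is at most max_gap bp."""
    q = genes[query]
    return sorted(
        name for name, interval in genes.items()
        if name != query and gap_bp(q, interval) <= max_gap
    )


# Hypothetical coordinates for illustration only (NOT real P. furiosus positions).
toy_genes = {
    "query": (1000, 1800),
    "geneA": (1900, 2500),   # 99 bp downstream of the query
    "geneB": (2600, 3300),   # 799 bp downstream
    "geneC": (400, 950),     # 49 bp upstream
    "geneD": (5000, 5600),   # far away
}
print(neighbours("query", toy_genes))
```

A fuller treatment would also check that the neighbourhood is conserved across several genomes, which is what gives the association its weight.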
Discussion on Function
From our findings, since argininosuccinate synthetase (AS) and ATP binding domain 4 show striking similarities in their highly conserved PP-loop motif and cleft volume, it is possible to infer the mechanism of our protein from the well-known AS reaction mechanism.
The AS mechanism shown in figure 8.0 proceeds in two steps:
Step 1. Argininosuccinate synthetase releases inorganic pyrophosphate after the formation of the activated citrulline-adenylate.
Step 2. Aspartate (via the lone pair on the nitrogen of its amino group) carries out a nucleophilic attack on the carbonyl group of the activated citrulline-adenylate, forming argininosuccinate with the release of AMP.
(1) Structure modelling with argininosuccinate synthetase (AS).
Our protein should also catalyse substrate adenylation, that is, the formation of a phosphodiester bond between an amino acid and the phosphate group of AMP (adenosine monophosphate), or more simply the formation of an adenylate (a salt or ester of AMP), so as to activate a carbonyl or carboxyl group (Lemke & Howell 2001). This activation facilitates the subsequent attack by a nitrogen-containing nucleophile from the substrate.
The structure of AS is used to model ATP binding in our structure. The beta and gamma phosphate groups of ATP are oriented by characteristic residues of the PP-loop, and a salt linkage then forms between the gamma-phosphate and the N atom of a lysine (Lemke & Howell 2001).
In the E. coli AS model, R168 (see figure 8.1) has been proposed to bind pyrophosphate. The carbonyl oxygen of the second residue preceding the PP motif in AS (Ala-16) falls within hydrogen-bonding distance of the O2’ hydroxyl oxygen of the ribose (Lemke & Howell 2001). By analogy, the alanine residue shortly after the PP-motif in our protein, Ala(20), may be involved in stabilising the ribose by forming a hydrogen bond with its O2’ hydroxyl oxygen. This may explain why a relative majority of the residues in the biggest gap in our cleft analysis are aliphatic (see figure 4.2 under function analysis) and why they appear relatively highly conserved in PDBsum.
In AS, it was proposed that the amide oxygen of a glutamine (Q46) interacts with the N6 nitrogen of adenine. The relatively highly conserved Q104 in our structure (detected by PDBsum) might be involved in a similar interaction.
The most intriguing feature is that the highly conserved PP-loop motif ([A/S]-[F/Y]-S-G-G-[L/V]-D-T-[S/T]) contains two absolutely conserved glycine residues, Gly(13) and Gly(14). In the AS model, the two conserved glycine residues adopt significantly different conformations in the uncomplexed and complexed structures and are suspected to play a role in the binding and release of pyrophosphate (PPi) (Lemke & Howell 2001). This is consistent with the fact that glycine, which lacks a side chain, provides a high degree of flexibility. Therefore, Gly(13) and Gly(14) in ATP binding domain 4 may provide the large anion hole required for pyrophosphate binding. Lemke and Howell (2001) proposed that, in other N-type ATP pyrophosphatase models, replacing these glycine residues would create steric hindrance that clashes with the bridging oxygen of the bound pyrophosphate.
The final residue of the PP motif in our structure, Ser(17), should form hydrogen bonds through its hydroxyl group with the highly conserved residues Gly(14) and Arg(60). Ser(17) is therefore important both for pyrophosphate binding and for structural support: by interacting with the alpha helices, it links H1 and H2 together (see Figure 2.0 showing H1 and H2 in PDBsum).
(2) A protein conformational change occurs in the catalytic cycle.
The relative positions of all 3 substrates in AS suggest a strong requirement for a conformational change during catalysis. Our analysis of the AS-substrate interactions indicates that the PP-loop binds ATP closely and lies far from citrulline (the substrate). This implies that, in our protein, only the PP-loop interacts directly with ATP, and the substrate may interact with ATP at the opposite end.
(3) Only domain A is involved in ATP-substrate binding; domain B may have other functions.
Domain B of our protein aligns poorly with AS; close resemblance is found only in the PP-loop-containing domain A region, as confirmed in our PyMOL alignment. This suggests that domain B of ATP binding domain 4 may be involved in substrate recognition, which would explain the subtle differences between our protein and AS. This hypothesis is based on our finding that the starting residues of domain B form a hairpin-like structure pointing towards the core of ATP binding domain 4, which might suggest intermolecular cooperation.
Future research should focus on solving crystal structures of ATP binding domain 4 bound to its substrates. Cocrystallizing an enzyme with ATP is often hard. Taking the study of Lemke and Howell (2001) as an example, cocrystallizing ATP with E. coli AS was detrimental, resulting in poorly diffracting crystals that cracked and dissolved easily. Therefore, our study only infers structure from other N-type ATP pyrophosphatases. Nevertheless, a cocrystal structure would provide direct insight into the enzymatic mechanism.
Further structural and functional experiments should use site-directed mutagenesis of the PP-loop and of the potential substrate-interacting residues in the groove identified in our structural analysis. More work is also needed to elucidate the role of domain B of ATP binding domain 4, which may be important for substrate selectivity.
Further research could compare ATP binding domain 4 between human and Pyrococcus furiosus, to see whether mutations of the structurally and catalytically important residues suggested in this study are associated with any ATP-related genetic diseases.