Discussion - NANP



Multiple Sequence Alignment

In the MSA obtained, the organisms with the large gap insertions were mainly Bacillus species, with the exception of Symbiobacterium thermophilum. Symbiobacterium is an uncultivable thermophile isolated from compost whose survival depends mainly on microbial commensalism [5]. It grows in vitro only when co-cultured with Bacillus species [5], which could explain its genetic association with Bacillus in the sequence alignment. Interestingly, however, Bacillus is classified as Gram-positive, whereas Symbiobacterium is a Gram-negative bacterium. The other Gram-negative protein sequences in the alignment (Vibrio species) do not contain the large gap insertion at amino acid positions 91 to 114; Symbiobacterium is the only exception. Further genetic (and possibly functional) analysis may therefore be needed to determine how the hydrolase proteins of the Gram-positive Bacillus and the Gram-negative Symbiobacterium are related.
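A minimal sketch of how an observation like this could be checked programmatically: it measures, for each aligned sequence, the fraction of gap characters within a column range, then splits the sequences into mostly gapped and mostly ungapped groups. The function names, the 0.5 threshold, and the toy sequences are illustrative assumptions, not the data or tool used for the alignment described here; for the real alignment the region of interest would be columns 91 to 114.

```python
def gap_fraction(alignment, start, end):
    """Fraction of gap characters per sequence in columns start..end (1-based, inclusive)."""
    width = end - start + 1
    return {
        name: seq[start - 1:end].count("-") / width
        for name, seq in alignment.items()
    }


def split_by_gap(alignment, start, end, threshold=0.5):
    """Split sequence names into (mostly gapped, mostly ungapped) within the region."""
    fractions = gap_fraction(alignment, start, end)
    gapped = sorted(n for n, f in fractions.items() if f >= threshold)
    ungapped = sorted(n for n, f in fractions.items() if f < threshold)
    return gapped, ungapped


# Toy alignment with invented sequences; the region of interest is columns 5-10.
toy_alignment = {
    "seq_1": "MKLVAGTDEHWLPQ",
    "seq_2": "MKLVSGTEEHWLPQ",
    "seq_3": "MKLV------WLPQ",
    "seq_4": "MRLV------WIPQ",
}

gapped, ungapped = split_by_gap(toy_alignment, 5, 10)
print("gapped:", gapped)
print("ungapped:", ungapped)
```

Grouping by gap fraction rather than by exact string comparison keeps the check tolerant of sequences that differ by a few residues inside the region.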

Phylogenetic Tree

In the rectangular cladogram view of the tree, two main domains were observed: Prokaryotes and Eukaryotes. This split is the root and first branching point of the phylogenetic tree.

Among the eukaryotes in this tree, the invertebrates (phylum Arthropoda) form the first branch.

From there, the tree branches into the vertebrates (phylum Chordata), which divide further into the superclasses Osteichthyes (bony fish) and Tetrapoda (four-limbed vertebrates).

Within the prokaryotic domain, the main branching occurs between Gram-positive (Bacillus spp.) and Gram-negative (Vibrio spp.) bacteria.

It can therefore be broadly deduced that the Neu5Ac phosphatase (hydrolase) protein is not specific to any one evolutionary lineage, as it is present in almost all the main phyla and classes of organisms from both the prokaryotic and eukaryotic domains. Its functional significance is therefore likely to be a general one.


Tree bootstrapping is necessary to test the reliability of the branching patterns and distances in the phylogenetic tree. This was done by generating up to 100 "pseudoreplicates" of the multiple sequence alignment. The distance matrices were recalculated from these duplicate alignments to generate a bootstrap tree, whose branching patterns and distances can be compared with those of the original phylogenetic tree. The bootstrap value (as a percentage) on each branch signifies the confidence in that branch. A bootstrap value of 95% equates to full branching confidence, 75% equates to 95% branching confidence, and 60% equates to much lower branching confidence, while 50% gives no branching confidence.
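The resampling idea behind bootstrapping can be sketched as follows. Each pseudoreplicate is built by drawing alignment columns with replacement, and distances are recomputed on the resampled alignment. To keep the example short, it does not build a tree: as a simplified stand-in for branch support, it counts how often one sequence remains the nearest neighbour of another across replicates. All function names, the p-distance measure, and the toy sequences are assumptions made for illustration, not the software or settings used for the tree described here.

```python
import random


def p_distance(a, b):
    """Fraction of differing positions, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def pseudoreplicate(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def nearest_neighbour(alignment, target):
    """Name of the sequence with the smallest p-distance to target (ties broken by name)."""
    others = sorted(n for n in alignment if n != target)
    return min(others, key=lambda n: p_distance(alignment[target], alignment[n]))


def bootstrap_support(alignment, target, partner, replicates=100, seed=0):
    """Percentage of pseudoreplicates in which partner is target's nearest neighbour."""
    rng = random.Random(seed)
    hits = sum(
        nearest_neighbour(pseudoreplicate(alignment, rng), target) == partner
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates


# Toy alignment with invented sequences.
toy_alignment = {
    "seq_A": "MKLVAGTDEHWLPQ",
    "seq_B": "MKLVSGTEEHWLPQ",
    "seq_C": "MRIVCNSQKYFIAR",
}

print(bootstrap_support(toy_alignment, "seq_A", "seq_B"))
```

A full analysis would rebuild a tree from every pseudoreplicate and count how often each clade of the original tree reappears; the nearest-neighbour count above only illustrates the resample-and-recompute loop.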

Functional Analysis

Document17 01.png

Document17 03.png

Figure 17. (A) List of all matched protein name terms for 2gfh. (B) List of all matched Gene Ontology terms for 2gfh. The score in red is a measure of how strongly the term is predicted from the hits obtained by the different methods. The scores in blue show each method’s contribution to the total score (with the number of relevant sequences/structures shown in brackets in grey).


The function predicted from evolution and structure indicates that 2gfh is a hydrolase. ProFunc searches on 2gfh (Figure 17) also show that it possesses hydrolase activity. The highest-scoring Gene Ontology terms (Figure 17) indicate that it is involved in metabolism and possesses phosphoglycolate phosphatase activity. A hydrolase is an enzyme that catalyzes a hydrolysis reaction (Figure 18), that is, the addition of the hydrogen and hydroxyl ions of water to a molecule, which is consequently split into two or more simpler molecules. Hydrolase is the systematic name for any enzyme of EC class 3.

Document18 01.png

Figure 18. A hydrolase catalyzes the hydrolysis of the chemical bond between A and B, resulting in two simpler molecules.
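Written as an equation, the general hydrolysis reaction takes the form shown in the first line below, where A and B stand for the two parts of the substrate. The second line gives the specific case of a phosphatase acting on a phosphate monoester; R is a placeholder for the rest of the substrate and is not notation from the figure.

```latex
\[
\mathrm{A{-}B} + \mathrm{H_2O} \longrightarrow \mathrm{A{-}OH} + \mathrm{B{-}H}
\]
\[
\mathrm{R{-}O{-}PO_3^{2-}} + \mathrm{H_2O} \longrightarrow \mathrm{R{-}OH} + \mathrm{HPO_4^{2-}}
\]
```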

Neu5Ac phosphatase belongs to the haloacid dehalogenase (HAD) family. HAD is a vast superfamily of largely uncharacterized enzymes, of which a few members have been shown to possess phosphatase, phosphoglucomutase, phosphonatase, and dehalogenase activities [6]. HAD-like hydrolases represent the largest family of predicted small molecule phosphatases encoded in the genomes of bacteria, archaea, and eukaryotes, with 6,805 proteins in databases [7]. HADs share little overall sequence similarity (15–30% identity), but they can be identified by the presence of three short conserved sequence motifs [7]. Most of the characterized HADs have phosphatase activity (CO–P bond hydrolysis); others catalyze dehalogenase (C–halogen bond hydrolysis), phosphonatase (C–P bond hydrolysis), and phosphoglucomutase (CO–P bond hydrolysis and intramolecular phosphoryl transfer) reactions [6].
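Identity figures such as the 15–30% quoted above come from comparing two aligned sequences column by column. The sketch below shows one common convention, counting identical residues over the columns where neither sequence has a gap. Other conventions use a different denominator (for example the full alignment length or the shorter sequence), so this is an illustrative assumption and not necessarily how the cited values were computed. The sequences are invented.

```python
def percent_identity(a, b):
    """Percent identical residues over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


# Invented aligned pair: 8 comparable columns, 5 of them identical.
print(percent_identity("MKDLA-GTW", "MRDIAQGSW"))  # 62.5
```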

Maliekal et al. aligned rat and human Neu5Ac-9-P phosphatase with three homologous sequences (Figure 19).

Document19 08.png

Figure 19. Alignment of rat and human Neu5Ac-9-P phosphatase with homologous sequences. The following sequences are aligned: Rattus norvegicus (Rnor, gi-34859431), Homo sapiens (Hsap, gi-23308749), Xenopus laevis (Xlae, gi-46250196), Danio rerio (Drer, gi-63101958), and Drosophila melanogaster (Dmel, gi-28381565). Only the first 280 residues of the latter sequence are shown. Completely conserved residues are shown in boldface type. Asterisks indicate the extremely conserved residues in phosphatases of the HAD family [8].

The MSA by Maliekal et al. shows that the Neu5Ac-9-Pase orthologs share the three motifs found in phosphatases of the HAD family: a first motif comprising two extremely conserved aspartates (D), a second motif comprising a conserved serine (S) or threonine (T), and a third motif comprising a conserved lysine (K) and two conserved aspartates (D) [8]. The first aspartate in the first motif forms a phosphoaspartate during the catalytic cycle [9, 10]. These findings suggested that the HDHD4 protein is a phosphatase. Our MSA (Figure 16) contains several conserved motifs that closely resemble those reported by Maliekal et al., which likewise suggests that the Neu5Ac-9-P phosphatase protein is a phosphatase.
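Short motifs of this kind can be located in a sequence with a regular expression. The sketch below searches for a deliberately simplified stand-in for the first motif: two aspartates separated by any single residue. That pattern, the function name, and the test sequence are assumptions chosen for illustration; the text above does not specify the spacing between the two aspartates, and a real scan would use the published motif definitions.

```python
import re

# Illustrative pattern only: two aspartates (D) separated by one arbitrary residue.
MOTIF_1 = re.compile(r"D.D")


def find_motif(sequence, pattern=MOTIF_1):
    """Return 1-based start positions of every match, including overlapping ones."""
    # A zero-width lookahead lets finditer report overlapping occurrences.
    lookahead = re.compile(f"(?={pattern.pattern})")
    return [m.start() + 1 for m in lookahead.finditer(sequence)]


# Invented sequence, not a real protein.
print(find_motif("MTKAIIFDLDGTLV"))  # [8]
```

A pattern this short will also match by chance in unrelated proteins, so hits are only meaningful when they line up with the conserved columns of the alignment.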

Phosphatases of the HAD family depend on the presence of Mg²⁺. Ca²⁺ inhibits their activity by replacing Mg²⁺ and preventing the nucleophilic attack by the aspartate that covalently binds the phosphate group [8]. Phosphatases that form a phosphoenzyme during the catalytic cycle are inhibited by vanadate [11]. Vanadate (VO₄³⁻), formed when V₂O₅ is dissolved in water at alkaline pH, appears to inhibit enzymes that process phosphate.

The presence of a protein sharing at least about 50% sequence identity with rat or human Neu5Ac-9-P phosphatase in the genomes of mammals, chicken, Xenopus, and fish indicates that sialic acid synthesis proceeds via the 9-phosphate intermediate in these species [8]. This is consistent with the finding that vertebrate genomes contain a gene encoding the bifunctional enzyme UDP-N-acetylglucosamine 2-epimerase/N-acetylmannosamine kinase [8].

In bacteria, the E. coli genome encodes five membrane-bound and 23 soluble HAD-like hydrolases, representing about 40% of the E. coli proteins with known or predicted small molecule phosphatase activity [12]. The metabolites hydrolyzed by HADs are intermediates of various metabolic pathways and reactions (glycolysis, pentose phosphate pathway, gluconeogenesis, and intermediary sugar and nucleotide metabolism).


E. coli HADs hydrolyze a wide range of phosphorylated metabolites, including carbohydrates, nucleotides, organic acids, and coenzymes. Studies have shown that the most common substrates are intermediates of metabolic pathways such as glycolysis and the pentose phosphate pathway (Figure 20). These substrates were fructose-1-phosphate, glucose-6-phosphate, mannose-6-phosphate, 2-deoxyglucose-6-phosphate, fructose-6-phosphate, ribose-5-phosphate, and erythrose-4-phosphate [13].

Document20 01.png

Document20 02.png

Figure 20. Schematic diagrams of the glycolysis and pentose phosphate metabolic pathways. The green arrows show the substrates that are hydrolyzed by HADs. (A) Glycolysis pathway, with substrates that are hydrolyzed by HADs: glucose-6-phosphate, fructose-6-phosphate and dihydroxyacetone phosphate. (B) Pentose phosphate pathway, with substrates that are hydrolyzed by HADs: glucose-6-phosphate, fructose-6-phosphate, dihydroxyacetone phosphate, glyceraldehyde-3-phosphate, gluconate-6-phosphate and erythrose-4-phosphate.