Materials and Methods - NANP
Query Sequence
The sequence of N-acetylneuraminic acid phosphatase from the house mouse (Mus musculus) was obtained from the GenBank protein database under accession number 2GFH_A.
Sequence Homology
The query sequence was searched with BlastP to find related proteins, judged by amino acid sequence similarity. The search was run against a
fixed database stored on a DVD rather than against the live BlastP database on the World Wide Web.
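A minimal sketch of how hits from such a search could be filtered, assuming the results were saved in the standard 12-column tab-separated BLAST format. The output format actually used is not stated here, and the function name and the E-value cut-off of 1e-5 are illustrative assumptions only.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    subject_id: str
    percent_identity: float
    evalue: float
    bit_score: float


def parse_tabular_hits(text, max_evalue=1e-5):
    """Parse 12-column tabular BLAST output and keep hits at or below max_evalue.

    Assumed column order: query id, subject id, % identity, alignment length,
    mismatches, gap opens, query start, query end, subject start, subject end,
    E-value, bit score.
    """
    hits = []
    for line in text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete hit line
        hit = Hit(fields[1], float(fields[2]), float(fields[10]), float(fields[11]))
        if hit.evalue <= max_evalue:
            hits.append(hit)
    # Most significant hits (lowest E-value) first.
    return sorted(hits, key=lambda h: h.evalue)
```

The subject identifiers of the retained hits could then be used to collect the sequences passed on to the alignment step.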
Multiple Sequence Alignment
All the related proteins from the BlastP search were aligned using ClustalX. As with the search, the ClustalX programme was taken from the
DVD rather than from the website.
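As an illustration of what the alignment file contains, the sketch below reads a Clustal-format (.aln) alignment into memory and computes the percent identity between two aligned sequences. It assumes the usual Clustal layout (a header line starting with "CLUSTAL", then blocks of "name sequence" lines); the function names are hypothetical and this is not part of ClustalX itself.

```python
def read_clustal(text):
    """Return {sequence name: aligned sequence} from Clustal-format text."""
    sequences = {}
    for line in text.splitlines():
        # Skip blank lines, the header, and conservation lines
        # (conservation lines start with whitespace).
        if not line.strip() or line.startswith("CLUSTAL") or line[0].isspace():
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        # Sequences are split across blocks, so append each fragment.
        sequences[parts[0]] = sequences.get(parts[0], "") + parts[1]
    return sequences


def percent_identity(seq_a, seq_b):
    """Percent identical residues over columns where neither sequence has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)
```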
Phylogenetic Tree
The Phylip package was used to build a phylogenetic tree showing the relationships between the proteins from the individual organisms. The
programmes used were again taken from the DVD. Protdist (within Phylip) was used to calculate the distance matrix, with the Dayhoff PAM
method selected as the calculation method. Neighbor (also within Phylip) was then used to build the phylogenetic tree from this distance
matrix. The "Input order of species" option was set to "Random" when generating the tree, and a random odd number was supplied as the seed.
The TreeView programme was used to view the final tree.
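The tree-building step can be sketched as a plain neighbor-joining routine that takes a distance matrix and returns an unrooted tree in Newick format. This is a simplified illustration, not the Phylip implementation: the function name is hypothetical, and the optional `seed` argument only imitates the idea of randomising the input order of species.

```python
import random
from itertools import combinations


def neighbor_joining(names, matrix, seed=None):
    """Build an unrooted Newick tree from a symmetric distance matrix.

    names  -- unique taxon labels
    matrix -- matrix[i][j] is the distance between names[i] and names[j]
    seed   -- if given, the input order of taxa is shuffled first
    """
    if len(names) < 3:
        raise ValueError("need at least three taxa")
    # dist[x][y] is the current distance between nodes x and y.
    dist = {a: {b: float(matrix[i][j]) for j, b in enumerate(names) if b != a}
            for i, a in enumerate(names)}
    nodes = list(names)
    if seed is not None:
        random.Random(seed).shuffle(nodes)

    while len(nodes) > 3:
        n = len(nodes)
        total = {x: sum(dist[x][y] for y in nodes if y != x) for x in nodes}

        def q(pair):
            # Neighbor-joining criterion; the pair with the smallest value is joined.
            x, y = pair
            return (n - 2) * dist[x][y] - total[x] - total[y]

        a, b = min(combinations(nodes, 2), key=q)
        len_a = dist[a][b] / 2 + (total[a] - total[b]) / (2 * (n - 2))
        len_b = dist[a][b] - len_a
        # The new internal node is labelled by its own Newick subtree.
        new = f"({a}:{len_a:.4f},{b}:{len_b:.4f})"
        rest = [x for x in nodes if x not in (a, b)]
        dist[new] = {}
        for x in rest:
            d = (dist[a][x] + dist[b][x] - dist[a][b]) / 2
            dist[new][x] = d
            dist[x][new] = d
        nodes = rest + [new]

    # Join the last three nodes at a central point.
    x, y, z = nodes
    len_x = (dist[x][y] + dist[x][z] - dist[y][z]) / 2
    len_y = (dist[x][y] + dist[y][z] - dist[x][z]) / 2
    len_z = (dist[x][z] + dist[y][z] - dist[x][y]) / 2
    return f"({x}:{len_x:.4f},{y}:{len_y:.4f},{z}:{len_z:.4f});"
```

The returned string uses the same Newick format as a Phylip treefile, so it could be opened in a tree viewer. Negative branch lengths, which neighbor joining can produce on noisy data, are not handled in this sketch.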
Bootstrapping
Seqboot (within Phylip) was used to generate 100 bootstrap replicates of the sequence alignment.
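Bootstrap resampling of an alignment can be illustrated as follows: each replicate is built by drawing alignment columns at random, with replacement, until the replicate is as long as the original. This is a sketch of the idea only, with a hypothetical function name and seed; it is not the Seqboot code.

```python
import random


def bootstrap_alignments(alignment, replicates=100, seed=1):
    """Return a list of resampled alignments.

    alignment -- {sequence name: aligned sequence}, all of equal length
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    rng = random.Random(seed)
    result = []
    for _ in range(replicates):
        # Sample column indices with replacement; the same columns are
        # taken from every sequence so the rows stay aligned.
        columns = [rng.randrange(n_columns) for _ in range(n_columns)]
        result.append({name: "".join(seq[c] for c in columns)
                       for name, seq in alignment.items()})
    return result
```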
The Seqboot outfile (.aln) was then used to calculate the bootstrap distance matrices with Protdist. The settings were the same as for the
initial distance matrix calculation (Dayhoff PAM method), with one addition: the multiple data sets option was enabled and set to 100
replicates.
The resulting outfile (.dis) was run through Neighbor. The settings were again the same as those used for the earlier phylogenetic tree,
with the multiple data sets option enabled and set to 100 replicates, as for the bootstrap distance matrices.
The treefile (.ph) was run through Consense (within Phylip) to obtain the final bootstrapped phylogenetic tree. The bootstrap values on the
branches were used to assess the reliability of each branch.
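To show what a bootstrap branch value means, the sketch below counts how often each internal branch (split) of a reference tree recurs among the replicate trees. This is a simplified stand-in and not the Consense algorithm, which builds a consensus tree from the replicates. The function names are hypothetical, and the parser assumes Newick trees without internal node labels.

```python
import re
from collections import Counter


def splits(newick):
    """Return the non-trivial splits of an unrooted Newick tree.

    Each split is stored as the side that does not contain the
    alphabetically first taxon, so equivalent trees written with a
    different rooting give the same set.
    """
    # Drop the trailing semicolon and all branch lengths.
    text = re.sub(r":[^,()]+", "", newick.strip().rstrip(";"))
    stack, found = [], []
    for token in re.findall(r"[(),]|[^(),\s]+", text):
        if token == "(":
            stack.append(set())
        elif token == ")":
            done = stack.pop()
            found.append(frozenset(done))
            if stack:
                stack[-1] |= done  # leaves also belong to the enclosing group
        elif token != ",":
            stack[-1].add(token)  # leaf name
    taxa = found[-1]  # the outermost group closes last and holds every taxon
    reference_taxon = min(taxa)
    result = set()
    for group in found:
        if 1 < len(group) < len(taxa) - 1:
            result.add(group if reference_taxon not in group else taxa - group)
    return result


def bootstrap_support(reference_tree, replicate_trees):
    """Percentage of replicate trees containing each split of the reference tree."""
    reference_splits = splits(reference_tree)
    counts = Counter()
    for tree in replicate_trees:
        counts.update(reference_splits & splits(tree))
    return {s: 100.0 * counts[s] / len(replicate_trees) for s in reference_splits}
```

A branch found in most of the 100 replicate trees receives a value close to 100, while a branch found in few of them receives a low value and should be treated as unreliable.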
Replacing organism identifiers on phylogenetic tree
The Kenegdo server, an online programme, was used to convert the organism identifiers in the tree to species names.
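The same kind of substitution can be sketched locally for a Newick tree file. The function below is an illustration only and is not how the Kenegdo server works; the function name and the identifier-to-species mapping supplied by the caller are hypothetical.

```python
import re


def rename_leaves(newick, species_names):
    """Replace leaf labels in a Newick string using a {identifier: species} mapping.

    Labels that are not in the mapping are left unchanged. Spaces in the new
    names are written as underscores, since unquoted Newick labels cannot
    contain spaces.
    """
    def swap(match):
        prefix, label = match.group(1), match.group(2)
        return prefix + species_names.get(label, label).replace(" ", "_")

    # A leaf label follows "(" or ",", optionally after whitespace or a line
    # break; branch lengths follow ":" and are therefore never matched.
    return re.sub(r"([(,]\s*)([^(),:;\s]+)", swap, newick)
```

For example, a mapping of a sequence identifier to "Mus musculus" would make that leaf appear as Mus_musculus in the renamed tree.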
Protein Folding
First, a DALI search was run to compare the 3D structure of the query protein with those in the Protein Data Bank (PDB). It revealed that
Neu5Ac-9-P phosphatase is a haloacid dehalogenase-like hydrolase. The PDB was then searched for structures of biological macromolecules and
their relationships to sequence, function, and disease. CE, a database and tool for 3D protein structure comparison and alignment, was used
to compare the alignments between the query protein and its neighbours.
Sequence Similarity
InterProScan, a tool for annotating newly determined sequences and predicted proteins from genome sequencing projects, was then used to
analyse the query sequence. To analyse the protein further, Pfam, a large collection of multiple sequence alignments and hidden Markov
models, was used to find Pfam family matches for N-acetylneuraminic acid phosphatase.
The ProFunc server was used to help identify the likely biochemical function of the protein from its three-dimensional structure. It
uses a series of methods, including fold matching, residue conservation, surface cleft analysis, and functional 3D templates, to identify both
the protein’s likely active site and possible homologues in the PDB.
Surface Properties
RasMol, a molecular graphics program, was used to visualise proteins, nucleic acids and small molecules. PyMOL, a molecular graphics system
with an embedded Python interpreter designed for real-time visualisation and rapid generation of high-quality molecular graphics images and
animations, was also used.