2gnx Results: Difference between revisions

From MDWiki
==Evolutionary analysis==
[[Image:alignment.jpg|framed|'''Figure 1'''<BR>This image shows part of a complete alignment of the sequences used. Asterisks (*) indicate residues that are conserved across all sequences, and colons (:) indicate partial conservation across all sequences.|none]]<BR>
[[Image:tree1.jpg|framed|'''Figure 2'''<BR>The phylogenetic tree shows how closely related the sequences are. The longer the branches of the tree, the more evolutionarily divergent the sequences are. 2GNX A is the original protein being investigated and is a mouse protein. Branches marked with * indicate that this branch arrangement occurred more than 75% of the time.|none]]<BR>
==Structural analysis==


An analysis of the secondary structure of the protein from its amino acid sequence (Figure 3) shows how the secondary structure elements are arranged across the different regions of the protein.


[[Image:Mel's_picture_of_secondary_str..jpg|framed|'''Figure 3'''<BR>Secondary structure analysis of the 2GNX protein from Protein Data Bank.|none]]<BR>


'''Table 1 ''' Dali analysis of the 2GNX protein
  NR. STRID1 STRID2  Z  RMSD LALI LSEQ2 %IDE REVERS PERMUT NFRAG TOPO PROTEIN
   1: 3023-A 2gnx-A 42.9  0.0  280  280  100      0      0    1 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
  14: 3023-A 2c5i-T  4.5  2.8  75    93  11      0      0    5 S    PROTEIN TRANSPORT/COMPLEX t-snare affecting a late gol
  15: 3023-A 3nul    4.4  3.4  93  130    5      0      0    11 S    ACTIN-BINDING PROTEIN profilin i (arabidopsis thalian
16: 3023-A 2h7o-A  4.4  3.0  81  270    5      0      0    7 S    SIGNALING PROTEIN protein kinase ypka fragment (protei
17: 3023-A 1mc0-A  4.4 13.3  144  341    9      0      0    17 S    HYDROLASE 3',5'-cyclic nucleotide phosphodiesterase 2a
18: 3023-A 2ijp-A  4.3  4.2  117  217  13      0      0    12 S    SIGNALING PROTEIN 14-3-3 protein (cryptosporidium par
19: 3023-A 2h7v-C  4.3  4.2  76  269  13      0      0    5 S    SIGNALING PROTEIN migration-inducing protein 5 (ras-re
20: 3023-A 2cbi-A  4.3  4.3  123  584    7      0      0    13 S    HYDROLASE hyaluronidase fragment expression_system_ve
21: 3023-A 1hg5-A  4.3  3.2  85  263    9      0      0    6 S    ENDOCYTOSIS clathrin assembly protein short form frag
22: 3023-A 2dnx-A  4.2  4.9  80  130    6      0      0    6 S    TRANSPORT PROTEIN syntaxin-12 fragment (homo sapiens)
23: 3023-A 1a17    4.2  5.0  85  159    4      0      0    8 S    HYDROLASE serineTHREONINE PROTEIN PHOSPHATASE 5 fragme
24: 3023-A 2if4-A  4.1  2.5  82  258    7      0      0    7 S    SIGNALING PROTEIN atfkbp42 fragment (twd1 (twisted dwa
25: 3023-A 1owa-A  4.0  3.3  76  156  12      0      0    6 S    CYTOKINE spectrin alpha chain, erythrocyte fragment (e
26: 3023-A 1br0-A  4.0  3.5  79  120  13      0      0    6 S    MEMBRANE PROTEIN syntaxin 1-a fragment Mutant (rattus
27: 3023-A 1xdo-A  3.9  3.5  83  686  10      0      0    8 S    TRANSFERASE polyphosphate kinase (ppk, polyphosphoric
28: 3023-A 1vls    3.9  3.6  77  146  10      0      0    7 S    CHEMOTAXIS aspartate receptor (tar) biological_unit (
29: 3023-A 1sjj-A  3.9  7.9  114  863    4      0      0    14 S    CONTRACTILE PROTEIN actinin (gallus gallus) chicken
30: 3023-A 2iak-A  3.8  2.4  68  197    6      0      0    5 S    CELL ADHESION bullous pemphigoid antigen 1, isoform 5
31: 3023-A 2dl1-A  3.8  4.1  86  116    3      0      0    7 S    PROTEIN TRANSPORT spartin fragment (trans-activated by
32: 3023-A 2h28-A  3.7  2.8  75  106    8      0      0    10 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
33: 3023-A 2cwo-A  3.7  3.3  83  165    5      0      0    10 S    RNA BINDING PROTEIN RNA silencing suppressor (p21) (b
34: 3023-A 2hj9-C  3.6  3.4  80  210    5      0      0    10 S    SIGNALING PROTEIN autoinducer 2-binding periplasmic pr
35: 3023-A 2gsc-A  3.6  3.7  74  113  12      0      0    5 S    UNKNOWN FUNCTION conserved hypothetical protein (conse
36: 3023-A 2cpt-A  3.6  4.0  78  117  12      0      0    5 S    PROTEIN TRANSPORT vacuolar sorting protein 4b fragment
37: 3023-A 1v9d-A  3.6  4.4  83  308    5      0      0    6 S    PROTEIN BINDING diaphanous protein homolog 1 fragment
38: 3023-A 2i0m-A  3.5  4.5  113  207  10      0      0    13 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION phosphate transp
39: 3023-A 2hje-A  3.5  3.2  84  210    7      0      0    11 S    SIGNALING PROTEIN autoinducer 2 sensor kinasePHOSPHATA
40: 3023-A 2dmw-A  3.5  3.3  85  131    7      0      0    11 S    MEMBRANE PROTEIN synaptobrevin-like 1 variant fragment
41: 3023-A 2bkp-A  3.5  3.3  84  193    7      0      0    8 S    HYPOTHETICAL PROTEIN hypothetical protein ph0236 (pyr
42: 3023-A 1zu2-A  3.5  5.2  73  158    5      0      0    6 S    TRANSPORT PROTEIN mitochondrial import receptor subuni
43: 3023-A 1y79-1  3.5  3.1  93  680    4      0      0    10 S    HYDROLASE peptidyl-dipeptidase dcp (dipeptidyl carboxy
44: 3023-A 2fup-A  3.4  2.9  80  126    9      0      0    7 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
45: 3023-A 2iub-A  3.3  4.7  79  331  11      0      0    5 S    MEMBRANE PROTEIN divalent cation transport-related pro
46: 3023-A 2e9x-D  3.3  2.5  66  197    3      0      0    6 S    REPLICATION DNA replication complex gins protein psf1
47: 3023-A 2bkn-A  3.3  3.3  80  190    6      0      0    5 S    MEMBRANE PROTEIN hypothetical protein ph0236 (pyrococ
48: 3023-A 2avx-A  3.3  3.7  94  171    5      0      0    10 S    TRANSCRIPTION regulatory protein sdia Mutant (escheri
49: 3023-A 1u89-A  3.3  3.1  77  139    4      0      0    6 S    STRUCTURAL PROTEIN talin 1 fragment (talin) (mus musc
50: 3023-A 1qqt-A  3.3  3.4  84  546    7      0      0    6 S    LIGASE methionyl-trna synthetase fragment (escherichi
51: 3023-A 1qqe-A  3.3  7.4  94  281    6      0      0    10 S    PROTEIN TRANSPORT vesicular transport protein sec17 Mu
52: 3023-A 2j3t-C  3.2  5.2  83  141    7      0      0    8 S    PROTEIN TRANSPORT trafficking protein particle complex
53: 3023-A 1uur-A  3.2  3.2  84  460    7      0      0    7 S    SIGNAL TRANSDUCTION stat protein fragment (dictyostel
54: 3023-A 1tkn-A  3.2  3.8  75  110  13      0      0    5 S    MEMBRANE PROTEIN amyloid beta a4 protein (homo sapien
55: 3023-A 1qja-A  3.2  3.0  75  217    5      0      0    8 S    COMPLEX (SIGNAL TRANSDUCTION/PEPTIDE) 14-3-3 protein z
56: 3023-A 1ek8-A  3.2  5.6  78  185    9      0      0    5 S    TRANSLATION ribosome recycling factor (ribosome relea
57: 3023-A 2uv0-E  3.1  3.6  94  159    9      0      0    12 S    TRANSCRIPTION transcriptional activator protein lasr (
58: 3023-A 2nty-A  3.1  2.6  93  332    4      0      0    9 S    SIGNALING PROTEIN emb|cab41934.1 fragment (prone8) rac
59: 3023-A 2hi7-B  3.1  2.6  77  134    5      0      0    8 S    OXIDOREDUCTASE thiol:disulfide interchange protein dsb
60: 3023-A 2fez-A  3.1  3.6  104  373  10      0      0    12 S    TRANSCRIPTION probable regulatory protein embr (mycob
61: 3023-A 2a73-B  3.1  3.2  91  976  11      0      0    13 S    IMMUNE SYSTEM complement c3 fragment complement c3 fra
62: 3023-A 1uw4-B  3.1  3.3  86  247  12      0      0    12 S    NONSENSE MEDIATED MRNA DECAY PROTEIN regulator of nons
63: 3023-A 1hx1-B  3.1  3.2  76  112  13      0      0    6 S    CHAPERONE/CHAPERONE INHIBITOR heat shock cognate 71 k
64: 3023-A 1fpo-A  3.1  3.4  72  171    8      0      0    4 S    CHAPERONE chaperone protein hscb (hsc20) Mutant (esc
65: 3023-A 2o8p-A  3.0  5.0  104  215  13      0      0    11 S    SIGNALING PROTEIN 14-3-3 domain containing protein (c
66: 3023-A 2d9d-A  3.0  3.8  80    89  13      0      0    8 S    CHAPERONE bag family molecular chaperone regulator 5 f
67: 3023-A 2ak6-A  3.0  5.7  93  153    8      0      0    11 S   
68: 3023-A 1ya0-A  3.0  4.3  81  458    6      0      0    8 S    SIGNALING PROTEIN smg-7 transcript variant 2 fragment
69: 3023-A 1x8z-A  3.0  3.2  75  151    9      0      0    3 S    PROTEIN BINDING invertasePECTIN METHYLESTERASE INHIBIT
70: 3023-A 1qoy-A  3.0  7.5  121  303  14      0      0    9 S    TOXIN hemolysin e (cytolysin a, silent hemolysin a, hl
71: 3023-A 1qgr-A  3.0  2.7  79  871    5      0      0    7 S    TRANSPORT RECEPTOR importin beta subunit (karyopherin
72: 3023-A 1ocr-C  3.0  3.8  94  261    7      0      0    9 S    OXIDOREDUCTASE cytochrome c oxidase (ferrocytochrome c
73: 3023-A 1fce    3.0  2.5  60  629  10      0      0    5 S    CELLULASE DEGRADATION cellulase celf fragment (clostr
74: 3023-A 2nwb-A  2.9  3.8  88  379    6      0      0    10 S    OXIDOREDUCTASE conserved domain protein (putative 2,3-
75: 3023-A 2i6h-A  2.9  3.8  92  176  10      0      0    12 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
76: 3023-A 1w5d-A  2.9  3.7  91  458    7      0      0    10 S    HYDROLASE penicillin-binding protein (bacillus subtil
77: 3023-A 1sj7-A  2.9  5.5  77  167  10      0      0    5 S    STRUCTURAL PROTEIN talin 1 fragment (mus musculus) mo
78: 3023-A 1ojg-A  2.9  3.5  83  135    8      0      0    8 S    TRANSFERASE sensor protein dcus (dcus) fragment (esch
79: 3023-A 1i1i-P  2.9  3.6  100  665    7      0      0    10 S    HYDROLASE neurolysin (rattus norvegicus) rat express
80: 3023-A 1dov-A  2.9  4.9  76  181    9      0      0    6 S    CELL ADHESION alpha-catenin fragment (mus musculus) m
81: 3023-A 2grg-A  2.8  3.2  68    98  10      0      0    8 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
82: 3023-A 2bvl-A  2.8  3.6  90  543  14      0      0    9 S    TOXIN toxin b fragment Mutant (clostridium difficile)
83: 3023-A 1z0p-A  2.8  1.6  51    73    4      0      0    2 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
84: 3023-A 1u6g-C  2.8  5.3  79  1146    6      0      0    9 S    LIGASE cullin homolog 1 (cul-1) ring-box protein 1 (rb
85: 3023-A 1oj5-A  2.8  2.8  68  105    3      0      0    7 S    TRANSCRIPTIONAL COACTIVATOR steroid receptor coactivat
86: 3023-A 1ecr-A  2.8  2.8  65  305    6      0      0    7 S    COMPLEX (DNA-BINDING PROTEIN/DNA) replication terminat
87: 3023-A 1e25-A  2.8  4.6  92  278    7      0      0    9 S    HYDROLASE extended-spectrum beta-lactamase per-1 Muta
88: 3023-A 1di1-A  2.8  4.4  72  290    4      0      0    5 S    LYASE aristolochene synthase (sesquiterpene cyclase,
89: 3023-A 2p0n-A  2.7  3.1  80  157  10      0      0    7 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
90: 3023-A 2iiu-A  2.7  3.8  77  203    8      0      0    7 S    STRUCTURAL GENOMICS/UNKNOWN FUNCTION hypothetical prot
91: 3023-A 2gom-A  2.7  2.4  54    61    6      0      0    5 S    CELL ADHESION/TOXIN fibrinogen-binding protein fragmen
92: 3023-A 1z1w-A  2.7  4.3  98  780    5      0      0    12 S    HYDROLASE tricorn protease interacting factor f3 (the
93: 3023-A 1h3n-A  2.7  3.6  76  813  14      0      0    6 S    AMINOACYL-TRNA SYNTHETASE leucyl-trna synthetase (the
94: 3023-A 1bk5-A  2.7  3.0  71  422    7      0      0    8 S    PROTEIN TRANSPORT karyopherin alpha fragment (importin
95: 3023-A 1aqt    2.7  1.7  47  135    0      0      0    2 S    HYDROLASE atp synthase fragment Mutant (escherichia c
96: 3023-A 2nw9-A  2.6  6.4  102  265    9      0      0    11 S    OXIDOREDUCTASE tryptophan 2,3-dioxygenase (xanthomona
97: 3023-A 2iml-A  2.6  2.5  59  188    8      0      0    4 S    FLAVOPROTEIN hypothetical protein (archaeoglobus fulg
98: 3023-A 2e9x-A  2.6  1.3  53  144    6      0      0    2 S    REPLICATION DNA replication complex gins protein psf1
99: 3023-A 2d4x-A  2.6  4.3  94  214  12      0      0    10 S    STRUCTURAL PROTEIN flagellar hook-associated protein 3


A Dali analysis (Table 1) of the 2GNX protein was highly inconclusive and there were no significant structural matches to the hypothetical protein.
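
One way to screen the rows of a Dali summary such as Table 1 is to parse them and filter on the Z-score. The sketch below is only illustrative. It assumes the whitespace-separated column order shown in the table header (NR, STRID1, STRID2, Z, RMSD, LALI, LSEQ2, %IDE, ...). The names <code>DaliHit</code>, <code>parse_dali_row</code> and <code>filter_hits</code>, and whatever cutoff is passed as <code>z_min</code>, are hypothetical and are not part of Dali.

```python
from typing import List, NamedTuple


class DaliHit(NamedTuple):
    nr: int
    query: str    # STRID1 column
    target: str   # STRID2 column
    z: float
    rmsd: float
    lali: int
    lseq2: int
    ide: int      # %IDE column
    protein: str  # free-text description, may be empty


def parse_dali_row(line: str) -> DaliHit:
    """Parse one data row of a Dali summary table (assumed column order)."""
    # Twelve fixed columns, then the rest of the line is the description.
    parts = line.split(None, 12)
    if len(parts) < 12:
        raise ValueError(f"unexpected Dali row: {line!r}")
    protein = parts[12].strip() if len(parts) > 12 else ""
    return DaliHit(
        nr=int(parts[0].rstrip(":")),
        query=parts[1],
        target=parts[2],
        z=float(parts[3]),
        rmsd=float(parts[4]),
        lali=int(parts[5]),
        lseq2=int(parts[6]),
        ide=int(parts[7]),
        protein=protein,
    )


def filter_hits(hits: List[DaliHit], z_min: float,
                self_id: str = "2gnx") -> List[DaliHit]:
    """Keep hits with Z >= z_min, dropping the query's match to itself."""
    return [h for h in hits
            if h.z >= z_min and not h.target.startswith(self_id)]
```

Rows in which two columns have run together cannot be converted to numbers and raise a <code>ValueError</code>, so damaged rows are reported rather than silently misread.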


'''Table 2 ''' Dali analysis of N-terminal domain
   NR. STRID1 STRID2  Z  RMSD LALI LSEQ2 %IDE REVERS PERMUT NFRAG TOPO PROTEIN
   1: 3256-A 2gnx-A 23.2  0.0  173  280  100      0      0    1 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
   14: 3256-A 1owa-A  6.2  3.3  76  156  12      0      0    6 S    CYTOKINE spectrin alpha chain, erythrocyte fragment (e
   15: 3256-A 2oew-A  6.1  2.8  119  358    8      0      0    12 S    PROTEIN TRANSPORT programmed cell death 6-interacting  
   16: 3256-A 1xdo-A 6.3.5  83   686  10     0      0    8 S    TRANSFERASE polyphosphate kinase (ppk, polyphosphoric
   17: 3256-A 1vls    6.1 3.6   77   146  10     0      0     7 S    CHEMOTAXIS aspartate receptor (tar) biological_unit (
   18: 3256-A 1br0-A  6.1 3.5   79   120   13      0      0    6 S    MEMBRANE PROTEIN syntaxin 1-a fragment Mutant (rattus
   19: 3256-A 2oev-A 6.0 29.3  112   697   6     0      0    10 S    PROTEIN TRANSPORT programmed cell death 6-interacting
   20: 3256-A 2cwy-A  6.0 2.4   82    92  20     0      0     7 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
   21: 3256-A 1sjj-A  6.0 4.6   88   863   5      0      0     9 S    CONTRACTILE PROTEIN actinin (gallus gallus) chicken
   22: 3256-A 2iak-A  5.9 2.4   68   197   6     0      0     5 S    CELL ADHESION bullous pemphigoid antigen 1, isoform 5
   23: 3256-A 2dl1-A  5.9 4.1   86   116   3     0      0     7 S    PROTEIN TRANSPORT spartin fragment (trans-activated by
   24: 3256-A 2cwo-A  5.8 3.3   83   165    5     0      0    10 S    RNA BINDING PROTEIN RNA silencing suppressor (p21) (b
   25: 3256-A 2cbi-A  5.7 4.3 123   584   7      0      0    13 S    HYDROLASE hyaluronidase fragment expression_system_ve
   26: 3256-A 2gsc-A  5.6 3.7   74   113  12     0      0     5 S    UNKNOWN FUNCTION conserved hypothetical protein (conse
   27: 3256-A 2cpt-A 5.6 4.0   78   117  12     0      0    5 S    PROTEIN TRANSPORT vacuolar sorting protein 4b fragment
   28: 3256-A 1v9d-A 5.6 4.4   83   308   5      0      0    6 S    PROTEIN BINDING diaphanous protein homolog 1 fragment
   29: 3256-A 1zu2-A  5.5 5.2   73   158   5      0      0    6 S    TRANSPORT PROTEIN mitochondrial import receptor subuni
   30: 3256-A 2ijp-A 5.4 4.1  113   217   12     0      0    11 S    SIGNALING PROTEIN 14-3-3 protein (cryptosporidium par
   31: 3256-A 2fup-A  5.4  2.9   80   126    9      0      0    7 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
   32: 3256-A 1y79-1 5.4  3.0  91  680    4      0      0    9 S    HYDROLASE peptidyl-dipeptidase dcp (dipeptidyl carboxy
   33: 3256-A 2iub-A  5.3  4.7   79  331  11      0      0    5 S    MEMBRANE PROTEIN divalent cation transport-related pro
   34: 3256-A 2e9x-D  5.3  2.5  66  197    3      0      0    6 S    REPLICATION DNA replication complex gins protein psf1
   35: 3256-A 1qqe-A  5.3  3.1   77   281    8      0      0    7 S    PROTEIN TRANSPORT vesicular transport protein sec17 Mu
   36: 3256-A 1u89-A  5.2  3.1   77  139    4      0      0    6 S    STRUCTURAL PROTEIN talin 1 fragment (talin) (mus musc
   37: 3256-A 1tkn-A  5.1 3.8   75   110   13      0      0    5 S    MEMBRANE PROTEIN amyloid beta a4 protein (homo sapien
   38: 3256-A 1ek8-A  5.1 5.6  78  185    9      0      0    5 S    TRANSLATION ribosome recycling factor (ribosome relea
   39: 3256-A 2a73-B  5.0  3.2  91  976  11      0      0    13 S    IMMUNE SYSTEM complement c3 fragment complement c3 fra
   40: 3256-A 1uur-A  5.0  3.1  82  460    7     0      0    6 S    SIGNAL TRANSDUCTION stat protein fragment (dictyostel
   41: 3256-A 1qja-A  5.0  3.0  75  217    5      0      0    8 S    COMPLEX (SIGNAL TRANSDUCTION/PEPTIDE) 14-3-3 protein z
   42: 3256-A 1hx1-B  5.0 3.2  76  112  13      0      0     6 S    CHAPERONE/CHAPERONE INHIBITOR heat shock cognate 71 k
   43: 3256-A 1fpo-A  5.0 3.4  72  171    8      0     0     4 S    CHAPERONE chaperone protein hscb (hsc20) Mutant (esc
   44: 3256-A 2bkp-A  4.9  2.8  76  193    7      0      0    6 S    HYPOTHETICAL PROTEIN hypothetical protein ph0236 (pyr
   45: 3256-A 1x8z-A  4.9  3.2   75  151    9      0     0     3 S    PROTEIN BINDING invertasePECTIN METHYLESTERASE INHIBIT
   46: 3256-A 2bkn-A  4.3.0   74  190    7      0     0     4 S    MEMBRANE PROTEIN hypothetical protein ph0236 (pyrococ
   47: 3256-A 2ak6-A  4.8  3.7  71  153    7      0      0     7 S
   48: 3256-A 1ya0-A  4.8  4.3  81  458    6      0     0     8 S    SIGNALING PROTEIN smg-7 transcript variant 2 fragment
   49: 3256-A 1uw4-B  4.8  3.2  84  247  11      0     0   11 S    NONSENSE MEDIATED MRNA DECAY PROTEIN regulator of nons
   50: 3256-A 1qgr-A  4.8  2.7  79  871    5      0      0     7 S    TRANSPORT RECEPTOR importin beta subunit (karyopherin
   51: 3256-A 1fce    4.2.5  60  629  10      0      0     5 S    CELLULASE DEGRADATION cellulase celf fragment (clostr
   52: 3256-A 1i1i-P  4.7  3.2  95  665    7      0     0     9 S    HYDROLASE neurolysin (rattus norvegicus) rat express
   53: 3256-A 1dov-A  4.7  4.9  76  181    9      0     0     6 S    CELL ADHESION alpha-catenin fragment (mus musculus) m
   54: 3256-A 1z0p-A  4.1.6  51    73    4      0     0     2 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
   55: 3256-A 1sj7-A  4.6  5.5  77  167  10      0      0    5 S    STRUCTURAL PROTEIN talin 1 fragment (mus musculus) mo
   56: 3256-A 1ecr-A 4.6  2.8  65  305    6      0      0    7 S    COMPLEX (DNA-BINDING PROTEIN/DNA) replication terminat
   57: 3256-A 2bvl-A  4.5  3.6  90  543  14      0      0    9 S    TOXIN toxin b fragment Mutant (clostridium difficile)
   58: 3256-A 1u6g-C  4.5  4.4  75  1146    7      0      0    9 S    LIGASE cullin homolog 1 (cul-1) ring-box protein 1 (rb
   59: 3256-A 1di1-A  4.5  4.4  72  290    4      0      0    5 S    LYASE aristolochene synthase (sesquiterpene cyclase,
   60: 3256-A 1aqt    4.1.7  47  135    0      0      0    2 S    HYDROLASE atp synthase fragment Mutant (escherichia c
   61: 3256-A 2gom-A  4.4  2.4  54    61    6      0      0    5 S    CELL ADHESION/TOXIN fibrinogen-binding protein fragmen
   62: 3256-A 2fez-A  4.4  3.2  88  373  11      0      0    9 S    TRANSCRIPTION probable regulatory protein embr (mycob
   63: 3256-A 2d9d-A  4.3.7  75    89  13      0      0    7 S    CHAPERONE bag family molecular chaperone regulator 5 f
   64: 3256-A 1z1w-A  4.4 2.9  70  780    6      0      0    8 S    HYDROLASE tricorn protease interacting factor f3 (the
   65: 3256-A 1qoy-A  4.4  5.93   303  15      0     0    7 S    TOXIN hemolysin e (cytolysin a, silent hemolysin a, hl
   66: 3256-A 1ocr-C  4.4  2.9  73  261    7      0     0    4 S    OXIDOREDUCTASE cytochrome c oxidase (ferrocytochrome c
   67: 3256-A 1h3n-A  4.4  3.6  76  813  14      0      0    6 S    AMINOACYL-TRNA SYNTHETASE leucyl-trna synthetase (the
   68: 3256-A 1bk5-A  4.4  3.0   71  422    7      0      0    8 S    PROTEIN TRANSPORT karyopherin alpha fragment (importin
   69: 3256-A 2c0s-A  4.3  2.0  51    64    6      0      0    2 S    TRANSFERASE conserved domain protein fragment (bacill
   70: 3256-A 2aaw-A  4.3  2.9  71  205  11      0      0    6 S    TRANSFERASE glutathione s-transferase (plasmodium fal
   71: 3256-A 1wp1-A  4.3  1.8  51  456  16      0      0    2 S    MEMBRANE PROTEIN outer membrane protein oprm (drug-dis
   72: 3256-A 1oxj-A  4.3  3.4  70  170  11      0      0    8 S    RNA BINDING PROTEIN RNA-binding protein smaug fragment
   73: 3256-A 1k8t-A  4.3  5.3  77  498  12      0      0    6 S    TOXIN,LYASE calmodulin-sensitive adenylate cyclase (
   74: 3256-A 1h2s-B  4.3  3.7  51    60    6      0      0    1 S    MENBRANE PROTEIN COMPLEX sensory rhodopsin ii fragment
   75: 3256-A 1fiy    4.3  2.9  63  873    8      0      0    6 S    COMPLEX (LYASE/INHIBITOR) phosphoenolpyruvate carboxyl
   76: 3256-A 1a32    4.3  2.1  54    85  11      0      0    3 S    RIBOSOMAL PROTEIN ribosomal protein s15 (bacillus ste
   77: 3256-A 2iml-A  4.2  2.5  59  188    8      0      0    4 S    FLAVOPROTEIN hypothetical protein (archaeoglobus fulg
   78: 3256-A 2e9x-A  4.2  1.3  53  144    6      0      0    2 S    REPLICATION DNA replication complex gins protein psf1
   79: 3256-A 2avk-A  4.2  3.6  71  133    3      0      0    4 S    OXYGEN STORAGE/TRANSPORT hemerythrin-like domain prote
   80: 3256-A 1sz9-A  4.2  2.6  74  143    9      0      0    7 S    TRANSCRIPTION pcf11 protein fragment (saccharomyces c
   81: 3256-A 1fp3-A  4.2  6.0  107  402    5      0      0    14 S    ISOMERASE n-acyl-d-glucosamine 2-epimerase (sus scro
   82: 3256-A 2iiu-A  4.1  3.8  75  203    8      0      0    6 S    STRUCTURAL GENOMICS/UNKNOWN FUNCTION hypothetical prot
   83: 3256-A 2i6h-A  4.1  3.8  92  176  10      0      0    12 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
   84: 3256-A 2fuq-A  4.1  2.4  61  746  15      0      0    6 S    SUGAR BINDING PROTEIN heparinase ii protein (pedobact
   85: 3256-A 1y64-B  4.1  3.6  81  411    9      0      0    7 S    STRUCTURAL PROTEIN actin, alpha skeletal muscle (alpha
   86: 3256-A 1rkc-A  4.1  5.2  76  258  11      0      0    7 S    CELL ADHESION, STRUCTURAL PROTEIN vinculin fragment ta
   87: 3256-A 1qqt-A  4.1  3.2  72  546    6      0      0    5 S    LIGASE methionyl-trna synthetase fragment (escherichi
   88: 3256-A 1nkd    4.1  2.0  51    59    2      0      0    2 S    TRANSCRIPTION REGULATION rop (cole1 repressor of prime
   89: 3256-A 1ee4-A  4.1  3.0  71  423    7      0      0    8 S    TRANSPORT PROTEIN karyopherin alpha fragment (serine-r
   90: 3256-A 1a0f-A  4.1  2.4  63  201    5      0      0    5 S    TRANSFERASE glutathione s-transferase fragment (gst, g
   91: 3256-A 2o8p-A  4.0  5.0  104  215  13      0      0    11 S    SIGNALING PROTEIN 14-3-3 domain containing protein (c
   92: 3256-A 2nw9-A  4.0  5.8  98  265    9      0      0    10 S    OXIDOREDUCTASE tryptophan 2,3-dioxygenase (xanthomona
   93: 3256-A 2jbw-A  4.0  4.2  69  348    9      0      0    7 S    HYDROLASE 2,6-dihydroxy-pseudo-oxynicotine hydrolase (
   94: 3256-A 2ibd-A  4.0  6.5  89  190    7      0      0    12 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION possible transcr
   95: 3256-A 1yw0-A  4.0  4.8  84  238    8      0      0    6 S    OXIDOREDUCTASE tryptophan 2,3-dioxygenase (xanthomona
   96: 3256-A 1yoz-A  4.0  4.3  66  113    8      0      0    6 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
   97: 3256-A 1v7v-A  4.0  4.3  73  779    7      0      0    10 S    TRANSFERASE chitobiose phosphorylase (vibrio proteoly
   98: 3256-A 1tf4-A  4.0  4.3  97  605    6      0      0    12 S    GLYCOSYL HYDROLASE t. fusca endoEXO-CELLULASE E4 CATAL
   99: 3256-A 1qvr-A  4.0  1.5  49  803  12      0      0    3 S    CHAPERONE clpb protein (thermus thermophilus) bacteri

A Dali analysis carried out separately with only the N-terminal domain (Table 2) of the protein also did not produce any significant structural matches.

[[Image:2CMR 2GNX.png|framed|'''Figure 4'''<BR>2CMR-2GNX alignment (2CMR displayed in cyan and 2GNX displayed in green).|none]]<BR>

A CE alignment between IMMUNOGLOBULIN COMPLEX d5 (2CMR) and 2GNX was performed (Figure 4). The N-terminus of 2GNX matched chain A of 2CMR, a transmembrane glycoprotein, with RMSD = 3.8 Å and Z-score = 3.7. The 3D superposition showed that both proteins have a five-helix structure and that the two structures fit each other well. However, the function of this five-helix structure is not clear.

'''Table 3:''' Dali analysis of C-terminal domain
   NR. STRID1 STRID2  Z   RMSD LALI LSEQ2 %IDE REVERS PERMUT NFRAG TOPO PROTEIN
   1: 3257-A 2gnx-A 24.3  0.0  118   280  100     0      0    1 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
   2: 3257-A 1jmr-A  7.6  3.0   94   246    9     0      0   12 S
   3: 3257-A 1j3w-A  7.5 2.9   91   134   13      0      0    7 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION giding protein-m
   4: 3257-A 1f5m-B 6.8  2.9  95   177   9     0      0    10 S    SIGNALING PROTEIN gaf (saccharomyces cerevisiae) yeas
   5: 3257-A 1h3q-A  6.6 4.2  92  140    4     0      0   11 S    TRANSPORT sedlin (sedl) (mus musculus) mouse S.B.Jan
   6: 3257-A 3nul    6.3 3.93   130   5      0      0   11 S    ACTIN-BINDING PROTEIN profilin i (arabidopsis thalian
   7: 3257-A 1mc0-A  5.8 4.1   99   341   8     0      0   11 S    HYDROLASE 3',5'-cyclic nucleotide phosphodiesterase 2a
   8: 3257-A 2h28-A  5.4 2.8   75   106   8     0      0   10 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION hypothetical pro
   9: 3257-A 2p7j-A  5.0 2.9  79   262   13     0      0    11 S    TRANSCRIPTION putative sensory boxGGDEF FAMILY PROTEIN
   10: 3257-A 2dmw-A  5.0 3.3  85  131   7      0      0    11 S    MEMBRANE PROTEIN synaptobrevin-like 1 variant fragment
   11: 3257-A 2avx-A  4.8 3.6   93   171    5     0      0   10 S    TRANSCRIPTION regulatory protein sdia Mutant (escheri
   12: 3257-A 2j3t-C 4.7 5.2   83   141    7     0      0    8 S    PROTEIN TRANSPORT trafficking protein particle complex
   13: 3257-A 2hj9-C 4.7 3.3   76   210   5      0      0    9 S    SIGNALING PROTEIN autoinducer 2-binding periplasmic pr
   14: 3257-A 2hje-A  4.6 3.0   75   210   5      0      0    9 S    SIGNALING PROTEIN autoinducer 2 sensor kinasePHOSPHATA
   15: 3257-A 2uv0-E 4.3.5   93   159    9     0      0    12 S    TRANSCRIPTION transcriptional activator protein lasr

However, a Dali analysis (Table 3) carried out with the C-terminal domain of the protein produced one significant structural match: the GAF signalling protein, i.e. the fourth result in Table 3 (1f5m-B, Z = 6.8).

[[Image:Image-Dotlet.PNG|framed|'''Figure 5'''<BR>Dotlet analysis for 2GNX.|none]]<BR>

The Dotlet analysis (Figure 5) showed that there were no internally homologous repeats in the C-terminus of 2GNX.

   USR1:A  185/392  QVAKNLFTH---LDDVSVLLQEIITEARNLSNAEICSVFLLDQ-----------------
   USR2:A  181/283  TASEXKALTAKANPDLFGKISSFIRKY------DAANVSLIFDNRGSESFQGHGYHHPHS
   USR1:A  225/432  ----------NELVAKVFDGGVVDDESYEIRIPADQGIAGHVATTG----------QILN
   USR2:A  235/#44  YREAPKGVDQYPAVVSLP----------SDRPVXHWPNVIXIXTDRASDLNSLEKVVHFY
   USR1:A  265/472  IPDAYAHPLFYRGVDDSTGFRTRNILCFPIKNENQEVIGVAELVNKINGPWFSKFDEDLA
   USR2:A  285/387  DDKV-------------------QSTYFLTRPEP-HFTIVVIFESK---------KSERD
   USR1:A  325/532  TAFSIYCGISIAHSLL
   USR2:A  316/418  SHFISFLNELSLALKN

Figure 6: CE-predicted structural alignment. USR1 = 1MC0 (PDB code), the regulatory segment of mouse 3',5'-cyclic nucleotide phosphodiesterase 2A, containing the GAF A and GAF B domains; USR2 = 2GNX.

The conserved residues of the ligand binding site in 1MC0 were not consistent with the aligned residues in 2GNX.

Zoraghi et al. (2003) described a fingerprint of the ligand binding site in 1MC0, given by the following pattern:

SX(13-18)FDX(18-22)IAX(21)[Y/N]X(2)VDX(2)TX(3)TX(19)[E/Q]

[[Image:Alignment.PNG|framed|'''Figure 7'''<BR>Fingerprint of the ligand binding site in 1MC0 (Zoraghi et al.). Identical residues are coloured red, and underlined residues are those missing from the PDB file.|none]]<BR>

The alignment above (Figure 7) indicated that the published pattern roughly fits the protein sequence of 2GNX. The 3D structure analysis revealed that some residues (in yellow) were probably not within the ligand binding pocket, whereas other residues (in red) remained potential ligand binding sites.

[[Image:Ligand Qsite.png|framed|'''Figure 7.1'''<BR>Ligand binding site predicted by Q-siteFinder.|none]]<BR>

The result from Q-siteFinder indicated that binding pockets are probably present in the predicted region. However, the volumes of the two pockets were small compared with a normal cGMP binding site (Zoraghi et al., 2003).

[[Image:Binding pocket.png|framed|'''Figure 8'''<BR> Potential ligand binding sites in 2GNX.|none]]<BR>

The figure above (Figure 8) shows the residues that are identical to the published pattern. The residues in red are the potential ligand binding residues; the residues in yellow match the published data but are unlikely to lie in the ligand binding pocket of 2GNX.

== Functional Analysis ==
STRING and CDART returned no results for the submitted protein data.

=== BlastP Results ===
BlastP returned results; however, they were limited to hypothetical proteins and gave no additional information.

'''Table 4:''' BlastP Results
{| border="1"
|-
| Database
| Accession
| Description
| Score (Bits)
| E Value
|-
| ref
| XP_001163972.1
| PREDICTED: similar to FLJ32549 protein [Pan
| 850
| 0.0
|-
| ref
| XP_001116860.1
| PREDICTED: hypothetical protein isoform 1 [M
| 848
| 0.0
|-
| ref
| NP_689653.3
| hypothetical protein LOC144577 [Homo sapiens...
| 847
| 0.0
|-
| gb
| AAH36246.1
| FLJ32549 protein [Homo sapiens]
| 846
| 0.0
|-
| ref
| XP_001116875.1
| PREDICTED: hypothetical protein isoform 3 [M
| 843
| 0.0
|-
| ref
| XP_531657.2
| PREDICTED: hypothetical protein XP_531657 [Cani
| 827
| 0.0
|-
| ref
| XP_615557.3
| PREDICTED: hypothetical protein [Bos taurus]
| 823
| 0.0
|-
| gb
| EDL24424.1
| cDNA sequence BC048403, isoform CRA_a [Mus muscul
| 803
| 0.0
|-
| ref
| NP_766610.2
| hypothetical protein LOC270802 [Mus musculus...
| 803
| 0.0
|-
| ref
| XP_576234.2
| PREDICTED: hypothetical protein [Rattus norv...
| 802
| 0.0
|-
| ref
| XP_001364942.1
| PREDICTED: hypothetical protein [Monodelphis
| 797
| 0.0
|-
| ref
| XP_416063.1
| PREDICTED: hypothetical protein [Gallus gallus]
| 796
| 0.0
|-
| dbj
| BAC39804.1
| unnamed protein product [Mus musculus]
| 760
| 0.0
|-
| ref
| XP_001116868.1
| PREDICTED: hypothetical protein isoform 2 [M
| 743
| 0.0
|-
| ref
| NP_001085035.1
| hypothetical protein LOC432102 [Xenopus l...
| 697
| 0.0
|-
| ref
| NP_001025261.1
| hypothetical protein LOC555715 [Danio rer...
| 665
| 0.0
|-
| ref
| NP_001076454.1
| hypothetical protein LOC100005809 [Danio ...
| 661
| 0.0
|-
| ref
| XP_001331282.1
| PREDICTED: hypothetical protein [Danio rerio
| 598
| 2e-169
|-
| emb
| CAG12393.1
| unnamed protein product [Tetraodon nigroviridis]
| 593
| 8e-168
|-
| pdb
| 2GNX
| A  Chain A, X-Ray Structure Of A Hypothetical Protein...  
| 554
| 3e-156
|-
| dbj
| BAE41440.1
| unnamed protein product [Mus musculus]
| 508
| 2e-142
|-
| ref
| NP_001038719.1
| hypothetical protein LOC692281 [Danio rer...  
| 357
| 1e-96
|-
| ref
| XP_624797.1
| PREDICTED: hypothetical protein [Apis mellifera
| 235
| 3e-60
|-
| ref
| XP_974676.1
| PREDICTED: hypothetical protein [Tribolium cast
| 232
| 5e-59
|-
| ref
| XP_001193974.1
| PREDICTED: hypothetical protein [Strongyloce
| 208
| 5e-52
|-
| ref
| XP_797380.2
| PREDICTED: hypothetical protein, partial [St...
| 207
| 2e-51
|-
| dbj
| BAE37112.1
| unnamed protein product [Mus musculus] >dbj B...
| 134
| 2e-29
|-
| gb
| EDL24425.1
| cDNA sequence BC048403, isoform CRA_b [Mus muscul
| 132
| 6e-29
|-
| ref
| XP_642387.1
| hypothetical protein DDBDRAFT_0205477 [Dicty...
| 87.8
| 1e-15
|-  
| emb
| CAJ08583.1
| hypothetical protein, conserved [Leishmania majo
| 36.6
| 3.5
|-
|}
 
=== Method Predicted Subcellular Location Evaluation  ===
Locate analysis predicted that the protein is soluble and non-secreted. Predictions of its subcellular localisation varied between methods, as follows:
 
'''Table 5:''' Method Predicted Subcellular Location Evaluation  
{|border="1"
|-
|Method
|Location
|Score
|-
|CELLO
|Mitochondrion
|1.34
|-
|CELLO
|Extracellular region
|1.08
|-
|pTarget
|Endoplasmic reticulum
|93.90
|-
|Proteome Analyst
|No prediction
|0.00
|-
|WoLFPSORT
|Cytoplasm
|13.00
|-
|WoLFPSORT
|Nucleus
|12.00
|-
|WoLFPSORT
|Golgi apparatus
|3.00
|-
|MultiLoc
|Peroxisome
|0.49
|-
|MultiLoc
|Mitochondrion
|0.23
|-
|MultiLoc
|Extracellular region
|0.09
|-
|}
 
=== BC048403 Symatlas Expression Profile ===
Pfam, Profunc, Proknow, and Interpro all returned no results for the protein 2gnxA. Symatlas, however, did provide an interesting lead. The expression data are presented in the following diagram; the most significant result was the number of olfactory receptors with correlated expression profiles.
[[Image:Symatlas bc048403 1.GIF|framed|'''Figure 9'''<BR> Symatlas Expression Profile.<BR>|none]]
 
===Co-occurring Motifs Corresponding to BC048403 ===
Olfactory receptors were also encountered when the protein was submitted to cisRED to retrieve the corresponding cis-regulatory motif patterns. All fourteen motif patterns (modules) corresponding to the BC048403 protein are also found in many different olfactory receptors. Motifs are predicted by cisRED with p-values < 0.005.
 
In total, the fourteen motifs corresponded to 120 different olfactory receptors. The following table lists the olfactory receptors with three or more co-occurring motifs. The header row lists the fourteen modules. The olfactory receptors with the most modules in common with the BC048403 protein are highlighted in orange (nine co-occurring modules) and green (seven co-occurring modules).
 
'''Table 6:''' Co-occurring Motifs Corresponding to Olfactory Receptors
[[Image:Olf motif table.GIF]]
 
=== Number of Motifs Corresponding to each Olfactory Receptor ===
The following graph represents the number of co-occurring motifs across the entire range of 120 corresponding olfactory receptors.
[[Image:Olf motif graph.GIF |framed|'''Figure 10'''<BR> Graph of the number of motifs corresponding to each olfactory receptor.<BR>|none]]
 
These motifs were searched for in the cisRED databases for other species; however, they were not found, as there is no inter-species search tool. Unfortunately, micro-array expression data for the olfactory receptors with the most co-occurring motifs were unavailable.
 
=== Micro-array Expression Profiles Similar to FLJ32549 ===
The following micro-array data was found by browsing through the profile neighbours of the human ortholog using GEO Profiles.
[[Image:Flj32549 expression profiles.GIF|framed|'''Figure 11'''<BR> Neighbouring expression profiles to FLJ32549.|none]]
 
Other interesting motifs found in the BC048403 protein corresponded to the cadherin family.
 
==Return to [[report]]==

Latest revision as of 03:14, 12 June 2007

Evolutionary analysis

Figure 1
This image shows part of a complete alignment of the sequences used. Asterisks (*) indicate residues that are conserved across all sequences, and colons (:) indicate partial conservation across all sequences.


Figure 2
The phylogenetic tree shows how closely the sequences are related. The longer the branches of the tree, the more evolutionarily divergent the sequences are. 2GNX A is the original protein being investigated and is a mouse protein. Branches marked with * indicate that this branch arrangement occurred more than 75% of the time.


Structural analysis

An analysis of the secondary structure of the protein from its amino acid sequence (Figure 3) shows the secondary structural arrangement of different regions of our protein.

Figure 3
Secondary structure analysis of the 2GNX protein from Protein Data Bank.


Table 1 Dali analysis of the 2GNX protein

NR. STRID1 STRID2  Z   RMSD LALI LSEQ2 %IDE REVERS PERMUT NFRAG TOPO PROTEIN
 1: 3023-A 2gnx-A 42.9  0.0  280   280  100      0      0     1 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION 	hypothetical pro
 2: 3023-A 2cmr-A  5.7  3.5  114   192   11      0      0    11 S    IMMUNOGLOBULIN COMPLEX 	d5 (fab heavy chain) d5 (fab li
 3: 3023-A 1j3w-A  5.7  3.2   99   134   12      0      0     9 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION 	giding protein-m
 4: 3023-A 1jmr-A  5.5  3.0   94   246    9      0      0    12 S    
 5: 3023-A 1f5m-B  5.5  5.0  107   177    9      0      0    13 S    SIGNALING PROTEIN 	gaf 	(saccharomyces cerevisiae) yeas
 6: 3023-A 1vcs-A  5.0  4.7   82   102    9      0      0     8 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION 	vesicle transpor
 7: 3023-A 1kt0-A  4.9  2.8   81   357    6      0      0     7 S    ISOMERASE 	51 kda fk506-binding protein (fkbp51) Mutant
 8: 3023-A 1e2a-A  4.9  4.5   80   102    9      0      0     6 S    TRANSFERASE 	enzyme iia (enzyme iii, lactose-specific i
 9: 3023-A 2d2s-A  4.8  3.1   75   217   11      0      0     5 S    ENDOCYTOSIS/EXOCYTOSIS 	exocyst complex component exo84
10: 3023-A 2oew-A  4.7  2.8  119   358    8      0      0    12 S    PROTEIN TRANSPORT 	programmed cell death 6-interacting 
11: 3023-A 1h3q-A  4.7  4.2   92   140    4      0      0    11 S    TRANSPORT 	sedlin (sedl) 	(mus musculus) mouse 	S.B.Jan
12: 3023-A 2oev-A  4.5 36.5  151   697    7      0      0    14 S    PROTEIN TRANSPORT 	programmed cell death 6-interacting 
13: 3023-A 2cwy-A  4.5  2.4   82    92   20      0      0     7 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION 	hypothetical pro
14: 3023-A 2c5i-T  4.5  2.8   75    93   11      0      0     5 S    PROTEIN TRANSPORT/COMPLEX 	t-snare affecting a late gol
15: 3023-A 3nul    4.4  3.4   93   130    5      0      0    11 S    ACTIN-BINDING PROTEIN 	profilin i 	(arabidopsis thalian

A Dali analysis (Table 1) of the 2GNX protein was highly inconclusive and there were no significant structural matches to the hypothetical protein.
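
The summary rows in Table 1 can be read programmatically. The following is a minimal sketch, assuming each row keeps the whitespace-separated layout shown above (NR, STRID1, STRID2, Z, RMSD, LALI, LSEQ2, %IDE, ...). The names DaliHit, parse_dali_line and non_self_hits, and the min_z parameter, are hypothetical and are not part of Dali itself; the report does not state the Z-score cutoff that was used to judge significance.

```python
from typing import List, NamedTuple


class DaliHit(NamedTuple):
    rank: int
    query: str    # STRID1
    target: str   # STRID2
    z: float      # Dali Z-score
    rmsd: float   # RMSD of the superposition, in angstroms
    lali: int     # number of aligned residues
    lseq2: int    # length of the target chain
    ide: int      # percent sequence identity


def parse_dali_line(line: str) -> DaliHit:
    """Parse one summary row; only the first eight fields are used."""
    f = line.split()
    return DaliHit(int(f[0].rstrip(":")), f[1], f[2], float(f[3]),
                   float(f[4]), int(f[5]), int(f[6]), int(f[7]))


def non_self_hits(lines: List[str], query_pdb: str = "2gnx",
                  min_z: float = 0.0) -> List[DaliHit]:
    """Drop the self-match and keep hits with Z >= min_z (assumed cutoff)."""
    hits = [parse_dali_line(l) for l in lines if l.strip()]
    return [h for h in hits
            if not h.target.lower().startswith(query_pdb) and h.z >= min_z]


rows = [
    " 1: 3023-A 2gnx-A 42.9  0.0  280   280  100      0      0     1 S",
    " 2: 3023-A 2cmr-A  5.7  3.5  114   192   11      0      0    11 S",
]
for hit in non_self_hits(rows):
    print(hit.target, hit.z, hit.rmsd, hit.lali, hit.ide)
```

Keeping the cutoff as a parameter avoids hard-coding a significance threshold that the report does not give.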

Table 2 Dali analysis of N-terminal domain

 NR. STRID1 STRID2  Z   RMSD LALI LSEQ2 %IDE REVERS PERMUT NFRAG TOPO PROTEIN
  1: 3256-A 2gnx-A 23.2  0.0  173   280  100      0      0     1 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION 	hypothetical pro
  2: 3256-A 1e2a-A  7.5  4.5   80   102    9      0      0     6 S    TRANSFERASE 	enzyme iia (enzyme iii, lactose-specific i
  3: 3256-A 1kt0-A  7.4  2.8   81   357    6      0      0     7 S    ISOMERASE 	51 kda fk506-binding protein (fkbp51) Mutant
  4: 3256-A 2d2s-A  7.3  3.1   75   217   11      0      0     5 S    ENDOCYTOSIS/EXOCYTOSIS 	exocyst complex component exo84
  5: 3256-A 1vcs-A  7.3  4.7   78   102    9      0      0     7 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION 	vesicle transpor
  6: 3256-A 2cmr-A  6.9  3.2  104   192   11      0      0     9 S    IMMUNOGLOBULIN COMPLEX 	d5 (fab heavy chain) d5 (fab li
  7: 3256-A 2c5i-T  6.9  2.8   75    93   11      0      0     5 S    PROTEIN TRANSPORT/COMPLEX 	t-snare affecting a late gol
  8: 3256-A 2h7o-A  6.8  3.0   81   270    5      0      0     7 S    SIGNALING PROTEIN 	protein kinase ypka fragment (protei
  9: 3256-A 2h7v-C  6.6  4.2   76   269   13      0      0     5 S    SIGNALING PROTEIN 	migration-inducing protein 5 (ras-re
 10: 3256-A 2dnx-A  6.5  4.9   80   130    6      0      0     6 S    TRANSPORT PROTEIN 	syntaxin-12 fragment 	(homo sapiens)
 11: 3256-A 1hg5-A  6.5  3.2   85   263    9      0      0     6 S     ENDOCYTOSIS 	clathrin assembly protein short form frag
 12: 3256-A 1a17    6.4  2.5   71   159    3      0      0     5 S    HYDROLASE 	serineTHREONINE PROTEIN PHOSPHATASE 5 fragme
 13: 3256-A 2if4-A  6.3  2.5   82   258    7      0      0     7 S    SIGNALING PROTEIN 	atfkbp42 fragment (twd1 (twisted dwa
 14: 3256-A 1owa-A  6.2  3.3   76   156   12      0      0     6 S    CYTOKINE 	spectrin alpha chain, erythrocyte fragment (e
 15: 3256-A 2oew-A  6.1  2.8  119   358    8      0      0    12 S    PROTEIN TRANSPORT 	programmed cell death 6-interacting 

A Dali analysis carried out separately with only the N-terminal domain (Table 2) of the protein also did not produce any significant structural matches.

Figure 4
2CMR-2GNX alignment (2CMR displayed in cyan and 2GNX displayed in green).


A CE alignment between IMMUNOGLOBULIN COMPLEX d5 (2CMR) and 2GNX was performed (Figure 4). The result revealed that the N-terminus of 2GNX matched 2CMR:A, a TRANSMEMBRANE GLYCOPROTEIN, with RMSD = 3.8 Å and Z-score = 3.7. The 3D figure showed that both proteins had a five-helix structure and that the two structures superimposed well. However, the function of this five-helix structure was not clear.

Table 3: Dali analysis of C-terminal domain

NR. STRID1 STRID2  Z   RMSD LALI LSEQ2 %IDE REVERS PERMUT NFRAG TOPO PROTEIN
  1: 3257-A 2gnx-A 24.3  0.0  118   280  100      0      0     1 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION 	hypothetical pro
  2: 3257-A 1jmr-A  7.6  3.0   94   246    9      0      0    12 S    
  3: 3257-A 1j3w-A  7.5  2.9   91   134   13      0      0     7 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION 	giding protein-m
  4: 3257-A 1f5m-B  6.8  2.9   95   177    9      0      0    10 S    SIGNALING PROTEIN 	gaf 	(saccharomyces cerevisiae) yeas
  5: 3257-A 1h3q-A  6.6  4.2   92   140    4      0      0    11 S    TRANSPORT 	sedlin (sedl) 	(mus musculus) mouse 	S.B.Jan
  6: 3257-A 3nul    6.3  3.4   93   130    5      0      0    11 S    ACTIN-BINDING PROTEIN 	profilin i 	(arabidopsis thalian
  7: 3257-A 1mc0-A  5.8  4.1   99   341    8      0      0    11 S    HYDROLASE 	3',5'-cyclic nucleotide phosphodiesterase 2a
  8: 3257-A 2h28-A  5.4  2.8   75   106    8      0      0    10 S    STRUCTURAL GENOMICS, UNKNOWN FUNCTION 	hypothetical pro
  9: 3257-A 2p7j-A  5.0  2.9   79   262   13      0      0    11 S    TRANSCRIPTION 	putative sensory boxGGDEF FAMILY PROTEIN
 10: 3257-A 2dmw-A  5.0  3.3   85   131    7      0      0    11 S    MEMBRANE PROTEIN 	synaptobrevin-like 1 variant fragment
 11: 3257-A 2avx-A  4.8  3.6   93   171    5      0      0    10 S    TRANSCRIPTION 	regulatory protein sdia Mutant 	(escheri
 12: 3257-A 2j3t-C  4.7  5.2   83   141    7      0      0     8 S    PROTEIN TRANSPORT 	trafficking protein particle complex
 13: 3257-A 2hj9-C  4.7  3.3   76   210    5      0      0     9 S    SIGNALING PROTEIN 	autoinducer 2-binding periplasmic pr
 14: 3257-A 2hje-A  4.6  3.0   75   210    5      0      0     9 S    SIGNALING PROTEIN 	autoinducer 2 sensor kinasePHOSPHATA
 15: 3257-A 2uv0-E  4.5  3.5   93   159    9      0      0    12 S    TRANSCRIPTION 	transcriptional activator protein lasr

However, a Dali analysis carried out with the C-terminal domain of the protein (Table 3) produced one significant structural match: the GAF signalling protein, the 4th result in the table.


Figure 5
Dotlet analysis for 2GNX.


The Dotlet analysis (Figure 5) showed that there were no internally homologous repeats in the C-terminus of 2GNX.


 USR1:A  185/392   QVAKNLFTH---LDDVSVLLQEIITEARNLSNAEICSVFLLDQ-----------------
 USR2:A  181/283   TASEXKALTAKANPDLFGKISSFIRKY------DAANVSLIFDNRGSESFQGHGYHHPHS
 USR1:A  225/432   ----------NELVAKVFDGGVVDDESYEIRIPADQGIAGHVATTG----------QILN
 USR2:A  235/337   YREAPKGVDQYPAVVSLP----------SDRPVXHWPNVIXIXTDRASDLNSLEKVVHFY
 USR1:A  265/472   IPDAYAHPLFYRGVDDSTGFRTRNILCFPIKNENQEVIGVAELVNKINGPWFSKFDEDLA
 USR2:A  285/387   DDKV-------------------QSTYFLTRPEP-HFTIVVIFESK---------KSERD
 USR1:A  325/532   TAFSIYCGISIAHSLL
 USR2:A  316/418   SHFISFLNELSLALKN

Figure 6: CE predicted structural alignment. USR1 = 1MC0 (PDB code), Regulatory Segment of Mouse 3',5'-Cyclic Nucleotide Phosphodiesterase 2A, Containing the GAF A and GAF B Domains. USR2 = 2GNX.
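
An alignment such as the one in Figure 6 can be summarised by counting the aligned columns and how many of them are identical. The following is a minimal sketch, assuming the two rows have been concatenated into equal-length strings with "-" as the gap character. The function name alignment_identity is hypothetical, and the short strings in the usage lines are toy input, not the 1MC0 or 2GNX sequences.

```python
from typing import Tuple


def alignment_identity(a: str, b: str, gap: str = "-") -> Tuple[int, int]:
    """Return (aligned_columns, identical_columns) for two aligned rows.

    A column counts as aligned only when neither row has a gap in it.
    Residue codes are compared literally, so an 'X' only matches an 'X'.
    """
    if len(a) != len(b):
        raise ValueError("aligned rows must have the same length")
    aligned = identical = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        aligned += 1
        if x == y:
            identical += 1
    return aligned, identical


# Toy input for illustration only.
aligned, identical = alignment_identity("AC-DE", "ACFD-")
print(aligned, identical)
```

The percentage identity over the aligned region is then 100 * identical / aligned, provided aligned is not zero.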


The conserved residues of the ligand binding site in 1MC0 were not consistent with the aligned residues in 2GNX.

Zoraghi et al. (2003) identified a fingerprint of the ligand binding site in 1MC0, given by the following pattern:

SX(13-18)FDX(18-22)IAX(21)[Y/N]X(2)VDX(2)TX(3)TX(19)[E/Q]

Figure 7
Fingerprint of the ligand binding site in 1MC0 (Zoraghi et al.). Identical residues are coloured in red, and underlined residues are those missing from the PDB file.
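
The fingerprint can also be searched for automatically by translating it into a regular expression. The following is a minimal sketch, assuming that X(n) stands for any n residues, X(m-n) for any m to n residues, and [Y/N] for either of the listed residues; this reading of the notation is an assumption. The function names are hypothetical. A strict match of this kind is more demanding than the rough fit described in this report, so a sequence that only partly follows the pattern will not be reported.

```python
import re
from typing import Optional, Tuple


def fingerprint_to_regex(pattern: str) -> str:
    """Translate X(n), X(m-n) and [A/B] tokens into regex syntax."""
    out = re.sub(r"X\((\d+)-(\d+)\)", r".{\1,\2}", pattern)  # X(13-18) -> .{13,18}
    out = re.sub(r"X\((\d+)\)", r".{\1}", out)               # X(21)    -> .{21}
    # [Y/N] -> [YN]
    out = re.sub(r"\[([A-Z](?:/[A-Z])*)\]",
                 lambda m: "[" + m.group(1).replace("/", "") + "]", out)
    return out


def find_fingerprint(sequence: str, pattern: str) -> Optional[Tuple[int, int]]:
    """Return the (start, end) span of the first strict match, or None."""
    m = re.search(fingerprint_to_regex(pattern), sequence)
    return m.span() if m else None


FINGERPRINT = "SX(13-18)FDX(18-22)IAX(21)[Y/N]X(2)VDX(2)TX(3)TX(19)[E/Q]"
print(fingerprint_to_regex(FINGERPRINT))
```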


The alignment above (Figure 7) indicated that the published pattern roughly fits the protein sequence of 2GNX. The 3D structure analysis (Figure 8) revealed that some residues (in yellow) were probably not within the ligand binding pocket; however, other residues (in red) were still potential ligand binding sites.

Figure 7.1
Ligand Binding Site Predicted by Q-siteFinder


The result from Q-siteFinder confirmed that there were probably binding pockets in the predicted region. However, the volumes of the two pockets were small compared to a normal cGMP binding site (Zoraghi et al., 2003).

Figure 8
Potential ligand binding sites in 2GNX.


The figure above (Figure 8) shows the residues that are identical to the published pattern. The residues in red are the potential ligand binding residues, and the residues in yellow match the published data but are not likely to be in the ligand binding pocket of 2GNX.

Functional Analysis

STRING and CDART returned no results for the submitted protein data.

BlastP Results

BlastP returned results; however, they were limited to hypothetical proteins that gave no additional information.

Table 4: BlastP Results

Score (Bits) E Value
ref XP_001163972.1 PREDICTED: similar to FLJ32549 protein [Pan 850 0.0
ref XP_001116860.1 PREDICTED: hypothetical protein isoform 1 [M 848 0.0
ref NP_689653.3 hypothetical protein LOC144577 [Homo sapiens... 847 0.0
gb AAH36246.1 FLJ32549 protein [Homo sapiens] 846 0.0
ref XP_001116875.1 PREDICTED: hypothetical protein isoform 3 [M 843 0.0
ref XP_531657.2 PREDICTED: hypothetical protein XP_531657 [Cani 827 0.0
ref XP_615557.3 PREDICTED: hypothetical protein [Bos taurus] 823 0.0
gb EDL24424.1 cDNA sequence BC048403, isoform CRA_a [Mus muscul 803 0.0
ref NP_766610.2 hypothetical protein LOC270802 [Mus musculus... 803 0.0
ref XP_576234.2 PREDICTED: hypothetical protein [Rattus norv... 802 0.0
ref XP_001364942.1 PREDICTED: hypothetical protein [Monodelphis 797 0.0
ref XP_416063.1 PREDICTED: hypothetical protein [Gallus gallus] 796 0.0
dbj BAC39804.1 unnamed protein product [Mus musculus] 760 0.0
ref XP_001116868.1 PREDICTED: hypothetical protein isoform 2 [M 743 0.0
ref NP_001085035.1 hypothetical protein LOC432102 [Xenopus l... 697 0.0
ref NP_001025261.1 hypothetical protein LOC555715 [Danio rer... 665 0.0
ref NP_001076454.1 hypothetical protein LOC100005809 [Danio ... 661 0.0
ref XP_001331282.1 PREDICTED: hypothetical protein [Danio rerio 598 2e-169
emb CAG12393.1 unnamed protein product [Tetraodon nigroviridis] 593 8e-168
pdb 2GNX A Chain A, X-Ray Structure Of A Hypothetical Protein... 554 3e-156
dbj BAE41440.1 unnamed protein product [Mus musculus] 508 2e-142
ref NP_001038719.1 hypothetical protein LOC692281 [Danio rer... 357 1e-96
ref XP_624797.1 PREDICTED: hypothetical protein [Apis mellifera 235 3e-60
ref XP_974676.1 PREDICTED: hypothetical protein [Tribolium cast 232 5e-59
ref XP_001193974.1 PREDICTED: hypothetical protein [Strongyloce 208 5e-52
ref XP_797380.2 PREDICTED: hypothetical protein, partial [St... 207 2e-51
dbj BAE37112.1 unnamed protein product [Mus musculus] >dbj B... 134 2e-29
gb EDL24425.1 cDNA sequence BC048403, isoform CRA_b [Mus muscul 132 6e-29
ref XP_642387.1 hypothetical protein DDBDRAFT_0205477 [Dicty... 87.8 1e-15
emb CAJ08583.1 hypothetical protein, conserved [Leishmania majo 36.6 3.5
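
The rows of Table 4 can be split into fields and filtered by E value. The following is a minimal sketch, assuming each row has the form "database accession description score evalue" as listed above. The function names and the max_evalue cutoff of 1e-5 are hypothetical choices for illustration; the report does not state a cutoff.

```python
from typing import Dict, List


def parse_blast_row(row: str) -> Dict[str, object]:
    """Split one flattened row into database, accession, description, score, E value."""
    head, score, evalue = row.rsplit(None, 2)       # last two fields are numeric
    db, accession, description = head.split(None, 2)
    return {"db": db, "accession": accession, "description": description,
            "score": float(score), "evalue": float(evalue)}


def significant_hits(rows: List[str], max_evalue: float = 1e-5) -> List[Dict[str, object]]:
    """Keep hits whose E value is at or below max_evalue (assumed cutoff)."""
    hits = [parse_blast_row(r) for r in rows if r.strip()]
    return [h for h in hits if h["evalue"] <= max_evalue]


rows = [
    "ref XP_642387.1 hypothetical protein DDBDRAFT_0205477 [Dicty... 87.8 1e-15",
    "emb CAJ08583.1 hypothetical protein, conserved [Leishmania majo 36.6 3.5",
]
for hit in significant_hits(rows):
    print(hit["accession"], hit["score"], hit["evalue"])
```

With this cutoff the final Leishmania row (E value 3.5) is dropped, while every other row in the table is retained.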

Method Predicted Subcellular Location Evaluation

Locate analysis predicted that the protein is soluble and non-secreted. Predictions of its subcellular localisation varied between methods, as follows:

Table 5: Method Predicted Subcellular Location Evaluation

Method Location Score
CELLO Mitochondrion 1.34
CELLO Extracellular region 1.08
pTarget Endoplasmic reticulum 93.90
Proteome Analyst No prediction 0.00
WoLFPSORT Cytoplasm 13.00
WoLFPSORT Nucleus 12.00
WoLFPSORT Golgi apparatus 3.00
MultiLoc Peroxisome 0.49
MultiLoc Mitochondrion 0.23
MultiLoc Extracellular region 0.09
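
The spread of predictions in Table 5 can be made explicit by taking the highest-scoring location from each method. The following is a minimal sketch using the values from the table. The scores come from different tools and are not on a common scale, so they are only compared within a method, never across methods. The function name top_per_method is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

ROWS: List[Tuple[str, str, float]] = [
    ("CELLO", "Mitochondrion", 1.34),
    ("CELLO", "Extracellular region", 1.08),
    ("pTarget", "Endoplasmic reticulum", 93.90),
    ("Proteome Analyst", "No prediction", 0.00),
    ("WoLFPSORT", "Cytoplasm", 13.00),
    ("WoLFPSORT", "Nucleus", 12.00),
    ("WoLFPSORT", "Golgi apparatus", 3.00),
    ("MultiLoc", "Peroxisome", 0.49),
    ("MultiLoc", "Mitochondrion", 0.23),
    ("MultiLoc", "Extracellular region", 0.09),
]


def top_per_method(rows: List[Tuple[str, str, float]]) -> Dict[str, Tuple[str, float]]:
    """Return the best-scoring (location, score) for each method."""
    best: Dict[str, Tuple[str, float]] = {}
    for method, location, score in rows:
        if method not in best or score > best[method][1]:
            best[method] = (location, score)
    return best


best = top_per_method(ROWS)
# How many methods chose each location as their top prediction.
votes = Counter(location for location, _ in best.values())
for method, (location, score) in best.items():
    print(method, location, score)
```

No location is the top prediction of more than one method, which is consistent with the diversity noted above.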

BC048403 Symatlas Expression Profile

Pfam, Profunc, Proknow, and Interpro all returned no results for the protein 2gnxA. Symatlas, however, did provide an interesting lead. The expression data are presented in the following diagram; the most significant result was the number of olfactory receptors with correlated expression profiles.

Figure 9
Symatlas Expression Profile.

Co-occurring Motifs Corresponding to BC048403

Olfactory receptors were also encountered when the protein was submitted to cisRED to retrieve the corresponding cis-regulatory motif patterns. All fourteen motif patterns (modules) corresponding to the BC048403 protein are also found in many different olfactory receptors. Motifs are predicted by cisRED with p-values < 0.005.

In total, the fourteen motifs corresponded to 120 different olfactory receptors. The following table lists the olfactory receptors with three or more co-occurring motifs. The header row lists the fourteen modules. The olfactory receptors with the most modules in common with the BC048403 protein are highlighted in orange (nine co-occurring modules) and green (seven co-occurring modules).

Table 6: Co-occurring Motifs Corresponding to Olfactory Receptors (image: Olf motif table.GIF)

Number of Motifs Corresponding to each Olfactory Receptor

The following graph represents the number of co-occurring motifs across the entire range of 120 corresponding olfactory receptors.

Figure 10
Graph of the number of motifs corresponding to each olfactory receptor.
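
The counts behind Table 6 and Figure 10 amount to tallying, for each olfactory receptor, how many of the fourteen modules it shares with BC048403. The following is a minimal sketch of that tally. The module and receptor identifiers in the example are made-up placeholders, not cisRED output, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def cooccurrence_counts(module_to_receptors: Dict[str, Set[str]]) -> Counter:
    """Count, for every receptor, the number of modules in which it appears."""
    counts: Counter = Counter()
    for receptors in module_to_receptors.values():
        counts.update(receptors)   # each module contributes at most 1 per receptor
    return counts


def receptors_with_at_least(counts: Counter, minimum: int = 3) -> List[str]:
    """Receptors sharing at least `minimum` modules, most shared first."""
    return [name for name, n in counts.most_common() if n >= minimum]


# Placeholder data for illustration only.
toy = {
    "module_1": {"OR_a", "OR_b"},
    "module_2": {"OR_a", "OR_c"},
    "module_3": {"OR_a", "OR_b"},
    "module_4": {"OR_b"},
}
counts = cooccurrence_counts(toy)
print(receptors_with_at_least(counts, minimum=3))
```

Using sets for the receptors of each module ensures that a receptor is counted once per module, even if a motif occurs several times in its regulatory region.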

These motifs were searched for in the cisRED databases for other species; however, they were not found, as there is no inter-species search tool. Unfortunately, micro-array expression data for the olfactory receptors with the most co-occurring motifs were unavailable.

Micro-array Expression Profiles Similar to FLJ32549

The following micro-array data was found by browsing through the profile neighbours of the human ortholog using GEO Profiles.

Figure 11
Neighbouring expression profiles to FLJ32549.

Other interesting motifs found in the BC048403 protein corresponded to the cadherin family.

Return to report