Talk:Protein Function

From MDWiki
Revision as of 05:05, 22 May 2007 by S4079195 (talk | contribs)

Information From 8th May 2007

MIF4G is the middle domain of eukaryotic initiation factor 4G (eIF4G). It also occurs in NMD2p, a protein involved in the nonsense-mediated decay of mRNAs containing premature stop codons, and in CBP80 (cap-binding protein).

The protein binds eIF4A, eIF3, RNA and DNA. Therefore part of its function is to bind RNA.

Possibly located in the cytoplasm (see link to LOCATE): a mouse protein of similar sequence is found in this location.

MIF4G starts at residue 28 and ends at residue 240 (mouse).

It is soluble and non-secreted.

PA74324.2 Riken cDNA 2310075612 Rik Protein - AAH26740, AAH55812(mouse), AAH33759(human)

AAH55812 - Rik protein, mouse. Present in the cerebellum, striatum, eye, whole brain, liver, hippocampus, hematopoietic stem cells and kidney. Accession No: BC055812.1

Performed a MultiLoc prediction, which determines the location of a protein from its amino acid sequence and features such as the presence of an N-terminal targeting sequence. There is a 0.93 probability that the protein is cytoplasmic. Now I have to find the specific location, what the protein binds to, and the structure of what it binds to. If I can identify the structure of the binding domain, can I then predict to some extent the structure, or at least a very small piece of it (i.e. the active site), and use this to perform function-based analysis?

Binding Site Analysis From ProFunc Using Human Sequence


ProFunc Analysis:

http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/profunc/GetResults.pl?source=profunc&user_id=ay65&code=075103

Showed that the domain contains an ARM repeat. Further research into this will be done. Eliza found the same thing.

This shows that there are many binding sites. To get to this image follow the link under the Cleft Sites analysis on the ProFunc results page.

Still need to identify the significance of all the results uncovered by ProKnow.

Will go into this more next week.

For the time being, it is interesting that both Eliza and I have found that the function has something to do with the methylated cap on RNA, and that this process takes place within the cytoplasm (as opposed to the nucleus).


ProKnow Analysis

The table shows that the most likely molecular function for our protein is RNA binding, which is inferred from genetic interaction. Most of the biological processes identified come from a traceable author statement. The numbers of clues are 6 and 4 respectively. 1-2 is considered weak, so 4 and 6 probably aren't greatly significant, but are perhaps high enough to make some inferences.

When looking at the Master Table from results - note the following:

 *Clue 1 Frequency of the ontologies obtained from BLAST hits 
 *Clue 2 Score for the ontology from BLAST E-values. The best E-value available for the ontology is taken (only 4 digits after the decimal point are shown). 
 *Clue 3 Frequency of ontologies from 3D motifs 
 *Clue 4 Score of ontologies from 3D motifs based on conservation. It is the average of scores from the motifs associated with the ontology. 
 *Clue 5 Score of ontologies from 3D folds. The best Z_score available for the function is taken. 
 *Clue 6 Frequency of ontologies from 3D folds 
 *Clue 7 Frequency of ontologies from DIP search 
 *Clue 8 Score of ontologies from PROSITE search based on conservation. It is the average of scores from the motifs associated with the ontology. 
 *Clue 9 Frequency of ontologies from PROSITE search 
 *Clue 10 Frequency of ontologies from PROLINKS search 
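
As a reading aid for the Master Table, the clue list above can be written down as a small lookup table. This is only a sketch: the names `CLUES`, `describe` and `clues_of_kind` are made up for illustration and are not part of ProKnow, and the "frequency"/"score" labels are simply transcribed from the descriptions above.

```python
# Clue number -> (kind of value, evidence source), transcribed from the list above.
# The dictionary and helper names are hypothetical, not ProKnow identifiers.
CLUES = {
    1: ("frequency", "BLAST hits"),
    2: ("score", "BLAST E-values"),
    3: ("frequency", "3D motifs"),
    4: ("score", "3D motifs"),
    5: ("score", "3D folds"),
    6: ("frequency", "3D folds"),
    7: ("frequency", "DIP"),
    8: ("score", "PROSITE"),
    9: ("frequency", "PROSITE"),
    10: ("frequency", "PROLINKS"),
}


def describe(clue_number):
    """Return a one-line description of a Master Table clue column."""
    kind, source = CLUES[clue_number]
    return f"Clue {clue_number}: {kind} from {source}"


def clues_of_kind(kind):
    """Return the clue numbers that hold the given kind of value."""
    return sorted(n for n, (k, _) in CLUES.items() if k == kind)


print(describe(8))             # Clue 8: score from PROSITE
print(clues_of_kind("score"))  # [2, 4, 5, 8]
```

Separating score columns from frequency columns may help when deciding which readings can be compared with each other.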


Our results had a very high reading in clue 8. Does this mean that the sequence is highly conserved?


Article about eIF4GIII Protein - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Abstract&list_uids=11172724


Obtained Sequences

Human - Protein Sequence

mgepsreeyk iqsfdaetqq llktalkvac fetedgeysv cqrsysncsr lmpsrcntqy

rdpgavdlek vanvivdhsl qdcvfskeag rmcyaiiqae skqagqsvfr rgllnrlqqe

yqareqlrar slqgwvcyvt ficnifdylr vnnmpmmalv npvydclfrl aqpdslskee

evdclvlqlh rvgeqlekmn gqrmdelfvl irdgfllptg lsslaqllll eiiefraagw

kttpaahkyy ysevsd


>AAH26740 ARM repeat, position: 13-208 (Mouse)

SFDAQTQQLLKTALKDPGAVDLERVANVIVDHSLQDCVFSKEAGRMCYAIIQAESKQAGQSVFRRGLLNRLQKEYDAREQ

LRACSLQGWVCYVTFICNIFDYLRVNNMPMMALVNPVYDCLFQLAQPESLSREEEVDCLVLQLHRVGEQLEKMNGQRMDE

LFILIRDGFLLPTDLSSLARLLLLEMIEFRAAGWK
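
Domain and repeat positions such as "13-208" are normally given as 1-based, inclusive residue numbers, whereas Python slices are 0-based and exclude the end. A minimal sketch of a helper for cutting a region out of a full-length sequence follows; the function name `region` is hypothetical, and the example uses a toy string, not one of our proteins.

```python
def region(sequence, start, end):
    """Return residues start..end of a sequence (1-based, inclusive)."""
    if start < 1 or end < start:
        raise ValueError("start must be >= 1 and end must be >= start")
    if end > len(sequence):
        # Fail loudly when the coordinates do not fit the sequence at hand.
        raise ValueError(
            f"end {end} is past the sequence length {len(sequence)}"
        )
    return sequence[start - 1:end]


toy = "ABCDEFGHIJ"
print(region(toy, 3, 5))  # CDE
```

The range check is deliberate: if a reported domain boundary runs past the end of the sequence being sliced, the coordinates probably refer to a different record or isoform, and that is better caught than silently truncated.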


Mouse - Protein

mseasrddyk iqsfdaetqq llktalkdps avdlervanv ivdhslqdcv fskeagrmcy

aiiqaeskqa gqsvfrrgll nrlqkeydar eqlracslqg wvcyvtficn ifdylrvnnm

pmmalvnpvy dclfqlaqpe slsreeevdc lvlqlhrvge qlekmngqrm delfilirdg

fllptdlssl arllllemie fraagwkttp aahkyyysev sd


FASTA - Human

>gi|21707112|gb|AAH33759.1| MIF4G domain containing [Homo sapiens]

MGEPSREEYKIQSFDAETQQLLKTALKVACFETEDGEYSVCQRSYSNCSRLMPSRCNTQYRDPGAVDLEK

VANVIVDHSLQDCVFSKEAGRMCYAIIQAESKQAGQSVFRRGLLNRLQQEYQAREQLRARSLQGWVCYVT

FICNIFDYLRVNNMPMMALVNPVYDCLFRLAQPDSLSKEEEVDCLVLQLHRVGEQLEKMNGQRMDELFVL

IRDGFLLPTGLSSLAQLLLLEIIEFRAAGWKTTPAAHKYYYSEVSD
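
The human sequence appears twice on this page, once as a spaced lowercase listing and once as a FASTA record. A small sketch for checking that two such copies agree is given below. The helper names are made up for illustration, and to keep the example short only the first 70 residues of each copy are pasted in; the full listings can be substituted.

```python
def normalize_listing(text):
    """Collapse a spaced sequence listing into one uppercase string.

    Anything that is not a letter (spaces, newlines, position numbers)
    is dropped.
    """
    return "".join(ch for ch in text if ch.isalpha()).upper()


def read_fasta_record(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    return lines[0][1:], "".join(lines[1:]).upper()


# First 70 residues only, copied from the two human listings on this page.
listing = """
mgepsreeyk iqsfdaetqq llktalkvac fetedgeysv cqrsysncsr lmpsrcntqy
rdpgavdlek
"""
fasta = """>gi|21707112|gb|AAH33759.1| MIF4G domain containing [Homo sapiens]
MGEPSREEYKIQSFDAETQQLLKTALKVACFETEDGEYSVCQRSYSNCSRLMPSRCNTQYRDPGAVDLEK
"""

header, fasta_seq = read_fasta_record(fasta)
print(header)
print(normalize_listing(listing) == fasta_seq)  # True
```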