|Description:||Transcription factor SOX-2|
|Species:||Homo sapiens (Human)|
|Length:||317 amino acids|
Please note: when we start each new Pfam data release, we take a copy of the UniProt sequence database. This snapshot of UniProt forms the basis of the overview that you see here. It is important to note that, although some UniProt entries may be removed after a Pfam release, these entries will not be removed from Pfam until the next Pfam data release.
This image shows the arrangement of the Pfam domains that we found on this sequence. Clicking on a domain will take you to the page describing that Pfam entry. The table below gives the boundaries of each domain.
E-values are based on searching the Pfam-A family against UniProtKB 2012_06 using hmmsearch.
|Source||Domain||Start||End||Gathering threshold (bits)||Score (bits)||E-value|
This section shows a graphical representation of this sequence, with Pfam domains shown in the standard Pfam format. Under the Pfam domain image we show various tracks, illustrating features on this sequence that we found in other databases. You can choose which databases to include using the drop-down panel under the image.
We generate the topmost image from data in the Pfam database, but subsequent images are constructed on-the-fly using data retrieved from other sources using the Distributed Annotation System (DAS). DAS is a system for sharing sequence annotations in a standard format and we use it to find features and sequence information from a wide variety of other sources, from UniProt to InterPro to Superfamily.
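As a rough illustration of the kind of exchange DAS involves, the Python sketch below builds a features request for this sequence and parses a small response. The server address and source name are hypothetical, the XML is a hand-written sample in the DAS 1.x `DASGFF` layout rather than real server output, and this is not the code that generates the images on this page.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def das_features_url(base, source, accession, start, end):
    """Build a DAS 'features' request URL for a residue range (1-based)."""
    # Keep ':' and ',' unescaped so the segment reads as ACCESSION:start,end.
    query = urlencode({"segment": f"{accession}:{start},{end}"}, safe=":,")
    return f"{base}/das/{source}/features?{query}"


# Hand-written sample response (an assumption, not real server output).
SAMPLE_RESPONSE = """<DASGFF>
  <GFF>
    <SEGMENT id="P48431" start="1" stop="317">
      <FEATURE id="f1" label="HMG_box">
        <TYPE id="domain">domain</TYPE>
        <START>41</START>
        <END>109</END>
      </FEATURE>
      <FEATURE id="f2" label="SOXp">
        <TYPE id="domain">domain</TYPE>
        <START>110</START>
        <END>118</END>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""


def parse_features(xml_text):
    """Return (label, start, end) for every FEATURE element in the response."""
    root = ET.fromstring(xml_text)
    return [
        (f.get("label"), int(f.findtext("START")), int(f.findtext("END")))
        for f in root.iter("FEATURE")
    ]


# "das.example.org" and "demo_source" are placeholders, not real endpoints.
url = das_features_url("http://das.example.org", "demo_source", "P48431", 1, 317)
features = parse_features(SAMPLE_RESPONSE)
```

Each parsed tuple carries the residue range needed to position a feature box on the sequence.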
Each DAS source is represented in a new track, although some sources may generate more than one track, if they have features which overlap. Each feature that we find is shown as a simple box, positioned according to its residue position on the sequence.
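One simple way to place overlapping features on separate tracks is a greedy first-fit layout. The sketch below is only an assumption about how such a layout could work, not the algorithm this page uses, and the feature labels and coordinates are invented for illustration.

```python
def assign_tracks(features):
    """Place (label, start, end) features on tracks so none overlap in a track.

    Coordinates are 1-based and inclusive. Features are visited in order of
    start position and each goes on the first track whose last feature ends
    before it starts; otherwise a new track is opened.
    """
    tracks = []
    for feat in sorted(features, key=lambda f: (f[1], f[2])):
        for track in tracks:
            # Features arrive sorted by start, so the last one on a track
            # always has the largest end position on that track.
            if feat[1] > track[-1][2]:
                track.append(feat)
                break
        else:
            tracks.append([feat])
    return tracks


# Invented example features from a single hypothetical source.
example = [("A", 1, 50), ("B", 41, 109), ("C", 110, 118), ("D", 60, 80)]
tracks = assign_tracks(example)
```

Here "B" overlaps "A" and is pushed to a second track, while "D" and "C" fit after "A" on the first.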
Moving your mouse over a feature in the display will highlight the feature and show a tooltip giving details of the feature. If the DAS source supplied a URL, you can also click on the feature to visit that URL. In some browsers you will also see a thin, vertical cursor, which follows the mouse and shows the residue position within the sequence.
You can turn DAS sources on and off using the control panel under the sequence images. Check the boxes for the DAS sources that you want to see; uncheck those that you are not interested in. Press Update to query the new set of DAS sources and re-generate the image. Note that if you have lots of sources turned on, the time taken to generate the images will increase. You can see the homepage for each of the DAS sources by clicking on its name in the update panel. The "source" link next to each source points directly at the DAS source. Depending on how the source is configured, that link may return some usage information or simply an XML fragment with the response to the empty query that you just made.
Please note: this is an experimental feature and there are several known bugs and limitations. Please be patient as we improve the tool.
Note: it can take a few seconds for this image to be generated and loaded.
This is the amino acid sequence of the UniProt sequence database entry with the accession P48431. The sequence is stored in the Pfam database and is updated only with each new Pfam release, so the copy we store may differ from the one currently held by UniProt.
MYNMMETELK PPGPQQTSGG GGGNSTAAAA GGNQKNSPDR VKRPMNAFMV
WSRGQRRKMA QENPKMHNSE ISKRLGAEWK LLSETEKRPF IDEAKRLRAL
HMKEHPDYKY RPRRKTKTLM KKDKYTLPGG LLAPGGNSMA SGVGVGAGLG
AGVNQRMDSY AHMNGWSNGS YSMMQDQLGY PQHPGLNAHG AAQMQPMHRY
DVSALQYNSM TSSQTYMNGS PTYSMSYSQQ GTPGMALGSM GSVVKSEASS
SPPVVTSSSH SRAPCQAGDL RDMISMYLPG AEVPEPAAPS RLHMSQHYQS
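The formatted blocks above can be joined into a plain string and sliced by residue number. The Python sketch below does this for the first three lines only (residues 1 to 150), using the HMG_box (41 - 109) and SOXp (110 - 118) boundaries listed in the structure-mapping table on this page; the function names are illustrative.

```python
# First three formatted lines of the sequence shown on this page.
FORMATTED = """
MYNMMETELK PPGPQQTSGG GGGNSTAAAA GGNQKNSPDR VKRPMNAFMV
WSRGQRRKMA QENPKMHNSE ISKRLGAEWK LLSETEKRPF IDEAKRLRAL
HMKEHPDYKY RPRRKTKTLM KKDKYTLPGG LLAPGGNSMA SGVGVGAGLG
"""


def unformat(text):
    """Remove the spaces and line breaks used for display."""
    return "".join(text.split())


def region(seq, start, end):
    """Return residues start..end, using 1-based inclusive coordinates."""
    return seq[start - 1:end]


seq = unformat(FORMATTED)
hmg_box = region(seq, 41, 109)   # boundaries taken from the table on this page
soxp = region(seq, 110, 118)
```

Because Pfam coordinates are 1-based and inclusive, the slice starts at `start - 1` and stops at `end`.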
For those sequences which have a structure in the Protein Data Bank (PDB), we use the mapping between the UniProt, PDB and Pfam coordinate systems from the MSD group to map Pfam domains onto three-dimensional structures. The table below shows the mapping between Pfam domains, this UniProt entry and the corresponding three-dimensional structures.
|Pfam family||UniProt residues||PDB ID||PDB chain ID||PDB residues|
|HMG_box||41 - 109||1O4X||B||208 - 276|
|HMG_box||41 - 109||2LE4||A||4 - 72|
|SOXp||110 - 118||2LE4||A||73 - 81|
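The rows above can be used to convert a UniProt residue number into the numbering of a given PDB entry. The Python sketch below is a minimal illustration that makes two assumptions: the second row is read as a continuation of the HMG_box row (UniProt residues 41 - 109), and numbering inside each mapped range is contiguous, with no gaps or insertion codes. The type and function names are illustrative.

```python
from typing import NamedTuple, Optional, Tuple


class Mapping(NamedTuple):
    family: str
    uniprot_start: int
    uniprot_end: int
    pdb_id: str
    chain: str
    pdb_start: int
    pdb_end: int


# Rows transcribed from the table on this page.
MAPPINGS = [
    Mapping("HMG_box", 41, 109, "1O4X", "B", 208, 276),
    Mapping("HMG_box", 41, 109, "2LE4", "A", 4, 72),  # assumed continuation row
    Mapping("SOXp", 110, 118, "2LE4", "A", 73, 81),
]


def to_pdb(residue: int, pdb_id: str) -> Optional[Tuple[str, int]]:
    """Map a UniProt residue to (chain, PDB residue), or None if unmapped.

    Assumes contiguous numbering within each mapped range.
    """
    for m in MAPPINGS:
        if m.pdb_id == pdb_id and m.uniprot_start <= residue <= m.uniprot_end:
            return m.chain, m.pdb_start + (residue - m.uniprot_start)
    return None
```

Under these assumptions the offset is constant within each row: +167 for 1O4X chain B and -37 for 2LE4 chain A.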
Below is a phylogenetic tree of animal genes, with ortholog and paralog assignments, from TreeFam.