
Quick introduction


Quick start


Enter a search term in the "Search for a protein" field and press search. Click the "view" button for the protein of interest.


Input options


The "Search" field is linked to the UniProt protein search engine (Fig. 1). Use any term to search for the protein in the UniParc database (e.g. Cyclin-dependent kinase inhibitor 1B). To increase the relevance of the retrieved UniParc entries, additional terms such as species and gene name can be added. If the retrieved list doesn't include the protein of interest, the UniProt accession (e.g. P46527) or UniProt ID (e.g. CDN1B_HUMAN) can be used. After the search results are retrieved, select the protein of interest from the result list. Examples of valid searches:

  • CDKN1B
  • Cyclin-dependent kinase inhibitor 1B
  • Cyclin-dependent kinase inhibitor p27
  • p27 human
  • Kip1
  • CDN1B_HUMAN
  • P46527


Fig. 1. ProViz protein search. (1) Protein search field to input search term. (2) Search button to submit search request. (3) View buttons to create visualisation of shown protein.

Custom input

The "Custom alignment" input option accepts plain text or a file upload. The input has to be in FASTA format and can contain either a single protein sequence or a multiple sequence alignment. In an alignment, the first sequence is used as the query sequence.
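As a rough illustration of the input described above, the following Python sketch parses FASTA text and treats the first record as the query sequence. The function name `parse_fasta` and the two short sequences are hypothetical and only serve as an example. The equal-length check is an assumption about what a well-formed multiple sequence alignment looks like, not a documented ProViz rule.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("FASTA input must start with a '>' header line")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Illustrative input only; these are not real ProViz data.
example = """>query_protein
MSNVRVSN-GSP
>homologue_1
MSNVRVSNAGSP
"""

records = parse_fasta(example)
query_header, query_sequence = records[0]  # the first sequence is the query
is_alignment = len(records) > 1
# Assumption: aligned sequences share one length, with gaps written as '-'.
same_length = len({len(seq) for _, seq in records}) == 1
print(query_header, is_alignment, same_length)
```

A check like this can catch a missing header line or unequal row lengths before the text is pasted into the input field.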

Fig. 2. ProViz custom alignment input. (1) Text field to input protein sequence or multiple sequence alignment in FASTA format. (2) Submit button to create the ProViz visualisation with the alignment provided. (3) File uploader to provide a protein sequence or multiple sequence alignment by uploading a FASTA file.

URL input

To access the main visualisation directly, a set of URL options can be used. The full list is available in the "URL options" section.


Main visualisation


Overview


Fig. 3. ProViz main visualisation. ProViz visualisation for Cyclin-dependent kinase inhibitor 1A (CDKN1A) showing selected features of CDKN1A and a GeneTree alignment of CDKN1A orthologues. Key aspects of the visualisation are numbered: (1) Protein name and species; (2) options sidebar; (3) data information sidebar; (4) data select, hide and help buttons; (5) information hover tooltip; (6) options toolbar; (7) protein architecture overview; (8) protein sequence data; (9) protein feature data. The visualisation in the example can be viewed here: Example

Alignment

The alignment section of the main visualisation shows the query sequence and, if available, a homologue alignment from Quest for Orthologues or GeneTree. The left panel shows the species names with links to UniProt or Ensembl. The right panel displays the alignment coloured in the ClustalX scheme.

Features

The feature section shows tracks containing information associated with the query protein. The tracks are arranged in rows grouped by the type of data shown: the left part describes the type of data and the right part shows the data in one of three available formats. Bar tracks display features that map to a continuous segment of the query protein (e.g. domains or transmembrane regions) as horizontal bars spanning the corresponding residues. Peptide tracks are similar to bar tracks, but display amino acids aligned with the corresponding residues in the query protein. Histogram tracks display quantitative data for the protein on a residue-by-residue basis, as vertical bars whose height corresponds to the value given to each residue.

Tool bar

The name of the protein and gene, together with the species, is displayed on the top left. On the right-hand side the user can choose between different alignments. The buttons to the right of the alignment selector activate panels for additional functions: uncollapsing the alignment, searching for residues by regular expression, showing the compact view, highlighting areas of interest, showing the protein architecture overview and recolouring the alignment. Reset and home buttons are at the end.


Fig. 4. Options sidebar. Six sections are available: (1) Session, (2) Sequence, (3) Architecture, (4) Selection, (5) Add new tracks and (6) Download.

Options sidebar

The options panel contains options to modify different aspects of the main visualisation. The session section allows the user to create or reset a session and download a PDF of the visualisation. The sequence section contains the compact view option. The architecture overview can be enabled in the architecture section. The selection section contains controls for the slider, focus and resize options. The add new tracks section allows the user to add a custom feature file by drag and drop or file upload. Download options for the sequence data are available in the download section.


Fig. 5. Proteins sidebar. (1) Show all proteins. (2) Hide all proteins. (3) List of hidden proteins with button to restore them.

Proteins sidebar

The protein tab of the sidebar contains a list of all hidden proteins and allows the user to restore each individually or hide and show all sequences at once.


Fig. 6. Features sidebar. (1) Show all features. (2) Hide all features. (3) Show and hide individual feature groups. (4) Show and hide individual features.

Features sidebar

The features tab of the sidebar contains all available features and feature groups. Each feature or feature group can be hidden or shown by the toggle buttons next to it. The show and hide all buttons will switch all features on or off.


Fig. 7. Help sidebar. Shows the quick introduction.

Help sidebar

The help tab of the sidebar displays the help for the main visualisation.


Fig. 8. About sidebar. ProViz description.

About sidebar

The about tab of the options sidebar contains the about section describing ProViz and key features.




Databases and programs


Databases

ProViz uses many databases to give the user a wide range of information about the protein of interest. Multiple sequence alignments are retrieved from Quest for Orthologues and GeneTree. Protein modularity data are provided by ELM, Pfam and Phospho.ELM. For structural information, PDB, DSSP and homology models from SWISS-MODEL are used. Genomics data are retrieved from DbSNP and 1000 genomes, and isoforms from UniProt. Additional curated data are available from UniProt and Switches.ELM.

Predictions

ProViz includes data from various predictive programs and databases, including conservation scores, ELM, MobiDB, IUPred, PsiPred and Anchor.

Data sources table

Name | Description | PMID | URL
Multiple sequence alignments
GeneTree | Homo/Para/orthologue alignments and gene duplication information | 19029536 | www.ensembl.org
GOPHER | Orthologue alignments by reciprocal best hit | 17576682 | bioware.ucd.ie
Quest for orthologues | Datasets of homologous genes | 18819722 | questfororthologs.org
Protein modularity
ELM | Manually curated linear motifs | 26615199 | elm.eu.org
Pfam | Functional regions and binding domains | 24288371 | pfam.xfam.org
Phospho.ELM | Experimentally verified phosphorylation sites | 21062810 | phospho.elm.eu.org
Structural information
PDB | Experimentally resolved protein tertiary structures | 10592235 | www.rcsb.org
DSSP | Secondary structure derived from PDB tertiary structures | 25352545 | swift.cmbi.ru.nl/gv/dssp
Homology models / SWISS-MODEL | Assigned tertiary structure by sequence similarity to resolved structure | 24782522 | swissmodel.expasy.org
Genomic data
DbSNP | Single-nucleotide polymorphism with disease association and genotype information | 11125122 | www.ncbi.nlm.nih.gov/SNP
1000 genomes | Single-nucleotide polymorphism | 23128226 | www.1000genomes.org
Isoforms | Alternative splicing | 25348405 | www.uniprot.org
Additional curated data
Mutagenesis | Experimentally validated point mutations and effect | 25348405 | www.uniprot.org
Regions of interest | Experimentally validated functional areas | 25348405 | www.uniprot.org
Switches.ELM | Experimentally validated motif-based molecular switches | 23550212 | switches.elm.eu.org
Prediction
MobiDB | Collection of various disorder prediction methods | 25361972 | mobidb.bio.unipd.it
IUPred | Intrinsically disordered regions | 15769473 | iupred.enzim.hu
PsiPred | Secondary structure for human proteins | 23748958 | bioinf.cs.ucl.ac.uk/psipred
Anchor | Binding sites in disordered regions | 19412530 | anchor.enzim.hu
ELM | Linear motifs by regular expression | 26615199 | elm.eu.org
Conservation | Conservation of residues across the alignment | 22977176 | bioware.ucd.ie

PDF download


The PDF download is available via buttons in the options sidebar or in the toolbar on the top right. It downloads a PDF document of the ProViz visualisation.


URL options


Users can construct URLs to access and customise protein visualisations using the URL options below:

URL options table

URL option | Description | Input type | Example
uniprot_acc | UniProt accession of the protein to be visualised | string: UniProt accession | uniprot_acc=P46527
alignment | Type of homology alignment to be displayed | string: QFO,[TaxonID] | alignment=Metazoa
ali_start | Starting residue for the scope of the alignment | integer > 0 | ali_start=10
ali_end | Ending residue for the scope of the alignment. Set to the protein length if greater than the protein length. | integer > 0 | ali_end=20
disable | Disables feature tracks by providing the names of features. This prevents loading of data for the named features; they can’t be activated without reloading the page. | string: motif, elm, modification, phospho, mutagenesis, pfam, structure, PDB, homology, splice_variant, SNP, chain, dna_binding, region, metal_binding, site, cross_link, iupred | disable=motif,modification,SNP
collapse | Collapses feature groups by providing the names of feature groups. Features are loaded and hidden, but can be activated in the options panel. | string: alignment, switch, motif, modification, mutation, structure, PDB, isoform, snp, feature, disorder | collapse=alignment,PDB
hideAln | Hide proteins by providing accessions separated by commas | string: UniProt accession | hideAln=H2Q5H2,F6Z4RO
showAln | Show proteins by providing accessions separated by commas. All other sequences will be hidden. | string: UniProt accession | showAln=H2Q5H2,F6Z4RO
genetree_mode | If alignment is set to GeneTree, select paralog, ortholog or all | string: paralog, ortholog, all | genetree_mode=paralog
url_rest | Providing a URL pointing to a custom track file will load the visualisation for the custom data automatically | string: file URL | rest_url=http://slim.ucd.ie/proviz/help/custom_track/track.xml

URL example

http://slim.ucd.ie/proviz/proviz.php?uniprot_acc=P46527&alignment=33208&disable=PDB&ali_start=140&ali_end=198
Shows the region of p27 between residues 140 and 198, and turns off the PDB track.
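URLs of this kind can also be assembled programmatically. The sketch below is one possible way to do so in Python, using the option names from the URL options table and the base address from the example. The helper `build_proviz_url` is a hypothetical name, not part of ProViz, and the sketch assumes that comma-separated values may be left unescaped in the query string, as they are in the table's examples.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base address taken from the URL example in this section.
BASE_URL = "http://slim.ucd.ie/proviz/proviz.php"


def build_proviz_url(uniprot_acc, **options):
    """Build a ProViz URL from an accession and optional URL options.

    List or tuple values are joined with commas, matching the
    comma-separated style shown in the URL options table.
    """
    params = {"uniprot_acc": uniprot_acc}
    for key, value in options.items():
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        params[key] = value
    # safe="," keeps commas readable instead of encoding them as %2C.
    return BASE_URL + "?" + urlencode(params, safe=",")


url = build_proviz_url(
    "P46527",
    alignment=33208,
    disable=["PDB"],
    ali_start=140,
    ali_end=198,
)
print(url)
```

Passing several names, for example `disable=["motif", "SNP"]`, yields `disable=motif,SNP` in the resulting URL.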


Use of custom data


Users can add custom data to any existing ProViz visualisation by providing a file in XML, CSV or JSON format via drag and drop, file upload or a link to a server that provides the file through a REST service.

XML

The XML file has to start and end with the "tracks" tag. Each track starts and ends with the "track" tag, which has the mandatory option "type" (feature, peptide or histogram) and accepts the options "name", "position", "colour" and "opacity". The "track" tag accepts multiple "entry" tags. The "entry" tag requires the "start" and "end" options for type feature or peptide, or the "position" option for type histogram. It also accepts "text" for feature, "value" for histogram and "sequence" for peptide, as well as "hover", "link", "text_colour", "colour" and "opacity". An example of the format is available at: XML
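The structure described above could be generated with Python's standard library roughly as follows. This is a sketch under one important assumption: that the "options" of the "track" and "entry" tags are written as XML attributes. The linked example file and the XML schema remain the authoritative reference. All track names, positions and values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Root element: the file starts and ends with the "tracks" tag.
tracks = ET.Element("tracks")

# A feature track; options are written as attributes (an assumption).
feature_track = ET.SubElement(
    tracks, "track",
    {"type": "feature", "name": "Example regions", "colour": "#3366CC"},
)
ET.SubElement(
    feature_track, "entry",
    {"start": "10", "end": "25", "text": "Region A",
     "hover": "Illustrative region"},
)

# A histogram track; each entry uses "position" and "value".
histogram_track = ET.SubElement(
    tracks, "track", {"type": "histogram", "name": "Example scores"},
)
for position, value in [(10, 0.2), (11, 0.8), (12, 0.5)]:
    ET.SubElement(
        histogram_track, "entry",
        {"position": str(position), "value": str(value)},
    )

xml_text = ET.tostring(tracks, encoding="unicode")
print(xml_text)
```

The printed text can be saved to a file and checked against the XML schema before it is uploaded.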

XML schema

An XML schema can be used to design and validate a ProViz readable XML file and is available for download here: XML schema


CSV

The CSV file consists of two parts, the header line and the data lines, both of which have to be comma separated. The header line specifies the available fields and is mandatory, but the fields can be in any order. The data lines contain the data to be visualised, and each line represents one element, called an entry. Fields starting with "t_" are track fields; they only have to be defined once and are automatically applied to all other lines with the same track number. Each track requires the "track", "t_type" and "t_position" fields; all other track fields are optional. The remaining fields define the shown element (entry fields). Required entry fields are "entry", either "start" and "end" or "position", and one of "text", "value" and "sequence". An example of the format is available at: CSV

CSV options

Field name | Description | Input type
track | Numbering of tracks | integer > 0
t_name | Track name | string
t_type | Track type | string: feature, peptide, histogram
t_position | Track position, -1 for track display above the main sequence and alignment, 1 for track display below the main sequence and alignment | integer: -1 / 1
t_colour | Default colour of all elements in the given track | string: #000000 - #FFFFFF
t_opacity | Default opacity of all elements in the given track | double: 0 - 1
t_text_colour | Default colour of the text of all elements in the given track | string: #000000 - #FFFFFF
t_help | Track's help tooltip | string
entry | Numbering of entries | integer > 0
text | Entry's displayed text. For feature tracks only. | string
value | Value determining the height of the bars in a histogram. For histogram tracks only. | double
sequence | A peptide's sequence. For peptide tracks only. | string
hover | Tooltip for given entry | string
link | Link activated by clicking the given entry | string: URL
start | Start position of entry. Only for feature and peptide tracks. Needs end value. | integer > 0
end | End position of entry. Only for feature and peptide tracks. Needs start value. | integer > 0
position | Position of entry. Only for histogram tracks. | integer > 0
colour | Colour of given entry. Overrides “t_colour” property. | string: #000000 - #FFFFFF
opacity | Opacity of given entry. Overrides “t_opacity” property. | double: 0 - 1
text_colour | Text colour of given entry. Overrides “t_text_colour” property. | string: #000000 - #FFFFFF
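To make the header and data line layout more concrete, here is a small Python sketch that writes such a file with the standard `csv` module. The track names, positions and values are invented for illustration. The sketch assumes that empty cells are acceptable for fields that do not apply to a line, and it numbers entries uniquely across the whole file because the description does not say whether numbering restarts per track.

```python
import csv
import io

# Header fields; according to the description they may be in any order.
fieldnames = ["track", "t_type", "t_position", "t_name",
              "entry", "start", "end", "position", "text", "value"]

rows = [
    # Track fields (t_*) are given once per track, on its first line.
    {"track": 1, "t_type": "feature", "t_position": 1,
     "t_name": "Example regions",
     "entry": 1, "start": 10, "end": 25, "text": "Region A"},
    {"track": 1, "entry": 2, "start": 40, "end": 55, "text": "Region B"},
    {"track": 2, "t_type": "histogram", "t_position": 1,
     "t_name": "Example scores",
     "entry": 3, "position": 10, "value": 0.2},
    {"track": 2, "entry": 4, "position": 11, "value": 0.8},
]

buffer = io.StringIO()
# restval="" leaves cells empty for fields a row does not define.
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

Writing to an in-memory buffer keeps the example self-contained; replacing `io.StringIO()` with an opened file (using `newline=""`) produces a file that can be uploaded.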


JSON

The JSON file contains a standard JSON object. The object is an array filled with one dictionary per track. These dictionaries contain the "type", "position", "name", "colour", "opacity", "text_colour" variables which represent the default values for all entries in the track and the "data" array filled with one dictionary per entry. The entry dictionary provides the variables "start", "end", "text", "link", "colour", "opacity", "hover" and "text_colour" to customise the entry further. An example of the format is available at: JSON

JSON options

Field name | Description | Input type
name | Track name | string
type | Track type | string: feature, peptide, histogram
position | Track position, -1 for track display above the main sequence and alignment, 1 for track display below the main sequence and alignment | integer: -1 / 1
colour | Default colour of all elements in the given track | string: #000000 - #FFFFFF
opacity | Default opacity of all elements in the given track | double: 0 - 1
text_colour | Default colour of the text of all elements in the given track | string: #000000 - #FFFFFF
help | Track's help tooltip | string
text | Entry's displayed text. For feature tracks only. | string
value | Value determining the height of the bars in a histogram. For histogram tracks only. | double
sequence | A peptide's sequence. For peptide tracks only. | string
hover | Tooltip for given entry | string
link | Link activated by clicking the given entry | string: URL
start | Start position of entry. Only for feature and peptide tracks. Needs end value. | integer > 0
end | End position of entry. Only for feature and peptide tracks. Needs start value. | integer > 0
position | Position of entry. Only for histogram tracks. | integer > 0
colour | Colour of given entry. Overrides the track's “colour” property. | string: #000000 - #FFFFFF
opacity | Opacity of given entry. Overrides the track's “opacity” property. | double: 0 - 1
text_colour | Text colour of given entry. Overrides the track's “text_colour” property. | string: #000000 - #FFFFFF
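As an illustration of the layout described in this section, the Python sketch below builds an array with one dictionary per track, each holding its defaults and a "data" array with one dictionary per entry, and serialises it with the standard `json` module. All names, positions and values are invented for the example; the linked example file remains the reference for the exact format.

```python
import json

tracks = [
    {
        # Track-level defaults apply to every entry in "data".
        "type": "feature",
        "position": 1,
        "name": "Example regions",
        "colour": "#3366CC",
        "opacity": 0.8,
        "data": [
            {"start": 10, "end": 25, "text": "Region A",
             "hover": "Illustrative region"},
            # An entry-level colour overrides the track default.
            {"start": 40, "end": 55, "text": "Region B",
             "colour": "#CC3333"},
        ],
    },
    {
        "type": "histogram",
        "position": 1,
        "name": "Example scores",
        "data": [
            {"position": 10, "value": 0.2},
            {"position": 11, "value": 0.8},
        ],
    },
]

json_text = json.dumps(tracks, indent=2)
print(json_text)
```

Loading the text back with `json.loads` is a quick way to confirm the file is valid JSON before uploading it.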