The ModFOLD server version 9.0 help page

This page contains simple guidelines for using the new version of the ModFOLD server, sample input data that may be downloaded and submitted, and examples of output from the server.

Guidelines for using the server

The ModFOLD server version 9.0 requires the amino acid sequence of your target protein and either a single 3D model file in PDB format or a tarball containing a directory of multiple separate model files in PDB format.

If you also provide your email address, then you will be sent a link to the graphical results and machine readable results when your predictions are completed. If you do not provide your email address, then you must bookmark the results page so that you can return to it when the results become available. The job status for your target will be reported on the results page prior to job completion.

Required - Sequence Data


In the text box labelled "Input sequence of protein target" please carefully paste in the full amino acid sequence for your target protein in single letter format. An example sequence (CASP15 target T1104) is shown below:

Sample sequence:

	QLEDSEVEAVAKGLEEMYANGVTEDNFKNYVKNNFAQQEISSVEEELNVNISDSCVANKI
	KDEFFAMISISAIVKAAQKKAWKELAVTVLRFAKANGLKTNAIIVAGQLALWAVQCG
	
It is important that you provide the full sequence that corresponds to the sequence of residue coordinates in the model file. If your model does not contain numbering that corresponds directly to the order of residues in the sequence file then the server will attempt to renumber the residues in the model files accordingly. However, if there are residues in a model file that are not contained in the provided sequence then the prediction for that model will not complete.

Providing a reference sequence allows all submitted models (including partial models) to be compared fairly using the same calculated sequence-based input data. It is also useful for ensuring consistent residue numbering for all submitted models.

If you include a sequence ID header in FASTA format, then this line will be removed by the server. Additionally, the server will not accept any non-standard amino acid characters in the protein sequence submission box. If you wish to provide a short name for your job, then please use the box provided (see below).
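If you would like to check a sequence locally before pasting it in, the following is a minimal sketch of the kind of clean-up described above. The function name clean_sequence is hypothetical, and the assumption that "standard" means the 20 common one-letter amino acid codes (and that converting to upper case is acceptable) is ours; the server's own checks may differ.

```python
# Assumption: "standard" amino acids are the 20 common one-letter codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(text):
    """Drop FASTA header lines and whitespace, then check the residue letters."""
    lines = [line.strip() for line in text.splitlines()]
    # Header lines in FASTA format start with ">" and are skipped.
    sequence = "".join(line for line in lines if line and not line.startswith(">"))
    # Remove any remaining spaces or tabs and normalise to upper case.
    sequence = "".join(sequence.split()).upper()
    invalid = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError("Non-standard characters found: " + ", ".join(invalid))
    return sequence
```

For example, clean_sequence(">T1104\nQLEDS\nEVEAV\n") returns the single string "QLEDSEVEAV", which can then be pasted into the sequence box.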

Required - Model Data


Using the file selector labelled "Upload model/models" you may either upload a single PDB file (to obtain quality predictions for a single model), or multiple PDB files (to obtain quality predictions for many alternative models) in the form of a tarball (a tarred and gzipped directory).

Please ensure that each separate PDB file contains the coordinates for one model only. Please do not upload a single PDB file containing the coordinates for multiple alternative models. The coordinates for multiple models should always be uploaded as a tarred and gzipped directory of separate files.
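If your models are currently stored in one multi-model PDB file, they need to be separated before upload. The sketch below is one possible way to do this, assuming the file uses the standard MODEL and ENDMDL records to delimit each model; the function name split_models is hypothetical and only ATOM, HETATM and TER records are kept.

```python
def split_models(pdb_text):
    """Return a list of strings, one per MODEL ... ENDMDL block."""
    models = []
    current = None  # lines of the model being read, or None when outside a model
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # the record name occupies columns 1-6
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            if current is not None:
                models.append("\n".join(current + ["END"]) + "\n")
            current = None
        elif current is not None and record in ("ATOM", "HETATM", "TER"):
            current.append(line)
    return models
```

Each returned string can be written to its own file (for example model_1.pdb, model_2.pdb, and so on) inside the directory that you later tar and gzip.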

The server will attempt to automatically renumber the ATOM records in each model in order to match the residue positions in the sequence i.e. the coordinates for the first residue in the sequence will be renumbered "1" in each model file (if they aren't already), the coordinates for the second residue in the sequence will be numbered "2", and so on.

Important note concerning incomplete models: If your model files contain many missing residues, e.g., models for discontinuous domains, then you will either need to provide the full length sequences, or you will need to renumber the ATOM records, so that the residues are sequential in accordance with the partial sequence. Please note that if you submit the full length sequence and there are a high number of missing residue coordinates in your models, then these will count against the models in the global scoring. In this case, the global scores may be more appropriate if you submit the partial sequences along with your correctly numbered model files, i.e., with sequential numbering of residue coordinates.
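For the partial-sequence case described above, the ATOM records need sequential residue numbers. The following is a rough sketch of such a renumbering, assuming standard fixed-column PDB formatting (chain ID in column 22, residue number in columns 23-26, insertion code in column 27). The function name renumber_sequentially is hypothetical, and this is not the server's own renumbering procedure.

```python
def renumber_sequentially(pdb_lines, start=1):
    """Renumber residues in ATOM records as start, start+1, ... in file order."""
    renumbered = []
    previous_key = None
    number = start - 1
    for line in pdb_lines:
        if line.startswith("ATOM"):
            # Chain ID + residue number + insertion code identify a residue.
            key = line[21:27]
            if key != previous_key:
                number += 1
                previous_key = key
            # Write the new number right-justified in columns 23-26
            # and blank out the insertion code in column 27.
            line = line[:22] + f"{number:>4}" + " " + line[27:]
        renumbered.append(line)
    return renumbered
```

The result should only be submitted together with the matching partial sequence, so that residue 1 in the model corresponds to the first letter of the sequence you paste in.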

Sample PDB file:
An example file containing a single model for the sample sequence shown above can be downloaded below:

Example of model built for CASP15 target T1104: T1104TS434_2.pdb

Sample Tarball file:
The tarball should contain a directory of separate PDB files for your target sequence. This file should be similar in format to the tarballs of 3D models found on the CASP website.
An example tarball file containing multiple models for the sample sequence shown above can be downloaded below:

Tarball of multiple server models for CASP15 target T1104: T1104.tar.gz

Steps to produce a tarball file for your own 3D models:
Linux/OSX/other Unix
  1. Tar up the directory containing your PDB files e.g. type the following at the command line: tar cvf my_models.tar my_models/
  2. Gzip the tar file e.g. gzip my_models.tar
  3. Upload the gzipped tar file (e.g. my_models.tar.gz) to the ModFOLD server
Windows users
In Windows you can use a free application such as 7-zip to tar and gzip your models.
  1. Download, install and run 7-zip
  2. Select the directory (folder) of model files to add to the .tar file, click "Add", select the "tar" option as the "Archive format:" and save the file as something memorable e.g. my_models.tar
  3. Select the tar file, click "Add" and then select the "GZip" option as the "Archive format:" - the file should then be saved as my_models.tar.gz
  4. Upload the gzipped tar file (e.g. my_models.tar.gz) to the ModFOLD server
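
As a platform-independent alternative to the steps above, the same kind of archive can be produced with Python's standard tarfile module. This is only a sketch: the function name make_tarball and the directory name my_models are illustrative, and the result is intended to be equivalent to running tar followed by gzip.

```python
import tarfile
from pathlib import Path


def make_tarball(model_dir, output_path=None):
    """Create a gzipped tar archive containing the model directory."""
    model_dir = Path(model_dir)
    if output_path is None:
        # Default: my_models -> my_models.tar.gz next to the directory.
        output_path = model_dir.parent / (model_dir.name + ".tar.gz")
    with tarfile.open(output_path, "w:gz") as archive:
        # arcname keeps only the directory name inside the archive,
        # not the full path on your machine.
        archive.add(model_dir, arcname=model_dir.name)
    return Path(output_path)
```

Calling make_tarball("my_models") writes my_models.tar.gz alongside the my_models directory, ready for upload.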

Program selection


  1. ModFOLD9_rank - for ranking multiple models - global scores are optimised to place the very best quality models at the top of the list, but the predicted global scores are not necessarily a close reflection of the observed scores
  2. ModFOLD9 - this method is the default option - this option gives balanced performance in terms of model ranking/selection and correlations between predicted and observed scores
  3. ModFOLD9_cor - this option gives optimised performance in terms of the correlation between predicted and observed scores, so the global scores more closely reflect the observed scores

Optional - E-mail address


If you wish, you may provide your e-mail address. You will be sent a link to the graphical results and machine readable results when your predictions are completed.

Privacy Notice: Processing of personal data will be in accordance with the GDPR and University of Reading (UoR) Data Protection Policy. Users' IP addresses will be temporarily stored in the queuing system and then used to generate anonymous usage statistics. Optionally, users may provide an email address, so that they can be notified when their job completes; this will be deleted when no longer required. Personal data will be accessible only by UoR staff managing the server, and will not be stored for longer than is necessary for the provision of the service. Your results will be available via a unique URL, which will not be posted publicly. Your sequences and structures will not be used for any other purposes and will be deleted after the expiry date (21-28 days).

Optional - Short name for sequence


If you wish, you may assign a short memorable name to your prediction job. This is useful so that you can identify particular jobs in your mailbox. This is particularly important because ModFOLD will not necessarily return your results in the order you submitted them. The set of characters you can use for the name is restricted to letters A-Z (either case), the numbers 0-9 and the following other characters: .~_- The name you specify will be included in the subject line of the e-mail messages sent to you from the server.
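
If you generate job names automatically, a quick check against the permitted characters can help to avoid rejected submissions. The sketch below encodes the character set listed above as a regular expression; the function name is_valid_job_name is hypothetical, and any length limit the server may apply is not covered.

```python
import re

# Letters (either case), digits and the characters . ~ _ -
JOB_NAME_PATTERN = re.compile(r"[A-Za-z0-9.~_-]+")


def is_valid_job_name(name):
    """Return True if every character of the name is in the permitted set."""
    return JOB_NAME_PATTERN.fullmatch(name) is not None
```

Under this check a name such as "T1104_run-2.v1" passes, whereas "my job" fails because it contains a space.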

Output from the server


The ModFOLD server version 9.0 produces a results table containing numerical and graphical prediction results. The raw machine readable prediction data is also provided in CASP QA (QMODE2) format.

Examples of output:
  1. ModFOLD9 results for CASP15 target T1104
  2. ModFOLD9 results for CASP15 target T1106s1
  3. ModFOLD9 results for CASP15 target T1109
  4. ModFOLD9 results for CASP15 target T1112
The results table is ranked according to decreasing global model quality score. The global model quality scores range between 0 and 1. In general, scores less than 0.2 indicate there may be incorrectly modelled domains and scores greater than 0.4 generally indicate more complete and confident models, which are highly similar to the native structure. If the global model quality scores are low, then the per-residue error plots can give you an idea of specific domains or regions in your protein that might be correctly modelled (green Xs will indicate where there are any missing residues in the model).

From the global scores we can calculate a p-value which represents the probability that each model is incorrect. That is to say, for a given predicted model quality score, the p-value is the proportion of models with that score that do not share any similarity with the native structure (TM-score < 0.2). Each model is also assigned a colour coded confidence level depending on the p-value:

P-value cut-off    Confidence    Description
p < 0.001          CERT          Less than a 1/1000 chance that the model is incorrect.
p < 0.01           HIGH          Less than a 1/100 chance that the model is incorrect.
p < 0.05           MEDIUM        Less than a 1/20 chance that the model is incorrect.
p < 0.1            LOW           Less than a 1/10 chance that the model is incorrect.
p > 0.1            POOR          Likely to be a poor model with little or no similarity to the native structure.
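
When post-processing the machine readable results, the cut-offs in the table can be applied with a few comparisons. The sketch below is an illustration only: the function name confidence_level is hypothetical, and because the table does not state which level applies when p is exactly 0.1, this sketch assumes POOR.

```python
def confidence_level(p_value):
    """Map a p-value to the confidence label from the table above."""
    if p_value < 0.001:
        return "CERT"
    if p_value < 0.01:
        return "HIGH"
    if p_value < 0.05:
        return "MEDIUM"
    if p_value < 0.1:
        return "LOW"
    # Assumption: p exactly equal to 0.1 is treated as POOR.
    return "POOR"
```

For example, a p-value of 0.02 falls below the 0.05 cut-off but not below 0.01, so it is labelled MEDIUM.
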
The per-residue scores indicate the predicted distance (in Angstroms) between the CA atom of the residue in the model and the CA atom of the equivalent residue in the native structure. Thumbnail images of plots depicting the per-residue error versus residue number are included in each row in the results table. Each of the thumbnails links to a page that displays a larger view of the plot and contains a further link to download a PostScript version. Each row in the table also displays a thumbnail of the 3D cartoon view of the model which is colour coded with the residue error according to the RasMol temperature colouring scheme. Each small image also links to a page that shows a larger image of the 3D view and contains a link to download a PDB file of the model with residue accuracy predictions (Angstroms) in the B-factor column. The model is also loaded into JSmol for convenient interactive viewing of per-residue errors within the browser.
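
Because the downloadable PDB file stores the predicted per-residue error in the B-factor column, the values can be extracted with simple fixed-column parsing. The sketch below assumes standard PDB column positions (atom name in columns 13-16, residue number in columns 23-26, B-factor in columns 61-66) and a single chain; the function name read_residue_errors is hypothetical.

```python
def read_residue_errors(pdb_lines):
    """Map residue number -> value in the B-factor column of its CA atom."""
    errors = {}
    for line in pdb_lines:
        # Only CA atoms are read, matching the CA-CA distance described above.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residue_number = int(line[22:26])
            errors[residue_number] = float(line[60:66])
    return errors
```

The returned dictionary can be sorted by value to list the residues with the largest predicted error, in Angstroms, first.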

How long will I have to wait for my results?


The time taken for a prediction will depend on the length of the sequence, the number of models submitted and the load on the server. For a new run on a single model you should typically receive your results back within 24 hours, once your job is running. Large batches of models (several hundred) for a single target may take several days to process. If you have already submitted a model for the same target sequence within the same week, then the reference model library for that sequence will already be available to the server (the results will be cached) and so you will receive your results back much more quickly (within a few hours).

Fair usage policy


You are only allowed to have 1 job running at a time for each IP address, so please wait until your previous job completes before submitting further data. If you already have a job running then you will be notified and your uploaded data will be deleted. Once your job has completed your IP address will be unlocked and you will be able to submit new data.

Error reporting


Check the header of the machine readable results file (provided as a link at the top of the results page) for any errors that may have occurred following file submission. Please email me for help if you encounter a persistent error.
