Critical Assessment of Fully Automated Structure Prediction


CAFASP2 RULES AND INFORMATION

ANNOUNCEMENT NO. 3. This announcement overrides our previous two announcements.

This announcement contains new rules and modifications of the procedures announced in our previous announcement. The main differences are:

  • 1. There will be two types of submissions for each target:
    • 1. Available and registered servers.
    • 2. Late-comers and unavailable servers.
  • 2. The main evaluation will be carried out only on the type 1 submissions.
  • 3. The evaluation will be carried out on 90% of the targets.
  • 4. The deadline for submitting the formatted model has been changed from two weeks, as previously stated, to the regular CASP4 deadline.

What follows is the complete CAFASP2 announcement, containing both the rules that have not changed and those that have been modified.

GOAL

The goal of CAFASP2 is to evaluate the performance of fully automatic structure prediction servers available to the community. In contrast to the normal CASP procedure, CAFASP2 will answer the question of how well servers do without any intervention of experts, i.e. how well ANY user can predict protein structure. As in CAFASP1, CAFASP2 assesses the performance of methods without the user intervention allowed in CASP.

CAFASP2 will take place as an official section of CASP4, which will be held by mid-2000. All developers of automatic prediction servers are invited to participate in CAFASP2. Interested parties are invited to send an e-mail to dfischer@cs.bgu.ac.il to be included in the CAFASP2 mailing list. Predictors wishing to include their servers in CAFASP2 should register as soon as possible at the CAFASP2 meta-server.

TERMINOLOGY

A "server" is the automated structure prediction server that participates in CAFASP. A server will have an identification name and its corresponding url. Servers for any aspect of structure prediction can participate in CAFASP.

A "server's person" is the person in charge of the maintenance and function of a server.

A "raw server output" is the output obtained by a server.

A "server's submission" is the prediction that a server's person files to CAFASP based on the raw server output (and is not necessarily the same as the raw server output; see below).

REGISTRATION TO CAFASP2

A server's person wishing to include his/her server in CAFASP2 must register as a predictor at CASP4, and state that this registration corresponds to his/her server. The server's person will receive a participant's identification. A server's person can also register as a regular CASP4 participant. Servers that run different methods should register each of the methods separately.

In addition, servers intending to participate in CAFASP must also register at the CAFASP site and receive an explicit acknowledgment that the server has been included in the list of CAFASP participants. Only those servers that have registered at both the Prediction Center and the CAFASP center will be evaluated.
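
The double-registration rule amounts to a set intersection. A minimal sketch in Python, in which the function name, the server names, and both registration lists are hypothetical placeholders rather than actual CAFASP2 participants or software:

```python
def servers_to_evaluate(prediction_center_ids, cafasp_ids):
    """Return the servers registered at both centers, sorted by name."""
    return sorted(set(prediction_center_ids) & set(cafasp_ids))


# Hypothetical registration lists (not real CAFASP2 participants).
prediction_center = ["server-a", "server-b", "server-c"]
cafasp_center = ["server-b", "server-c", "server-d"]

# Only servers present in both lists qualify for evaluation.
eligible = servers_to_evaluate(prediction_center, cafasp_center)
print(eligible)
```

A server that appears in only one of the two lists is dropped, which mirrors the rule that registration at a single center is not sufficient.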

CAFASP2 SUBMISSION PROCEDURE

As each target sequence is released, the CAFASP meta-server will submit it to all of the prediction servers and archive the raw server output (in the servers' native formats). In addition, each server's person will be responsible for reformatting their archived results into CASP4 format and for submitting them to CASP4 by the standard procedures.

We encourage (but do not require) server persons to make their best effort to keep their raw server output as close to the CASP4 format as possible.

Only the submitted entries in the CASP4 format will be evaluated, but these will be compared with the archived results to ensure that the content is identical. Any discrepancies will be publicly announced and will result in disqualification.

Notice that in the above procedure, the server's person is responsible for submitting the prediction to CASP4, using the regular CASP4 procedure. This submission must be identical in content to the output produced by his/her server, but may vary in format. Because it may be difficult to ensure that a server produces valid CASP4 formats, we allow a server's person to take the raw server output and transform it to valid CASP4 format, as long as it is identical in content. As the raw output of the registered servers is collected, it will be made available to all on the CAFASP2 web site. This will allow anybody to use the predictions from the automated servers for other purposes.

Let's repeat the above: Upon release of a target by the CASP4 organizers, the CAFASP2 meta-server submits it to every registered server. The meta-server compiles the raw results of each server and stores them. The compiled results of the meta-server will be available to all.

Then the server's person can look at what the CAFASP2 meta-server has collected from his/her server and produce, by whatever means he/she wants, a CASP4-compatible file that must be identical in content to the information stored at the meta-server. It is the server's person's responsibility to submit the formatted prediction directly to CASP4. The validation process will check that the formatted submissions are identical in content to the ones stored at the meta-server.
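
To illustrate what "identical in content but different in format" can mean, here is a minimal sketch in Python. It assumes that both the raw output and the formatted submission can be parsed into a common representation and then compared. The two layouts, the parser functions, and the example strings are invented for illustration; they are not the actual CASP4 format, any server's native format, or the validation program used by CAFASP.

```python
def identical_in_content(raw_text, formatted_text, parse_raw, parse_formatted):
    """True if both texts reduce to the same parsed prediction."""
    return parse_raw(raw_text) == parse_formatted(formatted_text)


# Two invented layouts for a secondary structure prediction.
def parse_raw(text):
    # Raw layout (hypothetical): state letters on one or more lines.
    return list("".join(text.split()))


def parse_formatted(text):
    # Formatted layout (hypothetical): "<residue index> <state>" per line.
    return [line.split()[1] for line in text.splitlines() if line.strip()]


raw = "HHHE\nEC"
formatted = "1 H\n2 H\n3 H\n4 E\n5 E\n6 C\n"
edited = "1 H\n2 H\n3 H\n4 H\n5 E\n6 C\n"  # residue 4 changed by hand

same = identical_in_content(raw, formatted, parse_raw, parse_formatted)
differs = identical_in_content(raw, edited, parse_raw, parse_formatted)
print(same, differs)  # prints "True False"
```

Passing the parsers as arguments keeps the comparison independent of any one server's native format, which matters because the raw output may be archived in any format.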

SUBMISSION DEADLINES

The CAFASP deadline for receiving the automated "raw-output" from the servers will be 48 hours after the target was sent.

The CAFASP deadline for submission of the formatted server's submission will be the same as the regular CASP4 deadline. Notice that the server's person is responsible for formatting the prediction and for submitting it to CASP4 using the regular procedure. Notice also that, in addition to the formatted submission, the CAFASP meta-server will collect the results of each registered server immediately after the publication of each CASP4 target, and that the formatted submission must be identical in content to the stored results. The automated results can be in any format, as long as they contain all the information needed to verify that the submitted prediction is identical in content to the automated results. In addition, ALL the results collected by the CAFASP meta-server will be made immediately available to the public.

SUBMITTING TO THE PREDICTION CENTER AND TO THE CAFASP CENTER

As stated before, the "raw-server output" will be collected by the CAFASP metaserver, and no human will intervene in this process. However, the valid-format CASP4 submission file must be prepared by the server-person. This file needs to be submitted to the Prediction Center by the CASP4 deadline. Submissions must be identical in content to the raw-server output stored at CAFASP, otherwise they will be disqualified. Only those submissions submitted at both centers will be evaluated.

TYPES OF SUBMISSIONS

For each target there will be two main types of submissions:

  • 1. AVAILABLE AND REGISTERED SERVERS: Those registered servers for which the CAFASP2 meta-server was able to compile their raw output within 48 hours after each target is released.
  • 2. LATE-COMERS/UNAVAILABLE SERVERS: Those servers that did not succeed in sending their raw output to the CAFASP2 meta-server within the 48 hours.

For the main CAFASP evaluation, only predictions of type 1 will be considered, and validation of the formatted submissions will be carried out only for predictions of type 1. Type 2 is intended for servers that for any reason were not available within the 48-hour deadline, so that no raw output could be collected. Predictors submitting in this category implicitly state that the prediction reflects a fully automated process, but this will not be validated by CAFASP. An additional CAFASP evaluation will be carried out for predictions of type 2, based on the manually prepared, formatted submissions. Because these cannot be considered fully automated predictions, they will not be validated.
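
The split between the two types follows from the 48-hour raw-output deadline. A minimal sketch in Python, assuming that timestamps for when a target was sent and when raw output was collected are available. The function name, the example dates, and the choice to treat output arriving exactly at the deadline as on time are assumptions made for illustration, not part of the rules.

```python
from datetime import datetime, timedelta

# The 48-hour window for collecting raw server output.
RAW_OUTPUT_WINDOW = timedelta(hours=48)


def submission_type(target_sent_at, raw_output_received_at):
    """Return 1 if raw output arrived within 48 hours, else 2.

    raw_output_received_at is None when no raw output was collected.
    """
    if raw_output_received_at is None:
        return 2
    deadline = target_sent_at + RAW_OUTPUT_WINDOW
    # Assumption: output arriving exactly at the deadline counts as on time.
    return 1 if raw_output_received_at <= deadline else 2


# Hypothetical timestamps for one target.
sent = datetime(2000, 6, 1, 12, 0)
on_time = submission_type(sent, datetime(2000, 6, 2, 9, 30))
late = submission_type(sent, datetime(2000, 6, 3, 12, 1))
missing = submission_type(sent, None)
print(on_time, late, missing)  # prints "1 2 2"
```

Only predictions classified as type 1 would enter the main evaluation and the content validation; type 2 predictions would go to the additional evaluation.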

UN-REGISTRATION

Any CAFASP participant can ask to withdraw his/her participation at any time, in which case his/her server's url will be removed from the CAFASP meta-server.

CATEGORIES FOR CAFASP2

Any automated server can participate in CAFASP2, including Homology Modeling, Threading, Ab-Initio, Secondary Structure Prediction and Contacts Prediction.

ASSESSMENT OF PREDICTIONS

Predictions submitted to CAFASP will undergo exactly the same evaluation procedure as normal CASP submissions. In addition, the CAFASP sub-committees may decide to apply additional evaluation procedures. The latter will include only automatic evaluators, and the measures used by them will be made public as soon as possible. More details about the additional CAFASP evaluation are available here. A comparison of the performance of CAFASP2 versus the regular CASP4 submissions will also be carried out.

The additional automatic evaluation of the CAFASP2 results is another difference between CASP and CAFASP. The automatic evaluation has the advantages of being reproducible, quantitative and objective. However, as it is difficult to agree upon a single evaluation measure, we encourage all server persons to understand the evaluation methods BEFORE they register, and if they find them inadequate, they may choose not to participate in CAFASP. The automatic evaluation frees CAFASP2 from the "assessment" problem, as it will be a program (described in advance) that will do the rankings. Please check this site soon to see the description of the evaluation methods.

CAFASP COMMITTEES

Currently the people involved in the CAFASP committee are Leszek Rychlewski (leszek@bioinfo.pl), Arne Elofsson (arne@razor.biokemi.su.se), Burkhard Rost (rost@columbia.edu), Adam Zemla (adamz@llnl.gov), Krzysztof Fidelis (fidelis@llnl.gov), Naomi Siew (nomsiew@cs.bgu.ac.il) and Daniel Fischer (dfischer@cs.bgu.ac.il). Our advisor is currently Steven Brenner (brenner@compbio.berkeley.edu). Sub-committees for homology modeling, threading, ab-initio and secondary structure servers will be coordinated by the CAFASP committee. The sub-committees will be in charge of the additional automated evaluation (if required) and of the comparative analysis of the results.

As of today these are the sub-committees and people assigned to them:

Threading sub-committee: Leszek Rychlewski (leszek@bioinfo.pl), Arne Elofsson (arne@razor.biokemi.su.se) and Daniel Fischer (dfischer@cs.bgu.ac.il).

Ab initio sub-committee: Angel Ortiz (ortiz@scripps.edu).

Homology Modeling sub-committee: Roland L. Dunbrack (RL_Dunbrack@fccc.edu).

Secondary Structure sub-committee: Burkhard Rost (rost@columbia.edu) and James Cuff (james@ebi.ac.uk).

Contacts Prediction sub-committee: Alfonso Valencia (valencia@cnb.uam.es).

We invite interested parties to be involved in the various sub-committees.

CAFASP2 url: http://www.cs.bgu.ac.il/~dfischer/CAFASP2