CAFASP1
 
CASP3: CACASP or CAFASP?

A paper describing CAFASP-1 is in press in the 1999 Special Issue of Proteins. The following tables from the paper are available here:

The submitted paper in latex.

What follows includes a general description of CAFASP, the rules of CAFASP1, the detailed results per group and target and other data.

One of the most important events in the protein structure prediction community is the Critical Assessment of Structure Prediction Meeting, which was held in December 1998.

As a complement to the enormous value of this event, here I propose to discuss the differences between the Critical Assessment of COMPUTER AIDED Structure Prediction (CACASP) and the Critical Assessment of FULLY AUTOMATED Structure Prediction (CAFASP).

In the current recipe of CASP, predictors submit their predictions using whatever techniques they choose: some report the exact results of fully automated methods without any changes; some "edit" the results from the programs and create a "computer aided" + human prediction; and some use mostly their brains with little aid from computers. Thus, the current recipe assesses the performance of humans and does not allow for a strict assessment of the methods themselves. The latter is what non-expert users are ultimately interested in. Predictions in which human intervention has played a role cannot always be objectively assessed and, in most cases, they are hard, if not impossible, for others to reproduce. The value of the current critical assessment is thus not fully exploited.

What I propose is to take advantage of the invaluable framework provided by CASP3 and to run, in parallel and independently, a CAFASP1 evaluation. To this end, for each target in the fold recognition track, results from fully automated programs and servers will be compiled and made available to all through the web, along with their assessment.



Registration to: dfischer@cs.bgu.ac.il, before 12.31.98.


Servers registered:
1. 3D-PSSM (Sternberg) sternber@icrf.icnet.uk
2. Karplus karplus@cse.ucsc.edu
3. frsvr (Fischer) dfischer@cs.bgu.ac.il
4. pscan (Elofsson) arne@bimbo.biokemi.su.se
5. BASIC (Godzik) adam@scripps.edu
6. GenTHREADER jones@globin.bio.warwick.ac.uk
7. Valentina di Francesco valedf@tigr.org
8. TOPITS (Rost) Burkhard.Rost@EMBL-Heidelberg.de
9. Bork
plus other PSI-BLAST servers that we will add.

Others who have expressed interest:
Other servers that did not register: 123D, H3P2

Deadline for submissions: 1.8.99
    Should have been as in CASP3, but we missed it, so it
    should be immediate, but we can't, so it
    should be next week, but it is Christmas, so it
    should be the following week, but it is New Year, so
    we will settle for 1.8.99 until 11:59 PM (GMT).

    Any server not ready by that date will be able to participate in CAFASP2.



Assessment of predictions: 1.15.99. The assessment results will be published at this URL.


Submission procedure: For each target, each participant will submit to dfischer@cs.bgu.ac.il the results of
his/her own server in the following ASCII format:

 SERVERNAME:    xxxx
 TARGET:        T00xx
 PARAMETERS:    DEFAULT / others
 URL:           local url where (i) an exact copy of this submission and (ii) the exact server's
                output will be kept
 SERVER'S URL:  url where submissions to this server are accepted

followed by 10 to 15 lines each containing 2 mandatory columns and 3 optional ones:

RANK  FOLD  SCORE  LENGTH_ALIGNMENT   SEQ_ID%

After this list, the keyword END must appear. After the END, the predictor can include the sequence-structure alignments in any
format. See example.
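As an illustration of the format just described, here is a minimal Python sketch that reads one submission. The function name `parse_submission` is hypothetical, and the sketch assumes that each header value fits on a single line and that the hit columns are separated by whitespace. The field names, the END keyword and the 10 to 15 line rule are taken from the description above.

```python
from typing import Dict, List, Tuple

# Header keywords as listed in the submission format.
HEADER_KEYS = ("SERVERNAME", "TARGET", "PARAMETERS", "URL", "SERVER'S URL")
OPTIONAL_COLUMNS = ("score", "length_alignment", "seq_id")


def parse_submission(text: str) -> Tuple[Dict[str, str], List[dict], str]:
    """Split a submission into header fields, ranked hits and free-form alignments."""
    header: Dict[str, str] = {}
    hits: List[dict] = []
    lines = text.splitlines()
    end_index = None
    for i, line in enumerate(lines):
        stripped = line.strip()
        if not stripped:
            continue
        if stripped == "END":
            end_index = i
            break
        key, sep, value = stripped.partition(":")
        if sep and key.strip() in HEADER_KEYS:
            header[key.strip()] = value.strip()
            continue
        if stripped.upper().startswith("RANK"):
            continue  # skip an optional column-title line
        fields = stripped.split()
        if len(fields) < 2:
            raise ValueError(f"hit line needs RANK and FOLD: {line!r}")
        hit = {"rank": int(fields[0]), "fold": fields[1]}
        # Up to three optional numeric columns may follow the two mandatory ones.
        for name, raw in zip(OPTIONAL_COLUMNS, fields[2:5]):
            hit[name] = float(raw.rstrip("%"))
        hits.append(hit)
    if end_index is None:
        raise ValueError("missing END keyword")
    if not 10 <= len(hits) <= 15:
        raise ValueError(f"expected 10 to 15 hit lines, got {len(hits)}")
    alignments = "\n".join(lines[end_index + 1:])
    return header, hits, alignments
```

Everything after END is returned untouched, since the alignments may be in any format.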

If parameter setting is required, the parameters must be described. However, the same
setting must be used for all targets!

If the SCORE column is provided, a "documentation" line explaining the reliability of the
scores can be included such as: "scores in the range 3-3.5 are 70% of the time correct".

To help in the assessment process and to avoid future misunderstandings, each submission can be accompanied by an "auto-assessment" stating at which ranks the predictor believes he/she got the correct fold. If the predictor doesn't know, he/she can say so. Obviously, the assessment procedure will not use this annotation; it may only help in defining our list of correct hits, which in some cases is not easy to determine.

To compile information on alignments, we encourage predictors to include in their submissions the alignments for each top hit in the exact casp3 format. We may eventually use this data to assess alignment quality. However, we will not enforce this at this time.

Notice that for those targets for which the structural match comprises only part of the sequence, and for which we can define this unambiguously, the server must be run using the partial sequence. For now we also require a submission for the full target sequence. We will see later how to account for these targets.

It may be the case that for some of the targets below we won't be able to define an unambiguous structural match. At this point we encourage submitting all of them, but we may remove a few of them from the assessment.



Submit to: dfischer@cs.bgu.ac.il by 1.8.99 and publish at a local URL for reference. As soon as a submission is received, I will add it to the table of results (see below).


Validation of submissions: Each participant will be assigned two "policemen" who will be responsible
   for validating that the submission corresponds exactly to the output of the server. Inconsistencies will be
   reported to all participants, and a vote will be carried out to decide whether the submission should be
   disqualified. The predictor will have a chance to correct his/her submission once. A disqualified target will receive a penalty of 3 points. Policing will be in effect until the proofs of the paper are returned.



Writing the paper: Each active participant who has submitted predictions, has acted as policeman for
  at least two other servers, and has been active in assessing the results, will have the right to become a co-author.
After the assessment results are published on 1.11.99, a draft will be distributed among co-authors for
comments/suggestions, to be received by 1.25.99. The manuscript will be submitted on 1.30.99.

The best servers will in addition have the right to write one paragraph of 100-150 words describing
how their automated results differ from the CASP3 submissions, what was done differently for CASP3,
both for good or for bad, etc.

Any participant may request to be removed from this experiment with or without explicit mention.



Assessing procedure:  The goal of CAFASP1 is to experiment with the assessment of automated methods and
 to publish the results by 1.1999. Thus, there is no time to devise complicated assessment procedures.
For CAFASP1 we will only make a "CASP1-like" type of assessment, i.e. was the correct fold identified
within the first 10-15 hits? NOTE: alignment quality is not taken into account.

CAFASP1 is a fully democratic event. All rules and decisions will be voted on. Exact rules will be known to all before the deadlines. Exact dates will be published in due time. If delays appear, we will inform all. All programs developed for CAFASP1 will be available here to all. All submission and assessment data will be available to all at all times.

Targets for which a non-ambiguous scop fold can be assigned will be considered. A correct hit is one that
has the same scop classification (first 2 numbers) as the target (see D. Jones' alternative suggestion). The "correct" hits were decided using some data D. Jones compiled at: ftp://ribosome.bio.warwick.ac.uk/pub/CAFASP1/comparisons. For a correct hit at rank i, 1/i points will be credited.
As more than one correct hit can appear in the list, only the best-ranked correct hit will be considered.
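The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the names `scop_fold` and `target_points` are hypothetical, and the sketch assumes that each ranked hit has already been mapped to a scop identifier such as 4.97.1 (the servers themselves may report pdb codes).

```python
from typing import Iterable, Set


def scop_fold(scop_id: str) -> str:
    """Keep the first two numbers of a scop identifier, e.g. '4.97.1' -> '4.97'."""
    return ".".join(scop_id.split(".")[:2])


def target_points(ranked_scop_ids: Iterable[str], correct_folds: Set[str]) -> float:
    """Return 1/i for the best-ranked correct hit (ranks start at 1), or 0.0 if none."""
    for rank, scop_id in enumerate(ranked_scop_ids, start=1):
        if scop_fold(scop_id) in correct_folds:
            return 1.0 / rank  # later correct hits are ignored
    return 0.0
```

For example, a list whose first correct hit appears at rank 2 earns 0.5 points, no matter how many other correct hits follow it.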

A server may execute more than one single method, but no more than 7. A method is defined as a single
set of parameters and programs, the same for all targets.

The result of this experiment is a table of performance. Its columns are the targets and its rows the
methods used. The results are divided into 5 categories:
1) targets with members at the family level (easy, homology modeling targets),
2) targets with members at the superfamily level and with an unambiguously identifiable fold,
3) targets with members at the fold level and with an unambiguously identifiable fold,
4) targets with no unambiguously identifiable fold or with similarities beyond current capabilities of structural comparison and/or fold recognition methods, and
5) targets with yet unsolved structure (still blind predictions).
 
 

TARGETS WITH MEMBERS AT THE FAMILY LEVEL
T0055 T0057 T0068 T0070 T0062 TOTAL
frsvr_SDP 1 1 1 1 1 5
frsvr_SDPMA 1 1 1 1 1 5
Karplus1 1 1 1 1 1 5
Karplus2 1 1 1 1 1 5
Karplus3 1 1 1 1 1 5
Topits 1 1 0 2 1 3.5
GenThreader 1 1 1 1 1 5
3D-PSSM 1 1 2 1 1 4.5
(1D+3D)-PSSM 1 1 1 1 1 5.0
pscan 1 1 0 1 1 4
BASIC 1 1 1 1 1 5
frsvr_SDPMA2 1 1 1 1 1 5
PSINCBI 1 1 1 1 5
PSIBork 1 1 1 1 1 5
total 14 14 10.5 12.5 12
OTHER
LATE ARRIVALS SHOWN BUT ONLY PARTIALLY EVALUATED:
H3P2 1 2 0 0 7 1.64
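In the tables, each per-target entry is the rank at which the correct fold was found (0 if it was not found). In the TOTAL column the number of points is shown, followed by a normalized number, which is computed as
points squared divided by the total number of points in the target, summed over the targets.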
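As a worked check of this arithmetic, the sketch below recomputes the TOTAL entry for the frsvr_SDP row of the superfamily-level table. The function names are hypothetical. It assumes that the per-target entries are ranks with 0 meaning no correct hit, that the normalized number is summed over targets, and that the table prints it multiplied by 100; these assumptions reproduce the printed value for this row.

```python
def points_from_ranks(ranks):
    """Sum 1/rank over targets; a rank of 0 means no correct hit and earns nothing."""
    return sum(1.0 / r for r in ranks if r > 0)


def normalized_score(ranks, target_totals):
    """Sum over targets of (points squared) / (total points of all methods on that target)."""
    score = 0.0
    for r, total in zip(ranks, target_totals):
        if r > 0 and total > 0:
            score += (1.0 / r) ** 2 / total
    return score


# Values copied from the superfamily-level table: the frsvr_SDP row and the "total" row.
frsvr_sdp_ranks = [1, 0, 3, 10, 2, 6, 0, 2, 0]
target_totals = [11.67, 4.10, 7.83, 1.16, 5.94, 3.95, 0.39, 3.13, 0.17]

points = points_from_ranks(frsvr_sdp_ranks)
normalized = normalized_score(frsvr_sdp_ranks, target_totals)
print(f"{points:.2f} / {round(100 * normalized):02d}")  # prints "2.60 / 24"
```

The normalization gives more weight to a correct hit on a target that few methods solved, because that target's total is small.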

TARGETS WITH MEMBERS AT THE SUPERFAMILY LEVEL (AND WITH AN UNAMBIGUOUSLY IDENTIFIABLE FOLD):

T0074 T0081 T0083 T0063 T0053 T0044 T0054 T0085 T0080 TOTAL
frsvr_SDP 1 0 3 10 2 6 0 2 0 2.60 / 24
frsvr_SDPMA 1 3 1 2 3 4 0 15 0 3.48 / 49
Karplus1 1 4 1 8 23 5 0 10 0 2.72 / 26
Karplus2 1 0 1 0 0 1 13 6 0 3.24 / 49
Karplus3 1 12 1 10 16 1 9 6 0 3.52 / 52
Topits 1 0 0 0 1 0 0 0 0 2.00 / 25
GenThreader 1 1 1 0 1 0 0 1 0 5.00 / 95
3D-PSSM 4 1 0 0 1 0 0 0 0 2.25 /
(1D+3D)-PSSM 2 1 0 0 1 0 0 0 0 2.50 /
pscan 3 3 1 0 0 0 0 0 0 1.67 / 16
BASIC 1 0 1 0 0 1 0 8 0 3.12 / 47
frsvr_SDPMA2 1 10 2 3 1 3 0 1 6 4.43 / 90
PSINCBI 1 0 0 0 0 0 0 0 0 1.00
PSIBork 1 0 0 0 0 0 0 0 1.00 / 09
total 11.67 4.10 7.83 1.16 5.94 3.95 0.39 3.13 0.17
OTHER:
H3P2 23 0 19 0 1 N.A. 9 0 0 1.26

TARGETS WITH MEMBERS AT THE FOLD LEVEL:

T0046 T0071 T0043 T0067 T0059 TOTAL
frsvr_SDP 2 0 0 10 14 0.67 / 04
frsvr_SDPMA 2 0 0 3 6 1.00 / 14
Karplus1 1 0 1 0 16 2.06 / 71
Karplus2 1 10 0 0 0 1.10 / 12
Karplus3 1 6 3 16 16 1.62 / 22
Topits 1 0 3 0 0 1.33 / 17
GenThreader 1 9 0 0 0 1.11 / 12
3D-PSSM 1 0 0 4 0 1.25 /
(1D+3D)-PSSM 1 0 0 3 0 1.33 /
pscan 1 0 0 0 0 1.00 / 10
BASIC 1 4 0 3 0 1.58 / 26
frsvr_SDPMA2 2 0 0 6 8 0.79 / 07
PSINCBI 0 0 0.00 / 00
PSIBork 0 0 0 0 0.00 / 00
total 9.75 0.63 1.67 1.83 0.49

TARGETS WITH DOMAIN DEFINITIONS IN THE ABOVE TABLES, FOR WHICH WE SUBMIT THE INDIVIDUAL DOMAINS

T0083.1 T0063.1 T0063.2 T0071.1 T0071.2 T0079.1 T0079.2 TOTAL
frsvr_SDP 1 0 6 6 0 1 2 2.83 / 26
frsvr_SDPMA 1 2 1 5 11 1 1 4.78 / 82
PSIBork 0 0 0.00 / 00
Karplus1 1 0 9 0 0 1 3 2.44 / 22
Karplus2 1 0 0 10 0 1 1 3.10 / 34
Karplus3 1 0 10 5 0 1 1 3.30 / 36
Topits * 0 * 0 * 0 * 0 * 0 * 0 * 0 0.00 / 00
GenThreader 1 0 8 2 0 1 6 2.79 / 35
PSINCBI
3D-PSSM 3 0 1 6 0 4 4 2.00 / 26
(1D+3D)-PSSM 3 9 1 4 9 1 1 3.81 / 53
pscan 1 0 0 13 0 1 0 2.08 / 20
BASIC 1 0 0 0 2 1 1 3.50 / 70
frsvr_SDPMA2 1 2 1 10 0 1 1 4.60 / 79
total 10.00 1.42 4.50 1.99 0.71 10.25 7.08

TARGETS WITH UNSOLVED STRUCTURE:

T0045 T0051 T0072 T0078
frsvr_SDP ? 1be1 7.0 ; 1ldn 4.9; 1sha ? 1min 4.0 ; 1grl 3.3 ; 1ktq ?1aok 4.4 ; 1knt 4.0 ; 1vvc ?1rhd 4.3 ; 8adh 4.1 ; 1tmf
frsvr_SDPMA ? 1be1 5.3 ; 3chy 4.4 ; 3grs ?1min 3.9 ; 1tpt 3.4 ; 2dld ?1knt 3.7 ; 2hpp 3.6 ; 1bhp ?1def 3.9 ; 1ab2 3.6 ; 3gap
PSIBork 0 0/0 0/0 0/0
Karplus1 ? 1smn -4.6; 4pgm -4.6 ; 3pgm ? 1req -7.7 ; 1etu -6.1 ; 1aip ? 1ahj -5.2 ; 1vvc -5.0 ; 1bak ? 1chk -5.3 ; 1ps1 -4.7 ; 1nqb
Karplus2 ? 1apm -6.2; 1atp -6.2 ; 2cpk ? 1svb -6.3 ; 1iag -5.1 ; 1aei ? 1ktx -5.1 ; 2ktx -5.1 ; 1rfs ? 2fbj -6.2 ; 1cjl -5.3 ; 1aw8
Karplus3 ? 2mev -6.0; 1apm -5.9 ; 1aqz ? 1etu -8.0 ; 1efm -8.0 ; 1req ? 1vcc -8.1 ; 2ktx -7.0 ; 1ktx ? 1ryt -6.7 ; 1ctj -6.0 ; 1chk
Topits ? 1hmy 2.5; 1pii 2.5; 3pfk ?1rlr 4.7 ; 1req 3.5 ; 1sly ? 1idk 3.3 ; 1bcp 2.9 ; 1kit ? 1pys 2.6 ; 1nfk 2.5 ; 1eft
GenThreader ? 1pva .52; 1sct .51; 1rtp ?1pkp .08 ; 1mio 0.07 ; 1cdc ? 1kpt .20 ; 1esl .16 ; 2pld ? 1lyb .13 ; 1lmk .09 ; 1lht
PSINCBI ?2hsd 9.8 0
3D-PSSM ? 1dup 5.9; 1etp 8.3; 1fcd ? 1sly 0.6; 1req 0.8; 1req ? 1fbr 0.5; 2psp 2.1; 1whp ? 1lmw 3.8; 1tbg 4.2; 1aij
(1D+3D)-PSSM ? 1wdc 5.6; 1fcd 5.8; 1fcd ? 1req 0.5; 1sly 0.7; 1req ? 1fbr 1.4; 1dan 6.8; 1nfk ? 1dru 2.0; 1amp 2.2;1tib
pscan ? 4cpv 4.9; 1sha 3.5; 1aep ? 1ipd 2.8 ; 6xia 2.8 ; 1vsg ? 2abx 5.3 ; 1fkf 4.2 ; 1gat ? 1hlb 3.4 ; 1cse 3.0 ; 1fha
BASIC ? 1rss 4.5; 1hus 4.3; 1pvi ? 1ydv 6.1 ; 1lgr 5.9 ; 1ypi ? 1ctl 5.3 ; 1qli 5.2 ; 1fle ? 1hvq 4.8 ; 1cnv 4.6 ; 1bpb
frsvr_SDPMA2 ? 1be1 4.5; 1scu 3.7; 1wab ?1min 3.6 ; 1fok 3.2 ; 1taq ? 2hpp 4.0 ; 1hfh 3.6 ; 1knt ? 1rhd 4.4 1lap 3.7 ; 1sva

TARGETS WITH NEW FOLDS
T0052 T0056
frsvr_SDP HIGHEST SCORE: 4.52 HIGHEST SCORE: 3.70 MAX SCORE: 4.52
frsvr_SDPMA HIGHEST SCORE: 4.30 HIGHEST SCORE: 3.05 MAX SCORE: 4.30
PSIBork
Karplus1 HIGHEST SCORE: -4.06 HIGHEST SCORE: -3.85 MAX SCORE: -4.06
Karplus2 HIGHEST SCORE: -162.05 HIGHEST SCORE: -5.64 MAX SCORE: -162.05
Karplus3 HIGHEST SCORE: -7.18 HIGHEST SCORE: -6.53 MAX SCORE: -7.18
topits HIGHEST SCORE: 3.09 HIGHEST SCORE: 2.46 MAX SCORE: 3.09
GenTHREADER HIGHEST SCORE: 0.420 HIGHEST SCORE: 0.365 MAX SCORE: 0.420
PSINCBI
3D-PSSM HIGHEST SCORE: 2.99 HIGHEST SCORE: 6.09 MAX SCORE: 2.99
(1D+3D)-PSSM HIGHEST SCORE: 6.36 HIGHEST SCORE: 6.44 MAX SCORE: 6.36
pscan
BASIC HIGHEST SCORE: 25.6 HIGHEST SCORE: 4.6 MAX SCORE: 25.6
frsvr_SDPMA2 HIGHEST SCORE: 3.57 HIGHEST SCORE: 3.20 MAX SCORE: 3.57

TARGETS WITH MEMBERS AT THE FOLD LEVEL BUT IMPOSSIBLE TO FIND OR EVALUATE (OR NO HITS BY VAST):

T0061 T0075 T0077 T0079 TOTAL
frsvr_SDP 1/1/0.00
frsvr_SDPMA 1/1/0.??
PSIBork 0 / 0
Karplus1 1/1 / 0.??
Karplus2 1/1 / 0.??
Karplus3 1/1 / 0.??
Topits 0 / 0.00
GenThreader 1/?/0.?? 1/?/0.?? 1/?/0.?? 1/1/0.??
PSINCBI
3D-PSSM ?/0.?? ?/0.?? ?/0.?? 1/7 / 0.09
(1D+3D)-PSSM 1/?/0.?? 1/?/0.?? 1/?/0.?? 0/0.00
pscan ? / ?.00 ? / ?.00 ? / ?.00 1/1/0.??
frsvr_SDPMA2
total

OTHER TARGETS (NOT EVALUATED)?
T0084 T0064 T0065 T0079.1.1 T0079.2.1
frsvr_SDP
frsvr_SDPMA
GenTHREADER 1/1/0.?? 1/1/0.?? 1/?/0.??
(1D+3D)-PSSM 1/3/0.?? 1/?/0.??
BASIC 1/2/0.??



The CAFASP1 targets:
 
 

TARGETS WITH MEMBERS AT THE FAMILY LEVEL (<30% Seq. Id.)

TARGET CORRECT SCOP ID. (a pdb example)
t0055 4.97.1 (1esl)
t0057 4.41.1 (1gd1o)
t0068 2.56.1 (1rmg)
t0070 6.7.1 (2por)
t0062 NOT SOLVED YET (2cnd)

TARGETS WITH MEMBERS AT THE SUPERFAMILY LEVEL:

TARGET CORRECT SCOP ID. (a pdb example)
t0074 1.34.1 (3ctn) scop list
t0081 (1jdbe) scop list
t0044 4.35 (1eps,1naw,1a2n,1uae) scop list
t0083 1.30 (1lmb3) scop list
t0054 4.31 (1lbu, 1vhh) scop list
t0053 3.72 (1ak1) scop list
t0063 2.26 (1ah9) PLUS SH3 scop list
t0085 1fgja scop list
t0080 3.46 1fmta scop list

TARGETS WITH MEMBERS AT THE FOLD LEVEL AND/OR WITH PARTIAL TOPOLOGY: (TARGETS FOR WHICH AN UNAMBIGUOUS SCOP ASSIGNMENT CAN NOT BE MADE WILL NOT BE ASSESSED AT THIS POINT!)

TARGET CORRECT SCOP ID. (a pdb example) IF DOMAIN, FROM-TO
t0046 2.1 (3hhrc) scop list
t0071 2.1 (1a2yb) PLUS 4.74 (1bv1) - also perhaps 4.61 (3nul) scop list
t0067 2.1 (1ttg, 1ajw) scop list
t0043 4.33 (1npk) scop list
t0059 2.21 (1vie) scop list

TARGETS WITH DOMAIN DEFINITIONS IN THE ABOVE TABLES, FOR WHICH WE SUBMIT ALSO THE DOMAIN(S)

TARGET CORRECT SCOP ID. (a pdb example)
t0083.1 1.30 (1lmb3) scop list 1-105
t0063.1 SH3 (1vie) 1-70 scop list
t0063.2 2.26 (1ah9) scop list 71-138
t0071.1 2.1 (1a2yb) scop list 1-121
t0071.2 4.74 (1bv1) - also perhaps 4.61 (3nul) scop list 122-238
t0079.1 1-75 HTH scop list
t0079.2 55-124 HTH scop list
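Running a server on a single domain means cutting the target sequence at the FROM-TO positions listed above. A minimal sketch follows, assuming the positions are 1-based and inclusive residue numbers (the table does not say so explicitly); the helper name `domain_sequence` is hypothetical.

```python
def domain_sequence(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a target sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(
            f"range {start}-{end} is outside a sequence of length {len(sequence)}"
        )
    return sequence[start - 1:end]


# Domain boundaries for t0063 as listed in the table above.
T0063_DOMAINS = {"t0063.1": (1, 70), "t0063.2": (71, 138)}
```

With these boundaries the two t0063 domains are 70 and 68 residues long and do not overlap, whereas the two t0079 ranges (1-75 and 55-124) share residues 55-75.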

TARGETS THAT ARE IMPOSSIBLE TO FIND (NO VAST HIT OR SIMPLY NOT SUITABLE FOR FOLD RECOGNITION):

TARGET CORRECT SCOP ID. (a pdb example)
t0061 1pysb_400-474???? VAST: 20 residues out of 75: TALIGN T0061 0 32 40 1PYS B 3 407 415 1 1 TALIGN T0061 0 61 64 1PYS B 3 426 429 1 1 TALIGN T0061 0 73 79 1PYS B 3 458 464 1 1
t0075 1jvr???? VAST: 25 residues out of 110? T0075 74 89 1JVR 20 35 T0075 109 112 1JVR 68 71 T0075 124 128 1JVR 86 90
t0077 3.13 (1sra) ?
t0079 1.4 (1msec, 1blo? and others) EXCLUDED;
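The residue counts quoted for the VAST hits can be checked by adding up the aligned segments. The sketch below assumes that, in each VAST line, the two numbers following the target name (after the leading 0 in the TALIGN lines) are the first and last target residues of a segment; the function name is hypothetical.

```python
def aligned_residue_count(segments):
    """Count residues covered by inclusive (first, last) target segments."""
    return sum(last - first + 1 for first, last in segments)


# Target-side segments copied from the VAST lines for t0061 and t0075.
T0061_SEGMENTS = [(32, 40), (61, 64), (73, 79)]
T0075_SEGMENTS = [(74, 89), (109, 112), (124, 128)]

print(aligned_residue_count(T0061_SEGMENTS), aligned_residue_count(T0075_SEGMENTS))  # prints "20 25"
```

Both totals agree with the "20 residues" and "25 residues" quoted in the table, which is why these targets are considered out of reach for fold recognition.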

TARGETS THAT STILL ARE BLIND PREDICTIONS:

TARGET CORRECT SCOP ID. (a pdb example)
t0045
t0051
t0072
t0078

TARGETS WITH NEW FOLDS
T0052
T0056



Other assessment data:   Policemen should report, in addition to their validation, the following info:

 -  Average response time of the server.
 -  Availability of the server.
 -  Quality of the interface, ease of use, ease of understanding the results, etc.


Please send your comments to dfischer@cs.bgu.ac.il. Valuable suggestions will be incorporated until 12.31.98. Other suggestions will be compiled here for future use in CAFASP2.


