My Research
Publications:
- Ran Yahalom, Dan Reshef, Ayana Wiener, Sagiv Frankel, Nir Kalisman, Boaz Lerner, Chen Keasar: Structure-based identification of catalytic residues. PROTEINS 2011: accepted, to be published in next available issue.
- Ran Yahalom, Erez Shmueli, Tomer Zrihen: Constrained Anonymization of Production Data: A Constraint Satisfaction Problem Approach. Secure Data Management 2010: 41-53.
- Yaron Gonen, Nurit Gal-Oz, Ran Yahalom, Ehud Gudes: CAMLS: A Constraint-Based Apriori Algorithm for Mining Long Sequences. DASFAA (1) 2010: 63-77.
- AFP-Biosapiens 2007 conference: presented a poster on "Identification catalytic residues based on destabilizing properties and an SVM classifier" (19-20/7/07). See the conference program here.
- The 1st MIGAL labs conference in Kfar-Blum: presented my findings in a preliminary phase of an ongoing vaccine development study (23/2/2005).
Thesis:
"Structure Based Prediction of Catalytic Residues": I developed a structure-based approach to the identification of catalytic residues. The new method is motivated by the difficulty that evolution-based methods have in annotating "orphan" Structural Genomics targets, which have very few or no homologs in the databases. My approach is based on the tendency of catalytic residues to be spatially clustered and occluded, on their destabilizing role in the structures, and on the differences in catalytic propensity between residue types. These observations were quantified by energy terms, and their predictive power was augmented by spatial averaging and Z-score transformation. A subset of these energy terms was selected and used to train a support vector machine (SVM) that discriminates catalytic from non-catalytic residues. Special attention was given to the class imbalance problem, which is inherent to the catalytic residue prediction domain and can severely degrade the performance of the SVM. My method showed good performance on a dataset of 34 enzymes that was specifically designed to mimic the Structural Genomics scenario: each of the proteins in this dataset has a unique CATH fold topology, and none was solved with a ligand bound to the active site. I also tested my method on two datasets taken from previous studies. In both cases, my method outperformed the reported structure-based results.
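To make the spatial averaging and Z-score steps more concrete, here is a minimal sketch in Python. It is not the thesis code. The function names, the 10 Å neighborhood radius, the use of one representative coordinate per residue, and the choice to standardize within a single protein are all assumptions made for illustration.

```python
import numpy as np

def spatial_average(values, coords, radius=10.0):
    """Replace each residue's value with the mean over all residues
    within `radius` of it (the residue itself included).

    `radius` is an assumed cutoff, not a value taken from the thesis.
    """
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between residue positions, shape (n, n).
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    neighbors = dists <= radius  # boolean mask; the diagonal is always True
    return (neighbors * values[None, :]).sum(axis=1) / neighbors.sum(axis=1)

def z_score(values):
    """Standardize values to zero mean and unit variance within one protein
    (assumed scope). A constant input maps to all zeros."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        return np.zeros_like(values)
    return (values - values.mean()) / std

# Toy example: three residues, two close together and one far away.
coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [20.0, 0.0, 0.0]]
energy = [1.0, 3.0, 5.0]
smoothed = spatial_average(energy, coords, radius=2.0)
features = z_score(smoothed)
```

In this toy case the two neighboring residues share their averaged value while the isolated residue keeps its own, and the Z-score then expresses each residue relative to the rest of the protein. Features of this kind could then be passed to a classifier such as an SVM.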

