The following projects are offered by Prof. Ehud Gudes
For: 1-2 students with a background in Databases and an interest in Algorithms
A graph mining benchmark was developed in a previous project; it implements several well-known algorithms. The goal of this project is to continue that development by adding more algorithms and databases and by running more extensive tests.
A report comparing the various mining methods will be required.
Co-advisor: Natalia Vanetik
For: 1-2 students with a background in Databases and an interest in Algorithms
This project is composed of two parts. The first part is reading papers and surveying methods for constructing various types of indexes for XML.
The second part is implementing several such indexes and comparing their performance on several benchmark files.
A report comparing the index methods will be required.
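As an illustration of what one simple kind of XML index can look like, the sketch below builds a label-path index: it maps every root-to-element label path to the elements reached by that path. This is only one possible index type chosen for illustration, not necessarily one the project will cover; the function name build_path_index and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(xml_text):
    """Map each root-to-element label path to the list of elements it reaches."""
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    # Iterative depth-first traversal carrying the label path of each node.
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        index[path].append(node)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return index


# Hypothetical sample document, used only to exercise the index.
doc = "<lib><book><title>A</title></book><book><title>B</title></book></lib>"
index = build_path_index(doc)
titles = sorted(e.text for e in index["/lib/book/title"])
book_count = len(index["/lib/book"])
```

With such an index, a simple path query like /lib/book/title becomes a dictionary lookup instead of a traversal of the whole document, which is the kind of trade-off (build time and memory versus query time) the comparison could measure.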
For: 1-2 students with a background in Databases and/or Data security
The goal of the project is to identify corrupted records (or fields) in a database after a successful intrusion.
The first task is to read and understand the learning techniques used in data mining. The second task is to train a learning model (for example an HMM; see Jajodia's paper) and then use it to identify corrupted records.
This technique can be tested and demonstrated on a student-grades database.
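To make the second task more concrete, here is a minimal sketch of how an already trained discrete HMM could score sequences of observed updates, flagging those the model finds unlikely. It is not the method of the cited paper: the two hidden states, the observation symbols (0 = small grade change, 1 = large grade change), all probabilities, and the threshold are made-up values for illustration.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    loglik = 0.0
    for step, symbol in enumerate(obs):
        if step > 0:
            # Propagate one step through the transitions, then weight by emission.
            alpha = [
                emit[j][symbol] * sum(alpha[i] * trans[i][j] for i in range(n))
                for j in range(n)
            ]
        total = sum(alpha)
        if total == 0.0:
            return float("-inf")
        loglik += math.log(total)
        alpha = [a / total for a in alpha]  # rescale to avoid underflow
    return loglik


# Hypothetical parameters; a real project would learn them from clean data.
start = [0.9, 0.1]
trans = [[0.9, 0.1], [0.5, 0.5]]
emit = [[0.95, 0.05], [0.3, 0.7]]

sequences = [[0, 0, 0, 0], [1, 1, 1, 1]]
threshold = -1.0  # assumed cut-off on the average log-likelihood per symbol
flags = [
    log_likelihood(seq, start, trans, emit) / len(seq) < threshold
    for seq in sequences
]
```

Here only the second sequence, which consists entirely of large grade changes, falls below the assumed threshold and would be reported as a candidate for corruption.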
For: 1-2 students with a background in Databases and/or Data security
The problem of identifying groups of trust (knots) in a trust network is modeled as a graph clustering problem, where vertices correspond to individual items and edges describe relationships. Under this interpretation, a community is represented by a directed graph in which vertices represent members and edges represent the trust relations between the members at their end-point vertices.
A path between two vertices that are not connected by an edge represents the transitive property of trust (e.g. Alice trusts Bob + Bob trusts Clair => Alice trusts Clair).
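The transitive reading of paths can be sketched as a breadth-first search over the directed trust graph, limited to a maximum number of hops. The function name trusted_within, the hop limit, and the extra member Dan are assumptions added for illustration.

```python
from collections import deque


def trusted_within(trust, source, max_hops):
    """Return {member: hops} for members reachable from source
    through at most max_hops directed trust edges."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        member = queue.popleft()
        if hops[member] == max_hops:
            continue  # do not extend trust beyond the hop limit
        for other in trust.get(member, ()):
            if other not in hops:
                hops[other] = hops[member] + 1
                queue.append(other)
    del hops[source]
    return hops


# Alice trusts Bob, Bob trusts Clair; Dan is a hypothetical third hop.
trust = {"Alice": ["Bob"], "Bob": ["Clair"], "Clair": ["Dan"]}
derived = trusted_within(trust, "Alice", 2)
```

With a limit of two hops Alice trusts Bob directly and Clair transitively, while Dan, three hops away, is excluded.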
Correlation clustering is a powerful technique for discovering groups of trust in graphs. It operates on the pair-wise relationships between vertices, partitioning the graph to maximize the number of related pairs that are clustered together plus the number of unrelated pairs that are separated. We investigate heuristic algorithms for correlation clustering with a restricted cluster diameter (to avoid trust based on long paths of transitive trust).
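The objective and the diameter restriction can be illustrated with a small sketch. It is not one of the heuristics under investigation: it only scores given partitions and checks their cluster diameters. For simplicity it treats relations as undirected pairs, which is an assumption, and the names agreements and diameter are hypothetical.

```python
from collections import deque
from itertools import combinations


def agreements(vertices, related, clusters):
    """Count related pairs clustered together plus unrelated pairs separated."""
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    score = 0
    for u, v in combinations(vertices, 2):
        together = label[u] == label[v]
        if together == (frozenset((u, v)) in related):
            score += 1
    return score


def diameter(cluster, related):
    """Longest shortest path inside the subgraph induced by cluster
    (infinity if the cluster is not connected)."""
    worst = 0
    for start in cluster:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in cluster:
                if v not in dist and frozenset((u, v)) in related:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(cluster):
            return float("inf")
        worst = max(worst, max(dist.values()))
    return worst


vertices = ["a", "b", "c", "d"]
related = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("c", "d")]}
candidates = [
    [{"a", "b", "c", "d"}],
    [{"a", "b", "c"}, {"d"}],
    [{"a", "b"}, {"c", "d"}],
]
max_diameter = 1  # assumed restriction on cluster diameter
feasible = [
    p for p in candidates
    if all(diameter(c, related) <= max_diameter for c in p)
]
best = max(feasible, key=lambda p: agreements(vertices, related, p))
```

On this path-shaped example only the partition into {a, b} and {c, d} respects the diameter limit of 1, and it also has the highest agreement count of the three candidates.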
The goal of the project is to implement the developed heuristics and evaluate the tradeoff between optimality and performance.
Co-advisor: Nurit Gal-Oz
For: 2-3 students with a background in Databases and/or Data security
Map/Reduce is a recent technique for implementing parallel, data-intensive algorithms; the Hadoop project is one example.
The goal of this project is to implement several data mining algorithms (e.g. FSG, GSPAN, SPADE, SUBSEA) on Map/Reduce and compare them to their sequential versions.
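The Map/Reduce style can be illustrated with a single-process simulation of its three stages (map, shuffle, reduce) applied to support counting, a basic step shared by frequent-pattern mining algorithms. This is plain Python, not the Hadoop API, and it does not implement any of the algorithms named above; the function names and the sample transactions are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(transaction):
    """Emit (item, 1) once for each distinct item in a transaction."""
    return [(item, 1) for item in set(transaction)]


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts for one item to obtain its support."""
    return key, sum(values)


def frequent_items(transactions, min_support):
    mapped = chain.from_iterable(map_phase(t) for t in transactions)
    counts = dict(reduce_phase(k, vs) for k, vs in shuffle(mapped).items())
    return {item: n for item, n in counts.items() if n >= min_support}


transactions = [["a", "b"], ["a", "c"], ["a", "b", "d"]]
result = frequent_items(transactions, min_support=2)
```

In a real deployment the map and reduce calls would run on different machines over partitions of the data; the sequential baseline for the comparison is simply a single loop that counts items directly.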
For: 1-2 students with a background in Databases or AI (CSP)
Complex web sites (e.g. government sites) present the problem of "finding the right path" within the site when searching for a certain piece of information or a form.
Danny Duetch from TAU has demonstrated such a navigation helper. The problem may be phrased as a combination of Query, Planning, and CSP.
The goal of the project is to implement such a help facility using CSP techniques.
Co-advisor: Prof. Amnon Meisels
For: 1-2 students with a background in Databases