March 23, Tuesday
12:00 – 13:30
Reconstructing approximate phylogenetic trees from quartet samples
Computer Science seminar
Lecturer : Sagi Snir
Affiliation : University of Haifa
Location : 202/37
Host : Dr. Michal ZIiv-Ukelson
The reconstruction of phylogenetic trees (evolutionary trees) is central to many problems in Biology. Accurate reconstruction methods are currently limited to a maximum of few dozens of species.
Therefore, in order to construct a tree over larger sets of species, a method capable of inferring accurately trees over small, overlapping sets, and subsequently merging these sets into a tree over the complete set, is required.
A "quartet tree" (a non-rooted phylogenetic tree over four species) is the smallest informative piece of information, and the extensively-studied "quartet approach" is based on combining quartet trees into a big tree. However, even this case is NP-hard, and remains so even when the set of quartet trees is compatible (agree on a certain tree).
The general problem of approximating quartets is known as the "maximum quartet consistency" (MQC), even for compatible inputs, is open for nearly twenty years. Despite its importance, the only rigorous results for approximating quartets are the naive $1/3$ approximation that applies to the general case and a PTAS when the input is the complete set of all ${n choose 4}$ possible quartets.
Even when it is possible to determine the correct quartet induced by every four species, the time needed to generate the complete set of all quartets may be impractical. A faster approach is to sample at random just $m << {n choose 4}$ quartets, and provide this sample as an input.
We present the first approximation algorithm whose guaranteed approximation is strictly better than $1/3$ when the input is any random sample of $m$ compatible quartets. The approximation ratio we obtain is close to $1/2$, and experimental results suggest that the ratio is much larger.
An important ingredient in our algorithm involves solving a weighted MaxCut in a certain graph induced by the set of input quartets. We also show an extension of the PTAS algorithm to handle dense, rather than complete, inputs.
Joint work with Raphy Yuster.