The Tree Snipping algorithms were designed to improve the biological significance of the clusters produced by hierarchical clustering.
These algorithms partition the hierarchical tree by snipping: cutting selected edges at variable levels, rather than cutting the whole tree at a single fixed height.
[Bioinformatics, 2009]
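
To make the idea of snipping concrete, here is a minimal sketch of partitioning a tree by cutting selected edges at different depths. It is an illustration only, not the published algorithms: the `Node` class, the `snip` function and the toy tree are hypothetical names introduced here, and the choice of which edges to cut (the job of an algorithm such as Minimum Discrepancy) is assumed to be given as input.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Node:
    """A node of a binary hierarchical-clustering tree (hypothetical helper)."""
    name: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def snip(root: Node, snipped: Set[str]) -> List[List[str]]:
    """Cut the edge above every node named in `snipped`.

    Returns the resulting clusters as lists of leaf names. Each cluster is
    one connected component of the tree after the cuts.
    """
    clusters: List[List[str]] = []

    def walk(node: Node) -> List[str]:
        # Collect the leaves that are still connected to this node.
        if node.left is None and node.right is None:
            attached = [node.name]
        else:
            attached = []
            for child in (node.left, node.right):
                if child is not None:
                    attached.extend(walk(child))
        # If the edge above this node is cut, its leaves form their own
        # cluster and nothing is passed up to the parent.
        if node.name in snipped:
            if attached:
                clusters.append(attached)
            return []
        return attached

    remainder = walk(root)
    if remainder:
        clusters.append(remainder)
    return clusters


# Toy tree with six genes; the internal node names are arbitrary.
tree = Node(
    "root",
    Node("A",
         Node("a1", Node("g1"), Node("g2")),
         Node("a2", Node("g3"), Node("g4"))),
    Node("B", Node("g5"), Node("g6")),
)

# One cut is made two levels below the root ("a1"), the other one level
# below it ("B"): the cuts sit at variable levels.
clusters = snip(tree, {"a1", "B"})
```

In this toy example the two cuts leave three clusters of two genes each. No single horizontal cut of the tree yields that partition: cutting only below the root gives two clusters, and cutting everything one level lower also splits g5 from g6.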

As a practical application, the Minimum Discrepancy algorithm was used to analyze the breast cancer expression data of van 't Veer et al. (2002). The genes were grouped into 100 clusters, using as labels all known GO-BP annotations that are not too specific, that is, processes in which at least 50 genes participate.
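
The label-selection rule described above (keep only processes with at least 50 participating genes) can be sketched as follows. This is a hedged illustration, not the code used for the analysis: the function name `select_labels`, the dictionary layout and the toy term and gene names are all assumptions made for the example.

```python
from typing import Dict, Set


def select_labels(term_to_genes: Dict[str, Set[str]],
                  min_genes: int = 50) -> Dict[str, Set[str]]:
    """Keep only annotation terms with at least `min_genes` annotated genes."""
    return {
        term: genes
        for term, genes in term_to_genes.items()
        if len(genes) >= min_genes
    }


# Toy annotation table with made-up term and gene names.
annotations = {
    "broad_process": {f"gene{i}" for i in range(120)},
    "borderline_process": {f"gene{i}" for i in range(50)},
    "narrow_process": {f"gene{i}" for i in range(7)},
}

# Terms with fewer than 50 genes are dropped as too specific.
labels = select_labels(annotations)
```

With the default threshold, `narrow_process` is discarded, while the term with exactly 50 genes is kept because the rule is "at least 50".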

The full list of clusters and the genes included in each cluster is available here (Excel file).


Snipper, the first web-server designed to generate clusterings that integrate expression data with GO annotations, will be available soon!