link

February 27, Tuesday
12:00 – 14:00

k-Anonymization with Minimal Loss of Information
Computer Science seminar
Lecturer : Mr. Tamir Tassa
Affiliation : CS, The Open University
Location : 202/37
Host : Dr. Michael Elkin
The technique of k-anonymization allows the releasing of databases that contain personal information while ensuring some degree of individual privacy. Given a database D that needs to be released, one produces a so-called k-anonymized version of that database where each record is indistinguishable from at least k-1 additional records. The anonymization process is usually performed by suppressing or generalizing database entries. We formally study the concept of generalization, and propose two information-theoretic measures for capturing the amount of information that is lost during the anonymization process. We call these two measures, the entropy measure and the non-uniform entropy measure.

The proposed measures are more eneral and more accurate than the measures that were proposed by Meyerson and Williams ([MW]),and Aggarwal et al. ([AFK]). We then study the problem of achieving k-anonymity with minimal loss of information. We prove that this problem is NP-hard and then study polynomial approximations for the optimal solution. Our first algorithm relies on similar ideas as the approximation algorithm that was proposed in [MW]. It gives an approximation guarantee of $O(ln k)$, for the entropy measure as well as for the previously studied measures. This mproves the best-known $O(k)$-approximation of [AFK]. While the approximation algorithms of [AFK] and [MW] relied on the so-called graph representation framework, which was shown in [AFK] to be limited to Omega(k)-approximations, our algorithm relies on a novel hypergraph representation that enables the improvement in the approximation ratio from $O(k)$ to $O(ln k)$.

As the running time of the algorithm is $O(n^(2k))$,we also show how to adapt the algorithm of [AFK] inorder to obtain a strongly polynomial approximation algorithm for our entropy measure with approximation guarantee of $O(k)$. We leave as an open problem to design an approximation algorithm, strongly polynomial or not, for the non-uniform entropy measure.

Joint work with Aristides Gionis