Private Information Retrieval

Private Information Retrieval (PIR) protocols allow users to retrieve information from a database while keeping their query private. Motivating examples for this problem include databases with sensitive information, such as stocks, patents or medical databases, in which users are likely to be highly motivated to hide which record they are trying to retrieve. PIR protocols aim at achieving this goal efficiently, where the main cost measure is communication complexity. PIR is a strong primitive, which may also be useful as a building block within other protocols.

Survey Talk

Publications

  • A. Beimel, Y. Ishai, T. Malkin, and E. Kushilevitz. One-way functions are essential for single-server private information retrieval. In Proc. of the 31st Annu. ACM Symp. on the Theory of Computing (STOC), pages 89-98, 1999.
    Single database PIR with sublinear communication complexity cannot be achieved in the information theoretic model, so some computational assumptions must be made. Previous work consists of presenting specific computational assumptions under which such schemes can be constructed, such as quadratic residuosity (Kushilevitz-Ostrovsky), or the Phi-hiding assumption (Cachin-Micali-Stadler). A major question which we address in this work is what assumption is the minimal assumption necessary for single database PIR with sublinear communication complexity. We prove that if there is a PIR protocol in which the server sends less than n bits then one-way functions exist (where n is the number of bits in the database). That is, even saving one bit compared to the naive protocol, in which the entire database is sent, already requires one-way functions. Similar results hold (but requires more work) even if we allow the retrieval to fail with some probability.

  • A. Beimel, Y. Ishai, and T. Malkin. Reducing the Servers Computation in Private Information Retrieval: PIR with Preprocessing (ps), (pdf). CRYPTO 2000, vol. 1880 of LNCS, pages 56-74. Springer, 2000.
    In all previous private information retrieval protocols the servers' computation for each retrieval is at least linear in the size of entire database, even if the user requires just one bit. In this paper, we study the computational complexity of PIR. We show that in the standard PIR model, where the servers hold only the database, linear computation cannot be avoided. To overcome this problem we propose the model of PIR with preprocessing: Before the execution of the protocol each server may compute and store polynomially-many information bits regarding the database; later on, this information should enable the servers to answer each query of the user with more efficient computation. We demonstrate that preprocessing can significantly save work by presenting protocols with various tradeoff between communication, computation, and space. We also suggest two alternative models to saving computation, by batching queries and by allowing a separate off-line interaction per future query.

  • A. Beimel and Y. Ishai. Information-Theoretic Private Information Retrieval: A Unified Construction (ps), (pdf). Electronic Colloquium on Computational Complexity, TR01-15, 2001. Extended abstract in: ICALP 2001, volume 2076 of LNCS, pages 89-98, 2001. (ps), (pdf).
    This work addresses the information-theoretic setting for PIR, in which the user's privacy should be unconditionally protected from collusions of servers. We present a unified general construction, whose abstract components can be instantiated to yield both old and new families of PIR protocols. A main ingredient in the new protocols is a generalization of a solution by Babai, Kimmel, and Lokam to a communication complexity problem in the so-called simultaneous messages model.

    Our construction strictly improves upon previous constructions and resolves some previous anomalies. In particular, we obtain: (1) Efficient t-private k-server PIR protocols; (2) a constant-factor improvement in the communication complexity of 1-private 2-server PIR, providing the first improvement to the 2-server case since PIR protocols were introduced; and (3) efficient PIR protocols with logarithmic query length. The latter protocols have applications to the construction of efficient families of locally decodable codes over large alphabets and to PIR protocols with reduced work by the servers.

  • A. Beimel, Y. Ishai, E. Kushilevitz, and J. F. Raymond. Breaking the O(n^{1/(2k-1)}) Barrier for Information-Theoretic Private Information Retrieval. (ps), (pdf). In Proc. of the 43rd IEEE Symposium on Foundations of Computer Science (FOCS), 2002. To appear.
    Prior to this work, the best k-server private information retrieval protocols had complexity O(n^{1/(2k-1)}) for any fixed k (Ambainis, 1997). Since then several methods were developed in an attempt to beat this bound, but magically all these methods obtained the same bound. In this work, this barrier is finally broken and the complexity of information-theoretic k-server PIR is significantly improved to $n^{O(\frac{\log \log k}{k \log k})}$. The new PIR protocols can also be used to construct k-query locally decodable codes with length $\exp(n^{O(\frac{\log \log
k}{k \log k})})$ over a binary alphabet, compared to $\exp(n^{\frac{1}{k -1}})$ in previously known constructions. The improvements presented in this paper apply even for small values of k: the PIR protocols are more efficient than previous ones for every k > 2, and the locally decodable codes are shorter for every k >3.

  • A. Beimel and Y. Stahl. Robust Information-Theoretic Private Information Retrieval. (ps), (pdf). In Proc. of the 3rd Conference on Security in Communication Networks, 2002.
    The definition of PIR protocols raises a simple question - what happens if one of the servers crashes during the operation? How can we devise a protocol which still works in the presence of crashing servers? Current systems do not guarantee availability of servers at all times for many reasons - crash of server, communication problems. Our purpose is to design robust PIR protocols, i.e., protocols which still work correctly even if only some of the servers are available during the protocol's operation (the user does not know in advance which servers are available). We present various robust PIR protocols giving different tradeoffs between the different parameters.

    Links: