June 24, Tuesday
12:00 – 14:00

A Universal Kernel for Learning Regular Languages
Computer Science seminar
Lecturer : Aryeh Kontorovich
Affiliation : Department of Mathematics, Weizmann Institute of Science
Location : 202/37
Host : Prof. Ronen Brafman
We develop a novel approach to the classical problem of learning regular languages from labeled samples. Rather than attempting to construct small consistent automata (long known to be a hard computational problem), we embed the strings in a Hilbert space and compute a maximum-margin hyperplane, which becomes our classifier for new strings. We accomplish this via a universal kernel, under which the image of every regular language is linearly separable from its complement in some finite-dimensional space, with a strictly positive margin. Thus we can efficiently (in the sample size) compute a maximum-margin separating hyperplane (via an SVM, for example) and use margin bounds to control the generalization error.
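One concrete realization consistent with this description (a sketch only; the precise construction is given in the talk) indexes the feature coordinates by deterministic finite automata of bounded size and records acceptance:

$$ \Phi(x) = \big( \mathbf{1}[A \text{ accepts } x] \big)_{A \in \mathcal{A}_n}, \qquad K(x,y) = \langle \Phi(x), \Phi(y) \rangle = \big| \{ A \in \mathcal{A}_n : A \text{ accepts both } x \text{ and } y \} \big|, $$

where $\mathcal{A}_n$ is the finite (but super-exponentially large) set of DFAs with at most $n$ states over the given alphabet. Under this embedding, any regular language recognized by some $A^* \in \mathcal{A}_n$ is separated from its complement by the coordinate hyperplane associated with $A^*$, which is what makes margin-based learning applicable.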

A brute-force computation of this universal kernel has super-exponential complexity. We conjecture that the exact computation is intractable (a likely candidate for $\#P$-completeness). However, we propose a simple randomized scheme for efficiently obtaining an $\varepsilon$-approximation to the universal kernel. We show that the approximate kernel preserves distances and margins with low distortion, and may therefore be used as a surrogate for the original one.
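A minimal Monte Carlo sketch of the approximation idea, under the assumption above that the kernel counts DFAs accepting both strings: sample a fixed pool of random DFAs once, and estimate the (normalized) kernel by the fraction of sampled DFAs that accept both inputs. All names here (random_dfa, sample_dfas, approx_kernel) are illustrative, not taken from the paper.

    import random

    ALPHABET = "ab"

    def random_dfa(n_states, rng):
        # A random transition table over ALPHABET plus a random
        # accepting set; the start state is fixed at 0.
        delta = {(q, a): rng.randrange(n_states)
                 for q in range(n_states) for a in ALPHABET}
        accepting = {q for q in range(n_states) if rng.random() < 0.5}
        return delta, accepting

    def accepts(dfa, s):
        # Run the DFA on s from state 0 and report acceptance.
        delta, accepting = dfa
        q = 0
        for ch in s:
            q = delta[(q, ch)]
        return q in accepting

    def sample_dfas(m, n_states, seed=0):
        # Draw the DFA pool once; reusing the same pool for every pair
        # keeps the estimated Gram matrix positive semidefinite
        # (it is a dot product of random feature vectors).
        rng = random.Random(seed)
        return [random_dfa(n_states, rng) for _ in range(m)]

    def approx_kernel(x, y, dfas):
        # Fraction of sampled DFAs accepting both x and y: an unbiased
        # estimate of the normalized kernel under our assumption.
        return sum(1 for d in dfas if accepts(d, x) and accepts(d, y)) / len(dfas)

    dfas = sample_dfas(m=5000, n_states=4)
    print(approx_kernel("abab", "abab", dfas))  # diagonal entry: acceptance rate of "abab"
    print(approx_kernel("abab", "bbba", dfas))

The resulting Gram matrix can then be handed to any kernel SVM (e.g. scikit-learn's SVC with kernel='precomputed') to compute the maximum-margin hyperplane, and one would expect standard concentration bounds to control the distortion as a function of the number of sampled DFAs.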

To our knowledge, the framework we propose is the only one capable of inducing unrestricted regular languages from labeled samples (modulo standard cryptographic limitations). Along the way, we touch upon several fundamental questions in complexity, automata, and machine learning.

Joint work with Boaz Nadler.