Sign In
אוניברסיטת בן-גוריון בנגב
המחלקה למדעי המחשב

Minimax-optimal semi-supervised regression on unknown manifolds - Amit Moscovich Eiger

20/06/2017 12:00 - 14:00
Room 202 in building 37

In recent years, many semi-supervised regression and classification
methods have been proposed. These methods demonstrated empirical
success on some data sets, whereas on others the unlabeled data did
not appear to help.
To analyze semi-supervised learning theoretically, it is often assumed
that the data points lie on a low-dimensional manifold. Under this
assumption [1] and [2] have shown that classical nonparametric
regression methods, using only the labeled data, can achieve optimal
rates of convergence. This implies that asymptotically, as the number
of labeled points tends to infinity, unlabeled data does not help.
However, typical semi-supervised scenarios involve few labeled points,
and plenty of unlabeled ones.
In this work ([3]) we clarify the potential benefits of unlabeled data
under the manifold assumption, given a fixed amount of labeled points.
Specifically, we prove that for a Lipschitz function on a manifold, a
simple semi-supervised regression method based on geodesic
k-nearest-neighbors achieves the finite-sample minimax bound on the
mean squared error, provided that sufficiently many unlabeled points
are available. Furthermore, we show that this approach is
computationally efficient, requiring only O(k N log N) operations to
estimate the regression function for all N labeled and unlabeled
points. We illustrate this approach on two datasets with a manifold
structure: indoor localization using WiFi fingerprints and facial pose
estimation. In both cases, the proposed method is more accurate and
much faster than the popular Laplacian eigenvector regressor [4].
The talk should be accessible to anyone with a general background in
statistics and machine learning. Specifically, no knowledge of
manifold geometry or minimax theory is assumed.

To All Events