Abstract:
The classical Perceptron algorithm of Rosenblatt can be used to find a linear threshold function that correctly classifies $n$ linearly separable data points, assuming the classes are separated by some margin $\gamma > 0$. A foundational result is that the Perceptron converges after $\Omega(1/\gamma^{2})$ iterations. There have been several recent works that managed to improve this rate by a quadratic factor, to $\Omega(\sqrt{\log n}/\gamma)$, with more sophisticated algorithms. In this paper, we unify these existing results under one framework by showing that they can all be described through the lens of solving min-max problems using modern acceleration techniques, mainly through \emph{optimistic} online learning. We then show that the proposed framework also leads to improved results for a series of problems beyond the standard Perceptron setting. Specifically, a) for the margin maximization problem, we improve the state-of-the-art result from $O(\log t/t^{2})$ to $O(1/t^{2})$, where $t$ is the number of iterations; b) we provide the first result identifying the implicit bias property of the classical Nesterov's accelerated gradient descent (NAG) algorithm, and show that NAG can maximize the margin at an $O(1/t^{2})$ rate; c) for the classical $p$-norm Perceptron problem, we provide an algorithm with an $\Omega(\sqrt{(p-1)\log n}/\gamma)$ convergence rate, whereas existing algorithms suffer an $\Omega((p-1)/\gamma^{2})$ convergence rate.
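For concreteness, the classical Perceptron update referenced above can be sketched as follows. This is a minimal illustrative implementation, not the accelerated methods studied in the paper; the function name `perceptron` and the epoch cap `max_iters` are our own choices.

```python
import numpy as np

def perceptron(X, y, max_iters=10_000):
    """Classical Rosenblatt Perceptron (illustrative sketch).

    X: (n, d) array of data points; y: (n,) labels in {-1, +1}.
    If the data are linearly separable with margin gamma (and norms
    bounded by 1), the classical mistake bound guarantees convergence
    within on the order of 1/gamma^2 updates.
    """
    w = np.zeros(X.shape[1])
    for _ in range(max_iters):
        made_mistake = False
        for xi, yi in zip(X, y):
            if yi * np.dot(w, xi) <= 0:  # misclassified (or on the boundary)
                w += yi * xi             # Perceptron update: move w toward yi*xi
                made_mistake = True
        if not made_mistake:             # a full pass with no mistakes: converged
            return w
    return w
```

On separable data, the returned $w$ satisfies $y_i \langle w, x_i\rangle > 0$ for all points; the accelerated variants discussed in the paper reduce the number of such updates from $\Omega(1/\gamma^2)$ to $\Omega(\sqrt{\log n}/\gamma)$.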