This paper introduces two randomized preconditioning techniques for robustly solving kernel ridge regression (KRR) problems with a medium to large number of data points ($10^4 \leq N \leq 10^7$). The first method, RPCholesky preconditioning, is capable of accurately solving the full-data KRR problem in $O(N^2)$ arithmetic operations, assuming sufficiently rapid polynomial decay of the kernel matrix eigenvalues. The second method, KRILL preconditioning, offers an accurate solution to a restricted version of the KRR problem involving $k \ll N$ selected data centers at a cost of $O((N + k^2) k \log k)$ operations. The proposed methods solve a broad range of KRR problems and overcome the failure modes of previous KRR preconditioners, making the...
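To make the first method concrete, here is a minimal sketch of an RPCholesky-preconditioned conjugate gradient solve for the KRR system (K + lam*I) alpha = y. The kernel matrix K, the rank k, the regularizer lam, and all helper names are illustrative assumptions, not the paper's reference implementation; the preconditioner F F^T + lam*I is applied through the Woodbury identity.

import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.sparse.linalg import LinearOperator, cg

def rpcholesky(K, k, rng):
    """Rank-k randomly pivoted Cholesky factor F with K ~= F @ F.T (sketch)."""
    N = K.shape[0]
    F = np.zeros((N, k))
    d = np.diag(K).astype(float).copy()
    for i in range(k):
        p = rng.choice(N, p=d / d.sum())        # pivot ~ residual diagonal
        g = K[:, p] - F[:, :i] @ F[p, :i]       # residual of pivot column
        F[:, i] = g / np.sqrt(g[p])
        d = np.maximum(d - F[:, i] ** 2, 0.0)   # update residual diagonal
    return F

def krr_pcg(K, y, lam, k, rng=None):
    """Solve (K + lam*I) alpha = y with CG, preconditioned by F F^T + lam*I."""
    rng = rng or np.random.default_rng(0)
    N = K.shape[0]
    F = rpcholesky(K, k, rng)
    C = cho_factor(lam * np.eye(k) + F.T @ F)   # small k x k factorization
    def apply_Minv(v):                          # Woodbury: (F F^T + lam I)^{-1} v
        return (v - F @ cho_solve(C, F.T @ v)) / lam
    A = LinearOperator((N, N), matvec=lambda v: K @ v + lam * v, dtype=float)
    M = LinearOperator((N, N), matvec=apply_Minv, dtype=float)
    alpha, info = cg(A, y, M=M)
    return alpha

Under the abstract's assumption of fast polynomial eigenvalue decay, a modest rank k keeps the CG iteration count small, so the total cost is dominated by the O(N^2) kernel matvec per iteration.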
Kernel-based models such as kernel ridge regression and Gaussian processes are ubiquitous in machine...
Nonparametric inference techniques provide promising tools for probabilistic reasoning in high-dime...
Gaussian processes (GPs) produce good probabilistic models of functions, but most GP kernels require...
The task of choosing a preconditioner M to use when solving a linear system Ax=b with iterative meth...
Kernel ridge regression (KRR) is a generalization of linear ridge regression that is non-linear in t...
A primary computational problem in kernel regression is solution of a dense linear system with the N...
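For reference, the baseline computation this snippet refers to is forming the dense N x N kernel matrix and solving the regularized system directly. The Gaussian kernel, the bandwidth, and the variable names below are illustrative assumptions; the point is that the direct solve costs O(N^3) time and O(N^2) memory, which is what motivates iterative, preconditioned solvers.

import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    # k(x, z) = exp(-||x - z||^2 / (2 sigma^2))
    sq = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                       # N = 500 training points
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

lam = 1e-2
K = gaussian_kernel(X, X)                           # dense N x N kernel matrix
alpha = np.linalg.solve(K + lam * np.eye(500), y)   # O(N^3) direct solve

X_test = rng.normal(size=(10, 3))
y_pred = gaussian_kernel(X_test, X) @ alpha         # f(x) = sum_i alpha_i k(x, x_i)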
Large-scale kernel ridge regression (KRR) is limited by the need to store a la...
Gaussian process hyperparameter optimization requires linear solves with, and log-determinants of, l...
Kernel machines have sustained continuous progress in the field of quantum chemistry. In particular,...
One approach to improving the running time of kernel-based machine learning methods is to build a sm...
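A common instance of such a sketch is the Nystrom approximation built from a subset of landmark columns. The following toy version, with uniformly sampled landmarks and a small jitter for numerical stability, is an assumption-laden illustration rather than the specific construction of the cited work.

import numpy as np

def nystrom(K, m, rng, jitter=1e-10):
    """Rank-m Nystrom approximation K ~= C W^{-1} C^T from m landmark columns."""
    N = K.shape[0]
    S = rng.choice(N, size=m, replace=False)    # landmark indices (uniform here)
    C = K[:, S]                                 # N x m column sketch
    W = K[np.ix_(S, S)] + jitter * np.eye(m)    # m x m core block
    return C @ np.linalg.solve(W, C.T)          # formed explicitly only for illustration

In practice one keeps the factors C and W rather than forming the N x N product, so downstream matrix-vector products cost O(Nm) instead of O(N^2).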
The computational and storage complexity of kernel machines presents the primary barrier to their sc...
Recent theoretical studies illustrated that kernel ridgeless regression can guarantee good generaliz...
We propose and study kernel conjugate gradient methods (KCGM) with random projections for least-squa...
Most kernel-based methods, such as kernel regression, kernel PCA, ICA, or k-me...
In this paper, we propose a fast surrogate leverage weighted sampling strategy to generate refined r...
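As background for the feature-sampling idea, here is a plain random Fourier features sketch for the Gaussian kernel. Leverage-weighted schemes such as the one described above change how the frequencies are sampled or reweighted; this toy version, with uniform Monte Carlo sampling and assumed names, does not attempt that refinement.

import numpy as np

def rff(X, D, sigma=1.0, rng=None):
    """Map X to D features so that z(x) @ z(y) ~= exp(-||x - y||^2 / (2 sigma^2))."""
    rng = rng or np.random.default_rng(0)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / sigma, size=(d, D))   # frequencies ~ N(0, sigma^{-2} I)
    b = rng.uniform(0, 2 * np.pi, size=D)            # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

Regressing on these D-dimensional features reduces KRR to a D x D ridge problem, trading exactness for an O(N D^2) cost.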