Constructing small-sized coresets for various clustering problems has attracted significant attention recently. We provide efficient coreset construction algorithms for $(k, z)$-Clustering with improved coreset sizes in several metric spaces. In particular, we provide an $\tilde{O}_z(k^{(2z+2)/(z+2)}\varepsilon^{-2})$-sized coreset for $(k, z)$-Clustering for all $z\geq 1$ in Euclidean space, improving upon the best known $\tilde{O}_z(k^2\varepsilon^{-2})$ size upper bound [Cohen-Addad, Larsen, Saulpic, Schwiegelshohn. STOC'22], breaking the quadratic dependency on $k$ for the first time (when $k\leq \varepsilon^{-1}$). For example, our coreset size for Euclidean $k$-Median is $\tilde{O}(k^{4/3} \varepsilon^{-2})$, improving the best known ...
We construct near-optimal coresets for kernel density estimate for points in R^d when the kernel is ...
Center-based clustering is a fundamental primitive for data analysis and is very challenging for lar...
We study the problem of finding an optimum clustering, a problem known to be NP-hard. Existing liter...
Motivated by practical generalizations of the classic $k$-median and $k$-means objectives, such as c...
We study the problem of constructing epsilon-coresets for the (k, z)-clustering problem in a doublin...
In this paper, we show that there exists a (k, ε)-coreset for k-median and k-means clustering of n p...
The k-means problem seeks a clustering that minimizes the sum of squared errors cost function: For i...
Clustering or cluster analysis \cite{hastieetal09} is a classical method in unsupervised learning an...
In this thesis we show that, for several clustering problems, we can extract a small set of points, ...
This thesis investigates the following general constrained clustering problem: given a dimension $d$...
In this paper, we show that for several clustering problems one can extract a small set of points, s...
Coresets are among the most popular paradigms for summarizing data. In particular, there exist many ...
Let $\epsilon>0$ be any constant and let $P$ be a set of $n$ points in $\mathbb{R}^d$. We design new...
Coresets are succinct summaries of large datasets such that, for a given problem, the solution obtai...
In k-Clustering we are given a multiset of n vectors X subset Z^d and a nonnegative number D, and we...
We construct near-optimal coresets for kernel density estimate for points in R^d when the kernel is ...
Center-based clustering is a fundamental primitive for data analysis and is very challenging for lar...
We study the problem of finding an optimum clustering, a problem known to be NP-hard. Existing liter...
Motivated by practical generalizations of the classic $k$-median and $k$-means objectives, such as c...
We study the problem of constructing epsilon-coresets for the (k, z)-clustering problem in a doublin...
In this paper, we show that there exists a (k, ε)-coreset for k-median and k-means clustering of n p...
The k-means problem seeks a clustering that minimizes the sum of squared errors cost function: For i...
Clustering or cluster analysis \cite{hastieetal09} is a classical method in unsupervised learning an...
In this thesis we show that, for several clustering problems, we can extract a small set of points, ...
This thesis investigates the following general constrained clustering problem: given a dimension $d$...
In this paper, we show that for several clustering problems one can extract a small set of points, s...
Coresets are among the most popular paradigms for summarizing data. In particular, there exist many ...
Let $\epsilon>0$ be any constant and let $P$ be a set of $n$ points in $\mathbb{R}^d$. We design new...
Coresets are succinct summaries of large datasets such that, for a given problem, the solution obtai...
In k-Clustering we are given a multiset of n vectors X subset Z^d and a nonnegative number D, and we...
We construct near-optimal coresets for kernel density estimate for points in R^d when the kernel is ...
Center-based clustering is a fundamental primitive for data analysis and is very challenging for lar...
We study the problem of finding an optimum clustering, a problem known to be NP-hard. Existing liter...