AbstractKernel k-Means is a basis for many state of the art global clustering approaches. When the number of samples grows too big, however, it is extremely time-consuming to compute the entire kernel matrix and it is impossible to store it in the memory of a single computer. The algorithm of Approximate Kernel k-Means has been proposed, which works using only a small part of the kernel matrix. The computation of the kernel matrix, even a part of it, remains a significant bottleneck of the process. Some types of kernel, however, can be computed using matrix multiplication. Modern CPU architectures and computational optimization methods allow for very fast matrix multiplication, thus those types of kernel matrices can be computed much faster...