Data summarization, a central challenge in machine learning, is the task of finding a representative subset of manageable size out of a large dataset. It has found numerous applications, including image summarization, document and corpus summarization, recommender systems, and non-parametric learning, to name a few. A general recipe to obtain a faithful summary is to turn the problem into selecting a subset of data elements optimizing a utility function that quantifies “representativeness” of the selected set. Often, the utility functions used for summarization exhibit submodularity, a natural diminishing-returns property. In words, submodularity implies that the added value of any element from the dataset decreases as we ...
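The diminishing-returns property described above can be illustrated with a small sketch. It assumes a toy coverage utility (the number of distinct topics covered by the selected items) and uses the standard greedy heuristic under a cardinality constraint; the item names, the topic sets, and the helper functions coverage, marginal_gain and greedy_summary are all hypothetical and chosen only for illustration, not taken from the work summarized here.

```python
from typing import Dict, List, Set

# Hypothetical toy data: each item "covers" a set of topic ids.
COVERS: Dict[str, Set[int]] = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}


def coverage(selected: Set[str], covers: Dict[str, Set[int]]) -> int:
    """Utility of a summary: number of distinct topics its items cover."""
    covered: Set[int] = set()
    for item in selected:
        covered |= covers[item]
    return len(covered)


def marginal_gain(item: str, selected: Set[str], covers: Dict[str, Set[int]]) -> int:
    """Added value of `item` given the already selected set."""
    return coverage(selected | {item}, covers) - coverage(selected, covers)


def greedy_summary(covers: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k items, each time taking the largest marginal gain.

    Ties are broken by item name so the result is deterministic.
    """
    selected: Set[str] = set()
    order: List[str] = []
    for _ in range(k):
        candidates = sorted(i for i in covers if i not in selected)
        if not candidates:
            break
        best = max(candidates, key=lambda i: marginal_gain(i, selected, covers))
        selected.add(best)
        order.append(best)
    return order


# The gain of adding "d" shrinks as the selected set grows.
gains_of_d = [
    marginal_gain("d", set(), COVERS),
    marginal_gain("d", {"a"}, COVERS),
    marginal_gain("d", {"a", "c"}, COVERS),
]
summary = greedy_summary(COVERS, k=2)
```

Here gains_of_d records how much "d" adds to progressively larger selections, and summary holds the two items the greedy rule picks. The coverage function is only one example of a submodular utility; other choices would change the selected items but not the shape of the procedure.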
We study the problem of extracting a small subset of representative items from a large data stream. ...
The growing need to deal with massive instances motivates the design of algori...
In this work, we give a new parallel algorithm for the problem of maximizing a non-monotone dimini...
We study the classical problem of maximizing a monotone submodular function subject to a cardinality...
The problem of selecting a small-size representative summary of a large dataset is a cornerstone of ...
Thesis (Ph.D.), University of Washington, 2020. In the information age, vast volumes of data are gener...
The need for real-time analysis of rapidly producing data streams (e.g., video and image streams) mo...
A wide variety of problems in machine learning, including exemplar clustering, document summarizatio...
We study the problem of maximizing a non-monotone submodular function subject to a cardinality const...
We address the problem of maximizing an unknown submodular function that can only be accessed via no...
In this paper, we present a supervised learning approach to training submodular scoring functions f...
Constrained submodular maximization problems encompass a wide variety of applications, including per...
We address the problem of image collection summarization by learning mixtures of submodular function...
In this manuscript, we offer a gentle review of submodularity and supermodularity and their properti...