In this thesis, we consider the following three problems: clustering in the Bipartite Stochastic Block Model, estimation of the topic-document matrix in topic models, and benign overfitting in nonparametric regression. First, we consider the graph clustering problem in the Bipartite Stochastic Block Model (BSBM). The BSBM is a non-symmetric generalization of the Stochastic Block Model, with two sets of vertices. We provide an algorithm called the Hollowed Lloyd's algorithm, which allows one to classify the vertices of the smaller set with high probability. We provide statistical guarantees for this algorithm, which is computationally fast and simple to implement. We establish a sufficient condition for clustering in the BSBM. Our results improve on previo...
There are statistical modeling situations for which the classical problem of classi...
Co-clustering designs in a same exercise a simultaneous clustering of the rows...
The multiplication of information sources and the development of new technologies generate complex ...
Topic models provide a useful tool to organize and understand the structure of large corpora of text...
Since the exponential growth of available data (big data), dimensionality reduction techniques have become e...
This habilitation thesis retraces works focusing mainly on model based clustering and the related is...
Co-clustering is more useful than one-sided clustering when dealing with high ...
This thesis proposes three original contributions for the clustering of particular types of data: mu...
Networks with community structure arise in many fields such as social science, biological science, a...
Due to the significant increase of communications between individuals via soci...
This thesis focuses mainly on three subjects. The first one is online clustering in which we introduce...
Our growing ability to collect and store data has made unsuperv...
The stochastic block model (SBM) is a mixture model used for the clustering of nodes in networks. It...