Statistical analysis of large and sparse graphs is a challenging problem in data science due to the high dimensionality and nonlinearity of the problem. This paper presents a fast and scalable algorithm for partitioning such graphs into disjoint groups based on observed graph distances from a set of reference nodes. The resulting partition provides a low-dimensional approximation of the full distance matrix which helps to reveal global structural properties of the graph using only small samples of the distance matrix. The presented algorithm is inspired by the information-theoretic minimum description length principle. We investigate the performance of this algorithm for selected real data sets and for synthetic graph data sets generated using sto...
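The idea of grouping nodes by their distances to a few reference nodes can be illustrated with a minimal sketch. This is not the algorithm of the paper: the grouping step below is plain k-means (Lloyd iterations) on distance vectors, used here only as a simple stand-in for the description-length-based criterion, and all function names (`bfs_distances`, `distance_profiles`, `partition_by_profiles`) as well as the toy graph are hypothetical choices made for illustration.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_profiles(adj, reference_nodes):
    """Map each node to its vector of distances to the reference nodes."""
    unreachable = len(adj)  # assumption: a value larger than any finite hop distance
    per_ref = [bfs_distances(adj, r) for r in reference_nodes]
    return {v: tuple(d.get(v, unreachable) for d in per_ref) for v in adj}


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def partition_by_profiles(profiles, k, iterations=20):
    """Group nodes with similar profiles using k-means (a stand-in, not MDL)."""
    nodes = sorted(profiles)
    # Deterministic start: the first k distinct profiles act as initial centers.
    centers = []
    for v in nodes:
        if profiles[v] not in centers:
            centers.append(profiles[v])
        if len(centers) == k:
            break
    labels = {}
    for _ in range(iterations):
        # Assignment step: each node joins the nearest center.
        for v in nodes:
            labels[v] = min(
                range(len(centers)),
                key=lambda c: squared_distance(profiles[v], centers[c]),
            )
        # Update step: each center becomes the mean profile of its group.
        for c in range(len(centers)):
            members = [profiles[v] for v in nodes if labels[v] == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Toy graph (hypothetical): two triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
profiles = distance_profiles(adj, reference_nodes=[0, 5])
labels = partition_by_profiles(profiles, k=2)
```

Only the two columns of the distance matrix belonging to the reference nodes are computed here, which mirrors the point that a small sample of the distance matrix can already separate the groups. On this toy graph the two triangles end up in different groups.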
New classes of random graphs have recently been shown to exhibit the small world phenomenon—...
Statistical network modelling has focused on representing the graph as a discrete structure, namely ...
We describe and illustrate a novel algorithm for clustering a large number of time series into few '...
We analyze the performance of regular decomposition, a method for compression of large and dense gra...
A method for compression of large graphs and matrices to a block structure is further developed. Sze...
Network data arises naturally in many domains - from protein-protein interaction networks in biology...
Given a large sparse graph, how can we find patterns and anomalies? Several important applications c...
The problem of detecting dense subgraphs (communities) in large sparse graphs is inherent to many re...
Graph clustering is a fundamental computational problem with a number of applications in algorithm d...
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivia...
Graph clustering involves the task of partitioning nodes, so that the edge density is higher within ...
Gaussian graphical models are useful to analyze and visualize conditional dependence relationships b...
We study the design of local algorithms for massive graphs. A local graph algorithm is one...