Artifact for the paper "Near-Optimal Sparse Allreduce for Distributed Deep Learning", published in PPoPP'22
The potential to solve complex problems along with the performance that deep learning offers has mad...
Decentralized learning algorithms empower interconnected devices to share data and computational res...
This artifact generates figures of the submitted draft of “Register Tiling for Unstructured Sparsity...
Communication overhead is one of the major obstacles to train large deep learning models at scale. G...
The artifact for the paper Sequential Reasoning for Optimizing Compilers Under Weak Memory Concurren...
Artifact for the paper titled "Multicore Parallelism in Permanence-based Community Detection Algorit...
In data-parallel optimization of machine learning models, workers collaborate to improve their estim...
This is the research artifact for the SC23 paper "Unified Communication Optimization Strategies for ...
This thesis presents a few methods to accelerate the inference of Deep Neural Networks that are lar...
This archive includes source code and benchmarks for the paper: "G-Sparse: Compiler-Driven Acceleration...
We propose FlexReduce, an efficient and flexible all-reduce algorithm for distributed deep learning ...
The success of deep learning may be attributed in large part to remarkable growth in the size and co...
This is the artifact that accompanies the paper "Visibility Algorithms for Dynamic Dependence Analys...
This thesis proposes parallel and distributed algorithms for solving very large-scale sparse optimiza...