In recent centralized nonconvex distributed learning and federated learning, local methods are one of the promising approaches to reducing communication time. However, existing work has mainly focused on first-order optimality guarantees. On the other hand, algorithms with second-order optimality guarantees, i.e., algorithms that escape saddle points, have been extensively studied in the non-distributed optimization literature. In this paper, we study a new local algorithm called Bias-Variance Reduced Local Perturbed SGD (BVR-L-PSGD), which combines an existing bias-variance reduced gradient estimator with parameter perturbation to find second-order optimal points in centralized nonconvex distributed optimization. BVR-L-PSGD enjoys second-order optimality with nearly the same communication complexity as the best known communication complexity of BVR-L-SGD for finding first-order optimality.
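To make the template concrete, the sketch below simulates the kind of scheme the abstract describes: several workers run local SGD steps with an SVRG-style bias-variance reduced gradient estimator anchored at the last communicated model, add a small isotropic perturbation when the averaged gradient is small (the saddle-escape step), and then average their models. This is a minimal illustrative sketch; the toy saddle objective, all hyperparameter values, and the helper names `bvr_gradient` and `local_round` are assumptions for the demo, not the paper's actual algorithm or implementation.

```python
# A minimal, runnable sketch (not the paper's implementation) of the recipe the
# abstract names: local SGD with an SVRG-style bias-variance reduced gradient
# estimator, plus a small random perturbation near suspected saddle points.

import numpy as np

rng = np.random.default_rng(0)

d, n_workers = 2, 4
# Toy heterogeneous quadratics f_i(w) = 0.5 * w^T A_i w. The average of the
# A_i has a negative eigenvalue, so w = 0 is a strict saddle of the mean loss.
base = np.diag([1.0, -0.5])
A = []
for _ in range(n_workers):
    M = base + 0.1 * rng.standard_normal((d, d))
    A.append(0.5 * (M + M.T))  # symmetrize so f_i is a genuine quadratic

def bvr_gradient(i, w, anchor, full_grad, noise=0.1):
    # SVRG-style estimator: the same "minibatch" (here, a noisy draw of A_i)
    # is evaluated at w and at the anchor, so the sampling noise cancels as w
    # approaches the anchor; full_grad is the exact averaged gradient there.
    A_batch = A[i] + noise * rng.standard_normal((d, d))
    return A_batch @ (w - anchor) + full_grad

def local_round(w0, lr=0.05, local_steps=20, perturb_radius=1e-2):
    # Communication step: workers share exact local gradients at the anchor w0.
    full_grad = np.mean([A[i] @ w0 for i in range(n_workers)], axis=0)
    iterates = []
    for i in range(n_workers):
        w = w0.copy()
        # Perturbation step: a small averaged gradient suggests a stationary
        # point that may be a saddle, so jitter the iterate before local steps.
        if np.linalg.norm(full_grad) < 1e-3:
            w = w + perturb_radius * rng.standard_normal(d)
        for _ in range(local_steps):
            w = w - lr * bvr_gradient(i, w, w0, full_grad)
        iterates.append(w)
    return np.mean(iterates, axis=0)  # communication: average the local models

w = np.zeros(d)  # start exactly at the saddle of the mean loss
for _ in range(20):
    w = local_round(w)
print("final iterate:", w)
```

On this toy problem the iterate starts exactly at a strict saddle, where plain gradient steps would stall; the perturbation pushes it off the stationary point, and the variance-reduced local steps then carry it away along the negative-curvature direction while only one gradient average per round is communicated.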