Serverless computing (Function-as-a-Service, FaaS) has been widely adopted for deep learning (DL) inference because of its ease of deployment and pay-per-use pricing. However, existing FaaS platforms allocate GPUs to DL inference at a coarse granularity, without spatio-temporal resource multiplexing or isolation, which results in severe GPU under-utilization, high usage costs, and service level objective (SLO) violations. An efficient, SLO-aware GPU-sharing mechanism is therefore needed in serverless computing to make DL inference cost-effective. In this paper, we propose \textbf{FaST-GShare}, an efficient \textit{\textbf{Fa}aS-oriented \textbf{S}patio-\textbf{T}emporal \textbf{G}PU \textbf{Sharing}} architecture...
Serverless computing platforms represent the fastest-growing segment of cloud services and are predi...
Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud,...
Deep learning is an emerging workload in the field of HPC. This powerful method of resolution is abl...
To accelerate the training of Deep Learning (DL) models, clusters of machines equipped with hardware...
Our work seeks to improve and adapt computing systems and machine learning (ML) algorithms to match ...
The Deep Learning (DL) paradigm gained remarkable popularity in recent years. DL models are used to ...
DL has pervaded many areas of computing due to the confluence of the explosive growth of large-scale...
Recent decades have witnessed the breakthrough of deep learning algorithms, which have been widely u...
GPU technology has been improving at an expedited pace in terms of size and performance, empowering ...
Serverless computing is an integral part of the recent success of cloud computing, offering cost and...
The invention of deep belief network (DBN) provides a powerful tool for data modeling. The key advan...
Deep Learning (DL) models have achieved superior performance. Meanwhile, computing hardware like NVI...
To accelerate the inference of machine-learning (ML) model serving, clusters of machines require the...
Deep Learning (DL) methods currently address a variety of complex tasks. GPUs significantly accelera...
Deep learning (DL) training jobs bring some unique challenges to existing cluster managers, such as ...