The use of machine learning (ML) inference across applications is growing rapidly. ML inference services interact with users directly and must therefore respond quickly and accurately. Moreover, these services face dynamic request workloads, which force changes in their computing resources. Failing to right-size computing resources results in either violations of latency service level objectives (SLOs) or wasted computing resources. Adapting to dynamic workloads while balancing accuracy, latency, and resource cost is challenging. In response to these challenges, we propose InfAdapter, which proactively selects a set of ML model variants with their resource allocations to meet the latency SLO while maximizing an objective function compo...
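The selection step described above can be illustrated with a small sketch. It is not InfAdapter's actual algorithm: the abstract is cut off before the objective is specified, so the objective used here (load-weighted accuracy minus a weighted core count), the `Variant` fields, the `select_variants` function, and the brute-force search are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Variant:
    """Hypothetical profile of one model variant (illustrative fields)."""
    name: str
    accuracy: float      # offline accuracy in [0, 1]
    latency_ms: float    # per-request latency
    rps_per_core: float  # sustainable requests/second per CPU core


def select_variants(variants, predicted_rps, slo_ms, core_budget, cost_weight=0.005):
    """Brute-force search over per-variant core allocations.

    Assumed objective: load-weighted accuracy - cost_weight * total cores.
    Returns (plan, score), or (None, -inf) if no allocation can serve the load.
    """
    # Only variants that meet the latency SLO are candidates.
    eligible = sorted(
        (v for v in variants if v.latency_ms <= slo_ms),
        key=lambda v: v.accuracy,
        reverse=True,
    )
    best_score, best_plan = float("-inf"), None
    for cores in product(range(core_budget + 1), repeat=len(eligible)):
        total = sum(cores)
        if total == 0 or total > core_budget:
            continue
        # Route load to the most accurate variants first, up to their capacity.
        remaining, weighted_acc = predicted_rps, 0.0
        for v, c in zip(eligible, cores):
            served = min(remaining, c * v.rps_per_core)
            weighted_acc += served * v.accuracy
            remaining -= served
        if remaining > 1e-9:
            continue  # this allocation cannot serve the predicted load
        score = weighted_acc / predicted_rps - cost_weight * total
        if score > best_score:
            best_score = score
            best_plan = {v.name: c for v, c in zip(eligible, cores) if c > 0}
    return best_plan, best_score
```

Calling `select_variants` with a list of profiled variants, a predicted request rate, a latency SLO, and a core budget returns a mapping from variant name to allocated cores. Exhaustive search is only practical for a handful of variants and a small budget; a real system would more likely use an integer-programming solver or a heuristic.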
The emergence of Machine Learning (ML) has increased exponentially in numerous appl...
Machine learning (ML) and statistical techniques are key to transforming big data into actionable kn...
The proliferation of massive datasets combined with the development of sophisticated analytical tec...
To accelerate the inference of machine-learning (ML) model serving, clusters of machines require the...
Efficiently optimizing multi-model inference pipelines for fast, accurate, and cost-effective infere...
With the advent of ubiquitous deployment of smart devices and the Internet of Things, data sources f...
Machine learning is being deployed in a growing number of applications which demand real-time, accu...
ML systems contend with an ever-growing processing load of physical world data. These systems are ...
Large scale machine learning has many characteristics that can be exploited in the system designs to...
Deep learning (DL) inference has become an essential building block in modern intelligent applicatio...
In recent years, Web services have become more and more intelligent (e.g., in understanding user pr...
Serving machine learning requests with trained models plays an increasingly important role with the...
Over the last years, the ever-growing number of Machine Learning (ML) and Artificial Intelligence (AI)...
Developing machine learning (ML) models can be seen as a process similar to the one established for ...
The application of artificial intelligence enhances the ability of sensor and networking technologie...