Users' quality of experience on web systems is largely determined by tail latency, e.g., the 95th percentile. Scaling resources alone, e.g., the number of virtual cores per VM, has been shown to be effective at meeting average latency targets but falls short in taming the latency tail in the cloud, where performance variability is higher. Prior work shows the promise of increasing request redundancy to curtail latency, but either in the offline setting or without scaling in the cores of virtual machines. In this paper, we propose an opportunistic scaler, termed SmallTail, which aims to achieve stringent tail latency targets while provisioning a minimum amount of resources and keeping them well utilized. Against dynamic workloads, SmallTail s...
Many Web applications are now hosted in elastic cloud environments where the unit of resource alloc...
An essential requirement of cloud computing or data centers is to simultaneously achieve good perfor...
Wimpy virtual instances equipped with small numbers of cores and RAM are popular public and private ...
Response time variability in software applications can severely degrade the quality of the user expe...
Offering consistent low latency remains a key challenge for distributed applications, especially whe...
Replicating redundant requests has been shown to be an effective mechanism to defend application per...
Interactive services such as Web search, recommendations, games, and finance must respond quickly ...
Prefetching and caching are techniques commonly used in I/O systems to reduce latency. Many research...
Interactive services, such as Web search, recommendations, games, and finance, must respond quickly ...
We investigate the techniques necessary for building highly-available, low-cost, scalable servers, s...
Workload scaling is an approach to accelerating computation and thus improving response times by rep...
A major theme of IT in the past decade has been the shift from on-premise hardware to cloud computin...
Processing time variability is commonplace in distributed systems, where resources display disparate...