156 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2007. The self-diagnosing capability of our service comes from the scalable learning-based performance problem diagnosis techniques we propose. The increasing complexity of systems has motivated the design of machine learning approaches to automate some system management tasks. However, as systems grow in scale, current approaches suffer from serious scalability issues. We present two scalable learning-based techniques that automatically identify probable causes of performance problems in large server systems with multiple tiers and replicated sites. By incorporating a large number of diagnostic information sources using a temporal segmentation mechanism and applying transfer learni...
This thesis examines the design, implementation and performance of a scalable analysis platform for ...
More than ever, businesses heavily rely on IT service delivery to meet their current and frequently ...
Abstract: In this paper, we present an automated on-line service for troubleshooting performance pro...
Keywords: system performance diagnosis, machine learning, transfer learning, scalability. Distributed systems c...
Providing contractual performance assurances in distributed systems is an important and challenging ...
Large production systems are susceptible to chronic performance problems where the system still w...
It is important to keep an information system working properly with efficient performance i...
Diagnosing performance problems in modern datacenters and distributed systems is challenging, as the...
Distributed systems form an integral part of human life—from ATMs to the Domain Name Service. Typica...
Cloud datacenters comprise hundreds or thousands of disparate application services, each having stri...
For dependability outages in distributed internet infrastructures, it is often not enough to detect ...
In this paper, we address the problem of efficient diagnosis in real-time systems capable of on-line...
Understanding server capacity is crucial for system capacity planning, configuration, and QoS-aware...
Today's Internet datacenters run many complex and large-scale Web applications that are very difficu...
Abstract. For dependability outages in distributed internet infrastructures, it is often not enough ...