PhD Thesis: Modern big data processing systems are becoming very complex in terms of large scale, high concurrency, and multi-tenancy. Thus, many failures and performance degradations only happen at run-time and are very difficult to capture. Moreover, some issues may only be triggered when certain components are executed. To analyze the root cause of these types of issues, we have to capture the dependencies of each component in real-time. Big data processing systems, such as Hadoop and Spark, usually work in large-scale, highly concurrent, and multi-tenant environments that can easily cause hardware and software malfunctions or failures, thereby leading to performance degradation. Several systems and methods exist to detect big data proc...
Recently emerging software applications are large, complex, distributed and data-intensive, i.e., bi...
The main goal of this thesis is to contribute to the research on automated performance anomaly detec...
Big data processing has recently gained a lot of attention both from academia and industry. The term...
The main goal of this research is to contribute to automated performance anomaly detection for large...
Big data analytics have become widespread as a means to extract knowledge from large datasets. Yet, ...
“Big Data” is best characterized by its three features, namely “Variety”, “Volume” and “Velocity” i...
This paper analyzes the properties and characteristics of unknown and unexpected faults introduced i...
Although much research has been done to improve the performance of big data systems, predicting the ...
This is a post-peer-review, pre-copyedit version of an article published in Future Generation Comput...
Next generation real-time applications demand big-data infrastructures to process huge and continuou...
A wide spectrum of big data applications in science, engineering, and industry generate large datase...
The essential target of ‘Big Data’ technology is to provide new techniques and...
Real-time monitoring of cloud resources is crucial for a variety of tasks such as performance analys...
Applications running in large-scale computing systems such as high performance computing (HPC) or cl...
Performance variability has been acknowledged as a problem for over a decade by cloud practitioners ...