Increasing computational demand of simulations motivates the use of parallel computing systems. At the same time, this parallelism poses challenges to application developers. The Message Passing Interface (MPI) is a de-facto standard for distributed memory programming in high performance computing. However, its use also enables complex parallel programming errors such as races, communication errors, and deadlocks. Automatic tools can assist application developers in the detection and removal of such errors. This thesis considers tools that detect such errors during an application run and advances them towards a combination of both precise checks (neither false positives nor false negatives) and scalability. This includes novel hierarchical c...
Faults have become the norm rather than the exception for high-end computing on clusters with 10s/10...
The trend towards many-core multi-processor systems and clusters will make systems with tens and hun...
The demand for ever-growing computing capabilities in scientific computing and simulation has led to...
Increasing computational demand of simulations motivates the use of parallel computing systems. At t...
The Message-Passing Interface (MPI) is large and complex. Therefore, programming MPI is error prone....
Deadlock detection is one of the main issues of software testing in High Performance Computing (HPC)...
Deadlock detection is one of the main issues of software testing in High Performance Computing (HPC)...
MPI is the de-facto standard message-passing based parallel programming model. However, the bug dete...
The Message Passing Interface (MPI) is the standard API for parallelization in high-performance and ...
The Message Passing Interface (MPI) is the de-facto standard for distributed memory computing in hig...
The Message Passing Interface (MPI) plays an important role in parallel computing. Many paral...
High-Performance Computing (HPC) is currently facing significant challenges. T...
Message Passing Interface is a widely used standard in the High Performance and Sci...
The Message Passing Interface (MPI) is widely used to write parallel programs using messag...
Fault tolerance in parallel systems has traditionally been achieved through a combination of redunda...