Process/thread migration and checkpointing are indispensable for resource sharing, cycle stealing, and other modes of interaction. To provide a flexible, transparent, and portable solution in heterogeneous environments, we have developed a multi-grained migration/checkpointing package, MigThread, which can migrate/checkpoint multiple threads to different machines or file systems simultaneously, and also perform single coarse-grained process migration/checkpointing. For scalability and portability, computation states are extracted out of their original places and abstracted to the language level. With user-level stack/heap management, MigThread does not rely on any thread libraries or operating systems. For heterogeneity, a novel d...
Thread migration is established as a mechanism for achieving dynamic load sharing and data locality....
Heterogeneity in general-purpose workloads often ends up in non-optimal per-thread hardware resource ...
This dissertation describes the design, implementation, and performance of two mechanisms that addre...
Process/thread migration and checkpointing schemes support load balancing, load sharing and fault to...
Thread migration moves a single call-stack to another machine to improve either load balancing or lo...
This paper describes a generic mechanism to migrate threads in heterogeneous distributed environment...
This paper describes an alternative technique to provide multithreading in an enhanced C language. I...
Distributed Shared Memory (DSM) systems provide a logically shared memory over physically distribute...
Checkpointing can be used to adapt resource utilization in heterogeneous distributed environments. I...
Checkpointing of parallel applications can be used as the core technology to provide process migrati...
Checkpointing in a homogeneous environment, where both checkpointing and recovery are performed on t...
A lot of research has been done on fault-tolerance for MPI applications, some on checkpoint/restart,...
Multiple threads running in a single, shared address space is a simple model for writing parallel pr...
Process checkpointing is a basic mechanism required for providing High Throughput Computing service ...