The Colony Project is developing operating system and runtime system technology to enable efficient general-purpose environments on tens of thousands of processors. To accomplish this, we are investigating memory management techniques, fault management strategies, and parallel resource management schemes. Recent results are promising for scalable strategies based on processor virtualization, in-memory checkpointing, and parallel-aware modifications to full-featured operating systems.
Clusters of industry-standard multiprocessors are emerging as a competitive alternative for large-sc...
In the recent era of computing, applications and operating systems cannot survive without efficient me...
High availability and performance are two desirable properties for the execution of long-running par...
Traditional full-featured operating systems are known to have properties that limit the scalability ...
Forthcoming massively parallel systems are distributed memory architectures. They consist of several...
With the number of cores on a chip continuing to increase, we are moving towards an era where many-c...
Despite the fact that large-scale shared-memory multiprocessors have been commercially available for...
The multicore era has initiated a move to ubiquitous parallelization of software. In the process, co...
Current multi-core design methodologies are facing increasing unpredictability in terms of quality d...
A fundamental problem of parallel computing is that applications often require large-size inst...
We have been pursuing a research program aimed at enhancing productivity and performance in parallel...
This dissertation examines scalability issues in the design of operating systems for large-scale, sha...
Petascale supercomputers will be available by 2008. The largest machine of these complex leadership-...
As the scale of parallel systems continues to grow, fault management of these systems is be...
Despite the fact that large scale shared-memory multiprocessors have been commercially available for...