The BlueGene/L supercomputer will use system-on-a-chip integration and a highly scalable cellular architecture to deliver 360 Teraflops of peak computing power. With 65,536 compute nodes, BlueGene/L represents a new level of scalability for parallel systems. As such, it is natural for many scalability challenges to arise. In this paper, we discuss challenges in the area of system management and control, including machine booting, software installation, user account management, system monitoring, and job execution. We address the issue of scalability by organizing the system hierarchically. The 65,536 compute nodes are organized in 1,024 clusters of 64 compute nodes each, called processing sets. Each processing set is under control of a 65...
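The hierarchical grouping described in the abstract (1,024 processing sets of 64 compute nodes each) can be illustrated with a little arithmetic. The sketch below is only an illustration: it assumes a hypothetical contiguous, zero-based node numbering, which the abstract does not specify, and the function names `pset_of` and `nodes_in_pset` are invented for this example.

```python
# Illustrative sketch only: the abstract gives the counts (1,024 processing
# sets of 64 compute nodes), but the contiguous zero-based numbering used
# here is an assumption, not the machine's actual addressing scheme.
NODES_PER_PSET = 64
NUM_PSETS = 1024
TOTAL_NODES = NODES_PER_PSET * NUM_PSETS  # 65,536 compute nodes


def pset_of(node_id: int) -> int:
    """Return the (hypothetical) processing-set index that owns a node."""
    if not 0 <= node_id < TOTAL_NODES:
        raise ValueError(f"node_id {node_id} out of range")
    return node_id // NODES_PER_PSET


def nodes_in_pset(pset_id: int) -> range:
    """Return the (hypothetical) node ids belonging to one processing set."""
    if not 0 <= pset_id < NUM_PSETS:
        raise ValueError(f"pset_id {pset_id} out of range")
    start = pset_id * NODES_PER_PSET
    return range(start, start + NODES_PER_PSET)
```

Under this numbering, a management operation that must reach every compute node can be issued once per processing set (1,024 targets) rather than once per node (65,536 targets), which is the scaling benefit of the hierarchy.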
Abstract. The Resource and Job Management System (RJMS) is the middleware in charge of delivering c...
Systems administrators of large clusters often need to perform the same administrative activity hund...
In the past decade there has been a dramatic shift from mainframe or ‘host-centric’ computing to a ...
Abstract. The Blue Gene/L supercomputer will use system-on-a-chip integration and a highly scalable ...
As 1999 ended, IBM announced its intention to construct a one-petaflop supercomputer. T...
ABSTRACT. BlueGene/P (BG/P) is the second generation BlueGene architecture from IBM, succeeding Blue...
In the last decades, high-performance large-scale systems have been a fundamental tool for scientifi...
Operating Systems have been considered as a cornerstone of the modern computer system, and the con-...
Scalable management of distributed resources is one of the major challenges in deployment of large-s...
Abstract. The BlueGene/L supercomputer, with 65,536 dual-processor compute nodes, was designed from t...
Today’s top high performance computing systems run applications with hundreds of thousands of proce...
Traditional full-featured operating systems are known to have properties that limit the scalability ...
Large-scale systems like BlueGene/L are susceptible to a number of software and hardware failures th...
This dissertation examines scalability issues in the design of operating systems for large-scale, sha...
Approaches to supercomputer architecture have taken dramatic turns since the earliest systems were i...