This paper introduces Pica, a fine-grain, message-passing architecture designed to efficiently support high-throughput parallel applications. The architecture minimizes overhead for basic parallel operations. An operand-addressed context cache and a round-robin task manager allow single-cycle task swaps. Fixed-size activation contexts simplify storage management. Word-tag synchronization bits provide low-cost synchronization. The focus on high-throughput applications allows a small local memory (1024 36-bit words). A complete node (including memory) can be implemented using a fraction of a chip. A multi-node chip prototype (four nodes/chip) is being designed. In order to meet chip I/O requirements, a high-bandwidth, three-dimensional optical n...
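As a rough software analogy for how word-tag synchronization and round-robin task swapping might interact, the sketch below models a 1024-word local memory in which each word carries a presence tag. The abstract does not define the tag semantics, so the full/empty interpretation is an assumption, and the names `TaggedMemory`, `consumer`, `producer`, and `run_round_robin` are hypothetical. It does not model the context cache, the 36-bit word layout, or single-cycle timing.

```python
from collections import deque

MEMORY_WORDS = 1024  # local memory size stated in the abstract


class TaggedMemory:
    """Local memory whose words each carry a presence tag.

    Assumption: the tag is a full/empty bit. The abstract does not
    specify what the word-tag synchronization bits actually encode.
    """

    def __init__(self, size=MEMORY_WORDS):
        self.data = [0] * size
        self.full = [False] * size  # hypothetical full/empty tag per word

    def write(self, addr, value):
        self.data[addr] = value
        self.full[addr] = True  # mark the word as available

    def try_read(self, addr):
        """Return (True, value) if the word is tagged full, else (False, None)."""
        if self.full[addr]:
            return True, self.data[addr]
        return False, None


def consumer(mem, addr, out):
    """Task that needs one word; it yields while the tag says 'empty'."""
    while True:
        ok, value = mem.try_read(addr)
        if ok:
            out.append(value)
            return
        yield  # give up the processor so another task can run


def producer(mem, addr, value):
    """Task that does one step of other work, then writes the word."""
    yield
    mem.write(addr, value)


def run_round_robin(tasks):
    """Run each task one step at a time in rotation until all finish.

    Returns the number of times an unfinished task was swapped out.
    """
    ready = deque(tasks)
    swaps = 0
    while ready:
        task = ready.popleft()
        try:
            next(task)
            ready.append(task)  # unfinished: goes to the back of the queue
            swaps += 1
        except StopIteration:
            pass  # task completed and leaves the rotation
    return swaps


mem = TaggedMemory()
out = []
swaps = run_round_robin([consumer(mem, 5, out), producer(mem, 5, 42)])
```

In this toy model the consumer never busy-waits on the processor: a read of an empty word simply ends its turn, and the rotation brings it back once the producer has had a chance to fill the word.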
New national security demands require enhanced computing systems for nearly ab initio simulations of...
Current copper backplane technology has reached the technical limits of clock speed and width for sy...
Processing in memory (PIM) moves computation into memories with the goal of improving throughput and...
Abstract—This paper describes Pica, a fine-grain, message-passing architecture designed to efficient...
The Problem: There is a need for computer systems which can provide large amounts of computing power...
© ASEE 2009. The Pico processor is a scaled-down RISC processor, hence the name “Pico”. Pico processors...
In this paper, we show how 3D stacking technology can be used to implement a simple, low-power, high...
Data movement has become a limiting factor in terms of performance, power consumption, and scalabili...
This paper describes the design goals, micro-architecture, and implementation of the microprogrammed...
Technological frontiers between semiconductor technology, packaging, and system design are disappear...
Abstract: UK-based picoChip Design's new PC101 is a huge parallel device integrating 430 16-bit proces...
This paper examines the computing power of optical parallel computer systems. We consider t proposed...
The paper presents PowerMANNA, a distributed-memory parallel computer system based on the 64-bit Pow...
This paper presents the design and implementation of an efficient communication system, Pupa, devel...