This paper describes initial results for an architecture called the Shared-Thread Multiprocessor (STMP). The STMP combines features of a multithreaded processor and a chip multiprocessor; specifically, it enables distinct cores on a chip multiprocessor to share thread state. This shared thread state allows the system to schedule threads from a shared pool onto individual cores, allowing for rapid movement of threads between cores. This paper demonstrates and evaluates three benefits of this architecture: (1) By providing more thread state storage than available in the cores themselves, the architecture enjoys the ILP benefits of many threads, but carries the in-core complexity of supporting just a few. (2) Threads can move between cores fast e...
At the level of multi-core processors that share the same cache, data sharing among threads which be...
Simultaneous multithreading (SMT) allows multiple hardware threads to execute concurrently on a proc...
Moving threads is a theoretically interesting approach for mapping the computation of an application...
Moving threads is a new kind of approach for multicore processor architectures. Traditionally, each ...
We present a user-level thread scheduler for shared-memory multiprocessors, and we analyze its perfo...
Existing multiprocessor synchronization mechanisms are relatively heavyweight, due in part to the le...
We present a user-level thread scheduler for shared-memory multiprocessors, and we analyze its perfo...
Moving threads is a new kind of approach for mapping the computation of an application to multiproce...
Moving threads is a new kind of approach for mapping the computation of an application to multiproce...
This thesis implements a fast multi-threaded shared memory multiprocessor scheduler that runs on Lin...
To achieve high performance, contemporary computer systems rely on two forms of parallelism: instruc...
Multithreaded processors are an attractive alternative to superscalar processors. Their ability to h...
To achieve high performance, contemporary computer systems rely on two forms of parallelism: instruc...
This paper evaluates new techniques to improve performance and efficiency of Chip MultiProcessors (C...
Chip-level multiprocessors (CMP) have multiple processing cores (Cores) and generally have their cac...