Modern multithreaded applications, such as application servers and database engines, can severely stress the performance of user-level memory allocators like the ubiquitous malloc subsystem. Such allocators can prove to be a major scalability impediment for the applications that use them, particularly for applications with large numbers of threads running on high-order multiprocessor systems. This paper introduces Multi-Processor Restartable Critical Sections, or MP-RCS. MP-RCS permits user-level threads to know precisely which processor they are executing on and then to safely manipulate CPU-specific data, such as malloc metadata, without locks or atomic instructions. MP-RCS avoids interference by using upcalls to notify user-level threads...
We present a completely new kind of approach for mapping the computation of an application to MP-SOC...
With speculative thread-level parallelization, codes that cannot be fully compiler-analyzed are aggr...
The FreeBSD project has been engaged in ongoing work to provide scalable support for multi-processor...
As the size of supercomputers increases, the probability of system failure grows substantially, posi...
The microprocessor industry is rapidly moving to the use of multicore chips as general-purpose proce...
Shared memory multiprocessors are considered among the easiest parallel computers to program. Howeve...
Dynamic memory management is one of the most expensive but ubiquitous operations in many C/C++ appli...
Clusters of industry-standard multiprocessors are emerging as a competitive alternative for large-sc...
Network servers make special demands that other types of applications may not make on memory allocat...
grantor: University of Toronto. Memory latency is becoming an increasingly important perform...
A thread executing on a simultaneous multithreading (SMT) processor that experiences a long-latency ...
A single parallel application running on a multi-core system shows sub-linear speedup becau...
The embedded computing revolution is pushing the transition from a single-core processor to a multic...
The next generation of capability-class, massively parallel processing (MPP) systems is expected to ...
Threads experiencing long-latency loads on a simultaneous multithreading (SMT) processor may clog sh...