There are two avenues for many-core machines to gain higher performance: increasing the number of processors and increasing the number of vector units in each SIMD processor. A truly scalable algorithm should take advantage of both avenues. However, most past research on scalable memory allocators, such as lock-free algorithms based on atomic operations, scales with the number of processors but scales poorly with the number of vector units in each SIMD processor. As a result, these allocators are not truly scalable on many-core architectures. In this work, we introduce the solution used in the design of XMalloc, a truly scalable, efficient lock-free memory allocator. We will present (1) our solution for transforming traditional...
Object-oriented programming has long been regarded as too inefficient for SIMD high-performance comp...
As simulation and analytics enter the exascale era, numerical algorithms, particularly implicit solv...
Atomic lock-free multi-word compare-and-swap (MCAS) is a powerful tool for designing concurrent algo...
In multicores, performance-critical synchronization is increasingly performed in a lock-free manner ...
In this thesis, we describe two related memory allocators, each with novel properties. PALLOC1 cont...
Ensuring the continuous scaling of parallel applications is challenging on many-core processors, due...
Over the past decade, a pair of instructions called load-linked (LL) and store-conditional (SC) have...
Over the past decade, multicore machines have become the norm. A single machine is capable of having...
The potential of multiprocessor systems is frequently not fully realized by their system services. C...
Over the past decade, a pair of instructions called load-linked (LL) and store-conditional (SC) have...
New and emerging memory technologies combined with enormous growths in data collection and mining wi...
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Compute...
Over the past decade, a pair of instructions called load-linked (LL) and store-conditional (SC) have...
The increase in the number of cores in processors has been an important trend over the past decade. ...