This paper studies the data locality of the work-stealing scheduling algorithm on hardware-controlled shared-memory machines, where movement of data to and from the cache is solely controlled by the hardware. We present lower and upper bounds on the number of cache misses when using work stealing, and introduce a locality-guided work-stealing algorithm and its experimental validation. As a lower bound, we show that a work-stealing application that exhibits good data locality on a uniprocessor may exhibit poor data locality on a multiprocessor. In particular, we show a family of multithreaded computations $G_n$ whose members perform $\Theta(n)$ operations (work) and incur a constant number of cache misses on a uniprocessor, while even on two proc...
We present an adaptive work-stealing thread scheduler, A-STEAL, for fork-join multithreaded jobs, li...
Lightweight threads have become a common abstraction in the field of programming languages and opera...
Computational task DAGs are executed on parallel computers by a task scheduling algorithm. Intellige...
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Com...
This paper investigates a variant of the work-stealing algorithm that we call the localized work-ste...
Blumofe and Leiserson [6] gave the first provably good work-stealing work scheduler for multithread...
Load balancing is a technique which allows efficient parallelization of irregular workloads, and a k...
This paper analyzes the overhead due to false sharing when parallel tasks are scheduled usi...
This paper addresses the problem of efficiently supporting parallelism within a managed runtime. A p...
The fork-join paradigm of concurrent expression has gained popularity in conjunction with work-steal...
We present a work-stealing algorithm for total-store memory architectures, such as Intel's x86, that...
We present a work-stealing algorithm for runtime scheduling of data-parallel operations in...
Work-stealing is a promising approach for effectively exploiting software parallelism on parallel ha...