In this paper, we introduce a model for managing abstract data structures that map to arbitrary distributed memory architectures. It is difficult to achieve scalable performance in data-parallel applications where the programmer manipulates abstract data structures rather than directly manipulating memory. On distributed memory architectures such abstract data-parallel operations may require communication between nodes. Therefore, the underlying system has to handle communication efficiently without any help from the user. Our data model splits data blocks into two sets -- local data and remote data -- and schedules the sub-block by availability at runtime.We implement the described model in DistNumPy -- a high-productivity programming libr...
Programming nonshared memory systems is more difficult than programming shared memory systems, since...
We present the design and implementation of a parallel algorithm for computing Gröbner bases on dist...
Scientists increasingly rely on Python tools to perform scalable distributed memory arrayoperations ...
Increased programmability for concurrent applications in distributed systems requires automatic supp...
Running programs across multiple nodes in a cluster of networked computers, such as in a supercomput...
The data-parallel language High Performance Fortran (HPF) does not allow efficient expression of mix...
Modern open source high-level languages such as R and Python are.increasingly playing an important r...
We consider the problem of efficiently coupling multiple data-parallel programs at runtime. We propo...
Advances in computing and networking infrastructure have enabled an increasing number of application...
Tools commonly leveraged to tackle large-scale data science workflows have traditionally shied away ...
In this paper, we introduce DistNumPy, a library for doing numeri-cal computation in Python that tar...
. Distributed data structures are those that are shared by parallel processes. They provide a flexib...
Distributed memory multiprocessor architectures offer enormous computational power, by exploiting th...
The data science community today has embraced the concept of Dataframes as the de facto standard for...
Parallel programming is hard and programmers still struggle to write code for shared memory multicor...
Programming nonshared memory systems is more difficult than programming shared memory systems, since...
We present the design and implementation of a parallel algorithm for computing Gröbner bases on dist...
Scientists increasingly rely on Python tools to perform scalable distributed memory arrayoperations ...
Increased programmability for concurrent applications in distributed systems requires automatic supp...
Running programs across multiple nodes in a cluster of networked computers, such as in a supercomput...
The data-parallel language High Performance Fortran (HPF) does not allow efficient expression of mix...
Modern open source high-level languages such as R and Python are.increasingly playing an important r...
We consider the problem of efficiently coupling multiple data-parallel programs at runtime. We propo...
Advances in computing and networking infrastructure have enabled an increasing number of application...
Tools commonly leveraged to tackle large-scale data science workflows have traditionally shied away ...
In this paper, we introduce DistNumPy, a library for doing numeri-cal computation in Python that tar...
. Distributed data structures are those that are shared by parallel processes. They provide a flexib...
Distributed memory multiprocessor architectures offer enormous computational power, by exploiting th...
The data science community today has embraced the concept of Dataframes as the de facto standard for...
Parallel programming is hard and programmers still struggle to write code for shared memory multicor...
Programming nonshared memory systems is more difficult than programming shared memory systems, since...
We present the design and implementation of a parallel algorithm for computing Gröbner bases on dist...
Scientists increasingly rely on Python tools to perform scalable distributed memory arrayoperations ...