Systems with a large shared memory are becoming increasingly popular in parallel computer architecture. A shared memory system can be either a uniform memory architecture (UMA) or a cache-coherent non-uniform memory architecture (cc-NUMA). This thesis studies the performance of parallel PDE solvers on cc-NUMA computers. In particular, we consider the shared namespace programming model, represented by OpenMP. Because the main memory is physically, or geographically, distributed over several multi-processor nodes, the latency of local memory accesses is lower than that of remote accesses. The geographical locality of the data therefore becomes important. The focus of the present thesis is to study multithreaded PDE...
Abstract. OpenMP has become the dominant standard for shared memory programming. It is traditionall...
Non-uniform memory access (NUMA) architectures are modern shared-memory, multi-core machines offerin...
Although there exist several approaches to rapidly solving the N-body problem, and a diversity of im...
The current trend in parallel computers is that systems with a large shared memory are becoming more...
We introduce shared-memory parallelism in a parallel distributed-memory solver...
Scientific computing is used frequently in an increasing number of disciplines to accelerate scienti...
Data locality is a well-recognized requirement for the development of any parallel application, but ...
Computer simulations that solve partial differential equations (PDEs) are common in many fields of s...
Cache Coherent Non-Uniform Memory Access (CC-NUMA) architectures have received strong interest from...
Due to their excellent price-performance ratio, clusters built from commodity nodes have become broa...
A common feature of many scalable parallel machines is non-uniform memory access (NUMA) --- data acc...
In scalable multiprocessor architectures, the times required for a processor to access various porti...
This paper compares data distribution methodologies for scaling the performance of OpenMP on NUMA ar...
Locality of computation is key to obtaining high performance on a broad variety of parallel architec...
The gap between processor speed and memory latency has led to the use of caches in the memory system...