Abstract: As the number of cores in both embedded Multi-Processor Systems-on-Chip and general-purpose processors keeps rising, on-chip communication becomes increasingly important. To write efficient programs for these architectures, it is therefore necessary to have a good idea of the communication behavior of an application. We present a communication profiler that extracts this behavior from compiled, sequential or parallel C/C++ programs and constructs a dynamic data-flow graph at the level of major functional blocks. In contrast to existing methods of measuring inter-program communication, our tool automatically generates the program’s data-flow graph and is less demanding for the developer. It can also be used to view differ...
Executions of modern parallel programs often yield complex communications among compute nodes of lar...
In this paper we describe a compiler framework which can identify communication patterns for MPI-bas...
Recent trends show a steady increase in the utilization of heterogeneous multicore architectures in ...
While the number of cores in both embedded Multi-Processor Systems-on-Chip and general purpose proce...
While the number of cores in both embedded MultiProcessor Systems-on-Chip and general purpose proces...
Trying to make use of the inherent parallelization offered by multicore architectures, partitioning ...
The growing demand for processing power is being satisfied mainly by an increase in the number of hom...
Though transistor scaling yields more transistors per chip, the consistent performance gain...
Embedded system synthesis, multiprocessor synthesis, and thread assignment policy design all require...
The performance of massively parallel programs is often impacted by the cost of communication across ...
Multicomputers (distributed memory MIMD machines) have emerged as inexpensive, yet powerful parallel...
Modern parallel computing platforms exhibit substantial variation in communicat...
Moving data between processes has often been discussed as one of the major bottlenecks in parallel c...
Current and future supercomputers have tens of thousands of compute nodes interconnected with high-d...
The task graph of telecommunication applications often exhibits massive coarse...