As the number of cores in both embedded Multi-Processor Systems-on-Chip and general purpose processors keeps rising, on-chip communication becomes increasingly important. To write efficient programs for these architectures, developers therefore need a good understanding of an application's communication behavior. We present a communication profiler that extracts this behavior from compiled, parallel or sequential C/C++ programs and constructs a dynamic data-flow graph at the level of major functional blocks. In contrast to existing methods of measuring inter-program communication, our tool generates the program's data-flow graph automatically and demands less of the developer. It can also be used to view differences...
The task graph of telecommunication applications often exhibits massive coarse...
In this paper we describe a compiler framework which can identify communication patterns for MPI-bas...
Current and future supercomputers have tens of thousands of compute nodes interconnected with high-d...
While the number of cores in both embedded Multi-Processor Systems-on-Chip and general purpose proce...
Abstract—As the number of cores in both embedded Multi-Processor Systems-on-Chip and general purpose...
While the number of cores in both embedded MultiProcessor Systems-on-Chip and general purpose proces...
Trying to make use of the inherent parallelization offered by multicore architectures, partitioning ...
The growing demand of processing power is being satisfied mainly by an increase in the number of hom...
Multicomputers (distributed memory MIMD machines) have emerged as inexpensive, yet powerful parallel...
Though transistor scaling yields more transistors per chip, the consistent performance gain...
The performance of massively parallel programs is often impacted by the cost of communication across ...
Embedded system synthesis, multiprocessor synthesis, and thread assignment policy design all require...
I have developed a simple program which exercises communication in concurrent processor operating sy...
Moving data between processes has often been discussed as one of the major bottlenecks in parallel c...
Modern parallel computing platforms exhibit substantial variation in communicat...