There is growing interest in autonomic, self-tuning software that can optimize itself on new platforms without manual intervention. Optimization requires detailed knowledge of the target platform, such as the latency and throughput of instructions, the numbers of registers, and the organization of the memory hierarchy. An autonomic optimization system needs to determine such platform-specific information on its own. In this paper, we describe the design and implementation of X-Ray, a tool that automatically measures a large number of such platform-specific parameters. For some of these parameters, we also describe novel algorithms that are more robust than existing ones. X-Ray is written in C for maximum portability, and it is b...
The gap between peak and delivered performance for scientific applications running on microprocesso...
In the process of hardware optimization, physical queries requiring laboratory experiments are often...
We present accurate, low-level measurements of process preemption, interrupt handling and memory sys...
There is growing interest in self-optimizing computing systems that can optimize their own behavior ...
Embedded processor designs are increasingly based on general-purpose processor families, modified a...
Abstract. The increasing complexity of computer architectures has made the approach of automatically...
The risks of X-ray exposure to the human body are well documented and its link to cancer proven. How...
Next-generation X-ray free electron lasers will be capable of delivering X-rays at a r...
Computers perform different applications in different ways. To characterize an application performan...
Abstract. In this paper an attempt was made to identify optimal hardware configuration for a worksta...
On modern computers, the running time of many applications is dominated by the cost of memory opera...
DARPA’s AACE project aimed to develop Architecture Aware Compiler Environments that automatically c...
An autotuner takes a parameterized code as input and tries to optimize the code by finding the best ...