Abstract: We examine the Xeon Phi, which is based on Intel’s Many Integrated Cores architecture, for its suitability to run the FDK algorithm, the most commonly used algorithm for 3D image reconstruction in cone-beam computed tomography. We study the challenges of efficiently parallelizing the application and means to enable sensible data sharing between threads despite the lack of a shared last level cache. Apart from parallelization, SIMD vectorization is critical for good performance on the Xeon Phi; we perform various micro-benchmarks to investigate the platform’s new set of vector instructions and put a special emphasis on the newly introduced vector gather capability. We refine a previous performance model for the applicatio...
Graph500 is a data intensive application for high performance computing and it is an increasingly im...
Hardware accelerators are currently becoming increasingly important in boosting high performance com...
Recently there has been a lot of interest in improving the infrastructure used in medical applicatio...
The computational effort of 3D image reconstruction in Computed Tomography (CT) has required special...
Efficiently exploiting SIMD vector units is one of the most important aspects in achieving high perf...
Abstract: We investigate and characterize the performance of an important class of operations on GPUs...
Single Instruction, Multiple Data (SIMD) vectorization is a major driver of performance in current a...
Intel's Xeon Phi combines the parallel processing power of a many-core accelerator with the programm...
This thesis is dedicated to the implementation of high performance algorithms on the Intel Xeon Phi ...
Nowadays, the simulation of ultrasound acoustic waves has a wide range of practical uses. As one of...
The paper presents a speed-up of the k-means algorithm for image segmentation. This speed-up is achiev...
With the increasing size and complexity of data produced by large scale numerical simulations, it is...
MapReduce has become one of the most popular frameworks for building big-data applications. It was or...
Abstract: With its ease of programming, flexibility, and efficiency, MapReduce has become one of ...
At the LHC, particles are collided in order to understand how the universe was created. Those collis...