This thesis work is funded by the ANR PetaQCD project. We have mainly worked on two topics in GPU performance analysis. First, we designed an approach that is simple enough for developers to use and that provides more insight into performance results. Second, we designed an approach to estimate the performance upper bound of an application on GPUs and to guide performance optimization. The first part of the thesis work was presented at the Rapido '12 workshop. We developed an analytical method and a timing estimation tool (TEG) to predict the performance of CUDA applications on GT200-generation GPUs. TEG parses the assembly code of GPU kernels and collects information including instruction types, operands, etc. Then TEG can predict GPU applications'...