\u3cp\u3eWith the extraordinary growth in the number of cores and threads in today's multithreaded machines, analyzing and tuning the performance of such platforms has become a challenging task. In this paper, we propose an intuitive and visualizable model for analyzing the performance of contemporary highly concurrent multithreaded machines. Based on flow balancing between service demand and service supply of the memory system, the model draws an intuitive figure to characterize machine state, identify bottlenecks, and determine optimization directions. The model is tractable, as it requires only two parameters from the workload. Our model achieves 90% and 83% prediction accuracy for computation throughput on Fermi and Kepler GPUs over ...