In this paper, we explore mid-level image representations for real-time heart view plane classification of 2D echocardiogram ultrasound images. The proposed representations rely on bags of visual words, successfully used by the computer vision community in visual recognition problems. An important element of the proposed representations is the image sampling with large regions, drastically reducing the execution time of the image characterization procedure. Throughout an extensive set of experiments, we evaluate the proposed approach against different image descriptors for classifying four heart view planes. The results show that our approach is effective and efficient for the target problem, making it suitable for use in real-time setups. ...