This article takes the first steps towards building a theory to explain how students interact with program visualizations when learning programming. First, we present the findings of a previous study we conducted to investigate how students voluntarily and regularly engaged with the program visualization tool VIP in a three-month programming course. Then, we interpret the empirical results of the study using Activity Theory. Finally, based on the interpretation, we propose two research hypotheses about students' long-term engagement with VIP. These hypotheses also set guidelines for future research on visualizations and teaching with visualizations, but most importantly, they offer a sustainable basis for further work on the theorization of...