Numerous studies have been published during the past two decades that use simulation models to assess crop yield gaps (quantified as the difference between potential and actual farm yields), impact of climate change on future crop yields, and land-use change. However, there is a wide range in quality and spatial and temporal scale and resolution of climate and soil data underpinning these studies, as well as widely differing assumptions about cropping-system context and crop model calibration. Here we present an explicit rationale and methodology for selecting data sources for simulating crop yields and estimating yield gaps at specific locations that can be applied across widely different levels of data availability and quality. The method consis...