Evaluating the models we use for prediction is important because it allows us to identify uncertainties in prediction and helps to guide the priorities for model development. This paper describes a set of benchmark tests designed to quantify the performance of the Joint UK Land Environment Simulator (JULES), the land surface model used in the UK Hadley Centre General Circulation Model. The tests assess the ability of the model to reproduce the observed fluxes of water and carbon at global and regional spatial scales, and on a seasonal basis. Five datasets are used to test the model: water and carbon dioxide fluxes from ten FLUXNET sites covering the major global biomes, atmospheric carbon dioxide concentrations at f...