Building on Item Response Theory, we introduce students’ optimal behavior in multiple-choice tests. Our simulations indicate that the optimal penalty is relatively high: although correction for guessing discriminates against risk-averse subjects, this effect is small compared with the measurement error that the penalty prevents. This result obtains when knowledge is binary or partial, under different normalizations of the score, when risk aversion is related to knowledge, and when there is a pass-fail break point. We also find that the mean degree of difficulty should be close to the mean level of knowledge and that the variance of difficulty should be high.

Financial support from Spanish Ministry of Education and Science (grant SEJ200...