We present a framework for the gradual improvement of model-based controllers. The total time of the learning procedure is divided into a number of learning intervals. After each learning interval, the model is refined based on the measured data. This refined model is used to synthesize the controller that will be applied during the next learning interval. Excitation signals can be injected into the control loop during each of the learning intervals. On the one hand, the introduction of an excitation signal worsens the control performance during the current learning interval since it acts as a disturbance. On the other hand, the informative data generated by the excitation signal are used to refine the model using a closed-loop system identifica...
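The interval-wise procedure described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the framework itself: a scalar first-order plant, white Gaussian excitation, ordinary least squares as the identification step, and a deadbeat gain as the controller synthesis step. The function name `run_learning` and all parameter names are hypothetical.

```python
import numpy as np


def run_learning(n_intervals=5, steps=200, excitation_std=0.5, seed=0):
    """Toy interval-wise learning loop (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Hypothetical true plant x[k+1] = a*x[k] + b*u[k] + w[k], unknown to the controller.
    a_true, b_true = 0.9, 0.5
    # Initial (inaccurate) model used to design the first controller.
    a_hat, b_hat = 0.5, 1.0

    x = 0.0
    regressors, targets = [], []  # closed-loop data accumulated over all intervals
    history = []

    for _ in range(n_intervals):
        # Controller synthesis from the current model (assumed deadbeat design).
        gain = a_hat / b_hat
        cost = 0.0

        for _ in range(steps):
            # The excitation signal is added to the control input; it acts as a
            # disturbance but makes the closed-loop data informative.
            r = excitation_std * rng.standard_normal()
            u = -gain * x + r
            x_next = a_true * x + b_true * u + 0.05 * rng.standard_normal()

            regressors.append([x, u])
            targets.append(x_next)
            cost += x_next ** 2
            x = x_next

        # Model refinement after the interval: least-squares fit of (a, b).
        theta, *_ = np.linalg.lstsq(
            np.array(regressors), np.array(targets), rcond=None
        )
        a_hat, b_hat = float(theta[0]), float(theta[1])
        history.append((a_hat, b_hat, cost / steps))

    return history


if __name__ == "__main__":
    for i, (a_hat, b_hat, avg_cost) in enumerate(run_learning(), start=1):
        print(f"interval {i}: a_hat={a_hat:.3f}, b_hat={b_hat:.3f}, avg cost={avg_cost:.4f}")
```

In this toy setting, increasing `excitation_std` raises the average cost within an interval while tightening the parameter estimates available for the next one, which is the trade-off described above. The sketch uses a fixed excitation level and does not attempt to choose it optimally.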