In this paper we consider a measure-theoretical formulation of the training of NeurODEs in the form of a mean-field optimal control problem with $L^2$-regularization of the control. We derive first-order optimality conditions for the NeurODE training problem in the form of a mean-field maximum principle, and show that it admits a unique control solution, which is Lipschitz continuous in time. As a consequence of this uniqueness property, the mean-field maximum principle also provides a strong quantitative generalization error estimate for finite-sample approximations, yielding a rigorous justification of a phenomenon that we call \textit{coupled descent}, that is, the simultaneous decrease of generalization and training errors. We co...