Effective and automated verification techniques able to provide assurances of performance and scalability are highly demanded in the context of microservices systems. In this paper, we introduce a methodology that applies specification-driven load testing to learn the behavior of the target microservices system under multiple deployment configurations. Testing is driven by realistic workload conditions sampled in production. The sampling produces a formal description of the users' behavior through a Discrete Time Markov Chain. This model drives multiple load testing sessions that query the system under test and feed a Bayesian inference process which incrementally refines the initial model to obtain a complete specification from run-time ev...