A new decentralised demand response (DR) model relying on bi-directional communications is developed in this study. In this model, each user is considered as an agent that submits its bids according to the consumption urgency and a set of parameters defined by a reinforcement learning algorithm called Q-learning. The bids are sent to a local DR market, which is responsible for communicating all bids to the wholesale market and the system operator (SO), reporting to the customers after determining the local DR market clearing price. From local markets’ viewpoint, the goal is to maximise social welfare. Four DR levels are considered to evaluate the effect of different DR portions in the cost of the electricity purchase. The outcomes are compa...