With the rapid proliferation of Internet of Things (IoT) devices, current low-power wide-area network (LPWAN) technologies will inevitably face performance degradation due to congestion and interference. Rule-based approaches for assigning and adapting device parameters are insufficient in dynamic, massive IoT scenarios. For example, the adaptive data rate (ADR) algorithm in LoRaWAN has been shown to be inefficient and outdated for large-scale IoT networks. Meanwhile, new solutions based on machine learning (ML) and reinforcement learning (RL) techniques have proven effective for resource allocation in dense IoT networks. In this article, we propose a new concept of using two independent learning approaches for allocating spreadin...