Shared-memory multi-processor/multi-core machines have become a reference for many application contexts. In particular, the recent literature on speculative parallel discrete event simulation has reshuffled the architectural organization of simulation systems in order to deeply exploit the main features of this type of machines. A core aspect dealt with has been the full sharing of the workload at the level of individual simulation events, which enables keeping the rollback incidence minimal. However, making each worker thread continuously switch its execution between events destined to different simulation objects does not favor locality. This problem appears even more evident in the case of Non-Uniform-Memory-Access (NUMA) machines, w...