SOURCE IEEE Trans. Cognitive Commun. Networking, 2021, 7(4):1430-1443
Published Date: July 2021
In future wireless systems, information latency must be minimized to satisfy the requirements of many mission-critical applications. However, not all terminals carry equally urgent packets, because their situations differ, e.g., in status freshness. Leveraging this feature, we propose an on-demand Medium Access Control (MAC) scheme in which each terminal transmits with dynamically adjusted aggressiveness based on its situations, which are modeled as Markov states. A Multi-Agent Reinforcement Learning (MARL) framework is used, and each agent is trained with a Deep Deterministic Policy Gradient (DDPG) network. MARL is known to converge slowly and to scale poorly with the number of agents; to address this, a new Situationally-aware MARL-based Transmissions (SMART) scheme is proposed. It is shown that SMART significantly shortens the convergence time and substantially improves the converged performance compared with state-of-the-art DDPG-based MARL schemes, at the expense of an additional offline training stage.
SMART also significantly outperforms conventional MAC schemes, e.g., Carrier Sense Multiple Access (CSMA), in terms of average and peak Age of Information (AoI). In addition, SMART is versatile: different Quality-of-Service (QoS) metrics, and hence various state space definitions, are tested in extensive simulations, and SMART shows robustness and scalability in all considered scenarios.