Exploration-Enhanced Dueling Double Deep Q-Network with Random Network Distillation for Satellite Beam Selection

Authors: Zijing Cheng, Chuxiong Sun, Shuaijun Liu, and Lixiang Liu
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 1153-1170
Keywords: Multi-Beam Satellite, Deep Reinforcement Learning, Random Network Distillation, Beam Selection.

Abstract

With the evolution of satellite communication systems towards low-latency and high-throughput performance, dynamic beam resource scheduling emerges as a challenging sequential decision-making task that can be effectively tackled using deep reinforcement learning (DRL). However, owing to the sparse channel characteristics and complex multi-user interference in satellite communications, traditional DRL methods struggle to obtain effective learning signals during exploration, resulting in suboptimal resource allocation efficiency. To address this challenge, we propose Beam selection with Integrated RND (BIRD), a novel framework that combines the Dueling Double Deep Q-Network (DQN) architecture with Random Network Distillation (RND) to enhance exploration capabilities in sparse state spaces. Our main innovations include an enhanced solution framework that integrates a Dueling DQN-based value evaluation architecture with the RND mechanism to improve exploration efficiency through intrinsic rewards. Additionally, we develop a novel Markov Decision Process (MDP) model that formalizes beam selection as a sequential decision problem. Simulation results demonstrate that BIRD achieves a significant 24.1% improvement in system sum rate compared to traditional beam selection methods.
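
The three ingredients named in the abstract (dueling value aggregation, the Double DQN target, and an RND intrinsic reward) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear "networks", the function names, the state encoding, and the mixing weight `beta` are all assumptions made for illustration, and the paper's actual architecture and hyperparameters are not given in the abstract.

```python
import random

def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN: the online net picks the action, the target net scores it."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best]

def make_random_linear(in_dim, out_dim, rng):
    """Hypothetical stand-in for a network: a random linear map."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
            for _ in range(out_dim)]

def apply_linear(weights, state):
    return [sum(w * s for w, s in zip(row, state)) for row in weights]

def rnd_intrinsic_reward(target_w, predictor_w, state):
    """RND bonus: mean squared error between the fixed random target
    and the trained predictor on the same state."""
    t = apply_linear(target_w, state)
    p = apply_linear(predictor_w, state)
    return sum((a - b) ** 2 for a, b in zip(t, p)) / len(t)

def train_predictor_step(target_w, predictor_w, state, lr):
    """One gradient-descent step on the RND error; only the predictor
    is updated, the target stays frozen."""
    t = apply_linear(target_w, state)
    p = apply_linear(predictor_w, state)
    n = len(t)
    for i, row in enumerate(predictor_w):
        err = p[i] - t[i]
        for j in range(len(row)):
            row[j] -= lr * 2.0 * err * state[j] / n

def total_reward(extrinsic, intrinsic, beta):
    """Assumed mixing rule: extrinsic reward plus a scaled exploration bonus."""
    return extrinsic + beta * intrinsic

if __name__ == "__main__":
    rng = random.Random(0)
    state = [1.0, 0.0, 1.0, 0.0]           # hypothetical beam-state encoding
    target_w = make_random_linear(4, 3, rng)
    predictor_w = make_random_linear(4, 3, rng)

    bonus_first_visit = rnd_intrinsic_reward(target_w, predictor_w, state)
    for _ in range(50):                     # the state is visited repeatedly
        train_predictor_step(target_w, predictor_w, state, lr=0.1)
    bonus_after_visits = rnd_intrinsic_reward(target_w, predictor_w, state)

    print("bonus on first visit:", bonus_first_visit)
    print("bonus after 50 visits:", bonus_after_visits)
    print("Q-values:", dueling_q(1.0, [1.0, 2.0, 3.0]))
```

In this sketch the intrinsic bonus shrinks as the predictor is trained on a repeatedly visited state, so rarely visited states keep a larger bonus. That is the sense in which an RND term can supply a learning signal when extrinsic rewards are sparse.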