Authors: Iacovos Ioannou (CYENS; European University of Cyprus); N. Prabagarane (SSN); Marios Raspopoullos (INSPIRE Research Centre, University of Central Lancashire, Larnaca, Cyprus); Vicky Papadopoulou-Lesta (European University of Cyprus); Christophoros Christophorou (CYENS; UCLan Cyprus); Vasos Vassiliou (CYENS - Centre of Excellence, 1678 Nicosia, Cyprus)
Reconfigurable intelligent surfaces (RIS) enable control of radio propagation via large arrays of passive reflecting elements. Optimizing RIS phase profiles for spectral efficiency is challenging due to high-dimensional continuous actions and non-convex channel coupling. We cast RIS beamforming as a sequential decision problem and evaluate four reinforcement-learning (RL) agents—Advantage Actor–Critic (A2C), Graph-Neural-Network Proximal Policy Optimization (GNN-PPO), Soft Actor–Critic (SAC), and Quantile-Regression PPO (QR-PPO)—in a realistic simulator with mobility, dual-slope log-distance path loss, shadowing, and Rician fading. Using a common protocol and PCA/GNN feature extraction, we compare agents on \textbf{rate} (mean and variability), \textbf{tail risk} via the conditional value at risk (CVaR) at 5\%, mean SNR, and wall-clock cost. \textbf{GNN-PPO} attains the best mean rate, the \emph{lowest} variability, the \emph{highest} CVaR at 5\% (strong tail performance), and the highest mean SNR. \textbf{A2C} is the compute-efficiency winner with the shortest total time, \textbf{SAC} provides a balanced compromise, while \textbf{QR-PPO} is cost-inefficient and underperforms in the tails under our configuration. We discuss design insights and directions for scalable, risk-aware RIS control.
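For reference, CVaR at 5\% here denotes the standard lower-tail conditional value at risk of the achievable rate; a minimal formulation, assuming the metric is taken over per-sample rates $R$ with $\alpha = 0.05$ (so a higher value indicates better worst-case rates):
\[
\mathrm{VaR}_{\alpha}(R) = \inf\{\, r : \Pr[R \le r] \ge \alpha \,\}, \qquad
\mathrm{CVaR}_{\alpha}(R) = \mathbb{E}\!\left[\, R \mid R \le \mathrm{VaR}_{\alpha}(R) \,\right].
\]
In practice this is estimated empirically as the mean of the lowest $\alpha$-fraction of observed rate samples.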
Keywords: Reconfigurable intelligent surfaces, reinforcement learning, deep learning, wireless communications, 6G, beamforming, GNN-PPO, SAC, CVaR
Published in: 2024 Asian Conference on Communication and Networks (ASIANComNet)
Date of Publication: --
DOI: -
Publisher: IEEE