TY - GEN
T1 - DRL-Aided Joint Resource Block and Beamforming Management for Cellular-Connected UAVs
AU - Li, Yuanjian
AU - Sellathurai, Mathini
AU - Chu, Zheng
AU - Xiao, Pei
AU - Aghvami, A. Hamid
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - In this paper, we investigate a cellular-connected unmanned aerial vehicle (UAV) network, where multiple UAVs receive messages from base stations (BSs) in the downlink while the BSs simultaneously serve their paired ground user equipments (UEs). To effectively manage inter-cell interference (ICI) among UEs caused by intense reuse of time-frequency resource blocks (RBs), a first p-tier based RB coordination criterion is adopted. Then, to enhance wireless transmission quality for UAVs while protecting terrestrial UEs from interference caused by ground-to-air (G2A) transmissions, a radio resource management (RRM) problem of joint dynamic RB coordination and time-varying beamforming design is formulated to minimize the UAVs' ergodic outage duration (EOD). Because conventional optimization techniques are inefficient at solving the formulated RRM problem, a deep reinforcement learning (DRL)-aided solution is proposed, in which a deep double duelling Q network (D3QN) and twin delayed deep deterministic policy gradient (TD3) handle RB coordination in the discrete action domain and beamforming design in the continuous action domain, respectively. Numerical results illustrate the effectiveness of the proposed hybrid D3QN-TD3 algorithm compared with representative baselines.
AB - In this paper, we investigate a cellular-connected unmanned aerial vehicle (UAV) network, where multiple UAVs receive messages from base stations (BSs) in the downlink while the BSs simultaneously serve their paired ground user equipments (UEs). To effectively manage inter-cell interference (ICI) among UEs caused by intense reuse of time-frequency resource blocks (RBs), a first p-tier based RB coordination criterion is adopted. Then, to enhance wireless transmission quality for UAVs while protecting terrestrial UEs from interference caused by ground-to-air (G2A) transmissions, a radio resource management (RRM) problem of joint dynamic RB coordination and time-varying beamforming design is formulated to minimize the UAVs' ergodic outage duration (EOD). Because conventional optimization techniques are inefficient at solving the formulated RRM problem, a deep reinforcement learning (DRL)-aided solution is proposed, in which a deep double duelling Q network (D3QN) and twin delayed deep deterministic policy gradient (TD3) handle RB coordination in the discrete action domain and beamforming design in the continuous action domain, respectively. Numerical results illustrate the effectiveness of the proposed hybrid D3QN-TD3 algorithm compared with representative baselines.
UR - http://www.scopus.com/inward/record.url?scp=85187388016&partnerID=8YFLogxK
U2 - 10.1109/GLOBECOM54140.2023.10437176
DO - 10.1109/GLOBECOM54140.2023.10437176
M3 - Conference contribution
AN - SCOPUS:85187388016
T3 - Proceedings - IEEE Global Communications Conference, GLOBECOM
SP - 3045
EP - 3050
BT - GLOBECOM 2023 - 2023 IEEE Global Communications Conference
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE Global Communications Conference, GLOBECOM 2023
Y2 - 4 December 2023 through 8 December 2023
ER -