TY - GEN
T1 - Real-time Architecture for Audio-Visual Active Speaker Detection
AU - Huang, Min
AU - Wang, Wen
AU - Lin, Zheyuan
AU - Tesema, Fiseha B.
AU - Ji, Shanshan
AU - Gu, Jason
AU - Wan, Minhong
AU - Song, Wei
AU - Li, Te
AU - Zhu, Shiqiang
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Continuously measuring the speaking state of users interacting with a robot in a human-robot interaction (HRI) system improves metrics of interaction quality. Meanwhile, mainstream active speaker detection (ASD) algorithms emphasize achieving high frame-level AUC on the AVA-Active Speaker dataset and pay less attention to real-time performance in robotic systems. In this paper, we propose a model named FSDNet that maintains a high AUC score on the AVA-Active Speaker dataset while reducing time cost: our model increases the AUC score by 0.1% compared with the state of the art and needs only 75% of the running time. Furthermore, we put forward an architecture with a time-related prediction function to make our algorithm more effective and generative in interactive robotic systems. The code is released at https://github.com/huangmin9966/FSDNet-RealTimeArch.
AB - Continuously measuring the speaking state of users interacting with a robot in a human-robot interaction (HRI) system improves metrics of interaction quality. Meanwhile, mainstream active speaker detection (ASD) algorithms emphasize achieving high frame-level AUC on the AVA-Active Speaker dataset and pay less attention to real-time performance in robotic systems. In this paper, we propose a model named FSDNet that maintains a high AUC score on the AVA-Active Speaker dataset while reducing time cost: our model increases the AUC score by 0.1% compared with the state of the art and needs only 75% of the running time. Furthermore, we put forward an architecture with a time-related prediction function to make our algorithm more effective and generative in interactive robotic systems. The code is released at https://github.com/huangmin9966/FSDNet-RealTimeArch.
UR - http://www.scopus.com/inward/record.url?scp=85147334252&partnerID=8YFLogxK
U2 - 10.1109/ROBIO55434.2022.10011692
DO - 10.1109/ROBIO55434.2022.10011692
M3 - Conference contribution
AN - SCOPUS:85147334252
T3 - 2022 IEEE International Conference on Robotics and Biomimetics, ROBIO 2022
SP - 1377
EP - 1382
BT - 2022 IEEE International Conference on Robotics and Biomimetics, ROBIO 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE International Conference on Robotics and Biomimetics, ROBIO 2022
Y2 - 5 December 2022 through 9 December 2022
ER -