TY - GEN
T1 - Instrument-tissue Interaction Quintuple Detection in Surgery Videos
AU - Lin, Wenjun
AU - Hu, Yan
AU - Hao, Luoying
AU - Zhou, Dan
AU - Yang, Mingming
AU - Fu, Huazhu
AU - Chui, Cheekong
AU - Liu, Jiang
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
AB - Instrument-tissue interaction detection in surgical videos is a fundamental problem for surgical scene understanding, which is of great significance to computer-assisted surgery. However, few works focus on this fine-grained surgical activity representation. In this paper, we propose to represent instrument-tissue interaction as ⟨instrument bounding box, tissue bounding box, instrument class, tissue class, action class⟩ quintuples. We present a novel quintuple detection network (QDNet) for the instrument-tissue interaction quintuple detection task in cataract surgery videos. Specifically, a spatiotemporal attention layer (STAL) is proposed to aggregate spatial and temporal information of the regions of interest between adjacent frames. We also propose a graph-based quintuple prediction layer (GQPL) to reason about the relationship between instruments and tissues. Our method achieves an mAP of 42.24% on a cataract surgery video dataset, significantly outperforming other methods.
KW - Instrument-tissue interaction quintuple detection
KW - Surgery video
KW - Surgical scene understanding
UR - http://www.scopus.com/inward/record.url?scp=85139086379&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-16449-1_38
DO - 10.1007/978-3-031-16449-1_38
M3 - Conference contribution
AN - SCOPUS:85139086379
SN - 9783031164484
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 399
EP - 409
BT - Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 - 25th International Conference, Proceedings
A2 - Wang, Linwei
A2 - Dou, Qi
A2 - Fletcher, P. Thomas
A2 - Speidel, Stefanie
A2 - Li, Shuo
PB - Springer Science and Business Media Deutschland GmbH
T2 - 25th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2022
Y2 - 18 September 2022 through 22 September 2022
ER -