TY - GEN
T1 - TCEIP
T2 - 26th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2023
AU - Yang, Xinquan
AU - Xie, Jinheng
AU - Li, Xuguang
AU - Li, Xuechen
AU - Li, Xin
AU - Shen, Linlin
AU - Deng, Yongqiang
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023.
PY - 2023
Y1 - 2023
N2 - When deep neural network has been proposed to assist the dentist in designing the location of dental implant, most of them are targeting simple cases where only one missing tooth is available. As a result, literature works do not work well when there are multiple missing teeth and easily generate false predictions when the teeth are sparsely distributed. In this paper, we are trying to integrate a weak supervision text, the target region, to the implant position regression network, to address above issues. We propose a text condition embedded implant position regression network (TCEIP), to embed the text condition into the encoder-decoder framework for improvement of the regression performance. A cross-modal interaction that consists of cross-modal attention (CMA) and knowledge alignment module (KAM) is proposed to facilitate the interaction between features of images and texts. The CMA module performs a cross-attention between the image feature and the text condition, and the KAM mitigates the knowledge gap between the image feature and the image encoder of the CLIP. Extensive experiments on a dental implant dataset through five-fold cross-validation demonstrated that the proposed TCEIP achieves superior performance than existing methods.
AB - When deep neural network has been proposed to assist the dentist in designing the location of dental implant, most of them are targeting simple cases where only one missing tooth is available. As a result, literature works do not work well when there are multiple missing teeth and easily generate false predictions when the teeth are sparsely distributed. In this paper, we are trying to integrate a weak supervision text, the target region, to the implant position regression network, to address above issues. We propose a text condition embedded implant position regression network (TCEIP), to embed the text condition into the encoder-decoder framework for improvement of the regression performance. A cross-modal interaction that consists of cross-modal attention (CMA) and knowledge alignment module (KAM) is proposed to facilitate the interaction between features of images and texts. The CMA module performs a cross-attention between the image feature and the text condition, and the KAM mitigates the knowledge gap between the image feature and the image encoder of the CLIP. Extensive experiments on a dental implant dataset through five-fold cross-validation demonstrated that the proposed TCEIP achieves superior performance than existing methods.
KW - Cross-Modal Interaction
KW - Deep Learning
KW - Dental Implant
KW - Text Guided Detection
UR - http://www.scopus.com/inward/record.url?scp=85174679731&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-43987-2_31
DO - 10.1007/978-3-031-43987-2_31
M3 - Conference contribution
AN - SCOPUS:85174679731
SN - 9783031439865
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 317
EP - 326
BT - Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 - 26th International Conference, Proceedings
A2 - Greenspan, Hayit
A2 - Greenspan, Hayit
A2 - Madabhushi, Anant
A2 - Mousavi, Parvin
A2 - Salcudean, Septimiu
A2 - Duncan, James
A2 - Syeda-Mahmood, Tanveer
A2 - Taylor, Russell
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 8 October 2023 through 12 October 2023
ER -