Automatic Step Recognition with Video and Kinematic Data for Intelligent Operating Room and Beyond

Chin Boon Chng; Wenjun Lin; Yaxin Hu; Yan Hu; Jiang Liu; Chee Kong Chui

doi:10.1145/3628797.3628999

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

Abstract

With the continuous development of intelligent operating room systems, the segmentation and automatic recognition of surgical workflow have become challenging research fields. In recent years, an increasing number of models have been proposed to address this challenge, with deep learning becoming the mainstream approach. In this paper, we propose a multi-stage network for surgical step recognition by using surgical video and kinematic data. Firstly, a convolutional neural network (ResNet34) is used to extract visual features from video frames. Next, since surgical videos are a form of sequential data, a Temporal Convolutional Network (TCN) is employed as a temporal extractor to process temporal information between video frames for classification. Finally, a multi-stage TCN network, consisting of Encoder-Decoder TCN and Dilated TCN architectures, is used to refine the result. The proposed network is compared against an LSTM network from our prior work and is evaluated on a surgical dataset named MISAW in two modes - video data with and without kinematic data. Experimental results indicate that kinematic data is crucial for robot motion control in the operating rooms of the future. The technology will also find application in robotic labs for the development and optimization of chemical manufacturing processes.
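
As a rough illustration of the dilated temporal convolution idea mentioned in the abstract (not the authors' implementation), the following pure-Python sketch applies a centered, zero-padded 1D convolution along the time axis and computes how many frames one output can see after stacking layers. The function names, the kernel weights, and the doubling dilation schedule (1, 2, 4, ...) are assumptions chosen for illustration; the abstract does not specify them.

```python
from typing import List


def dilated_conv1d(seq: List[float], kernel: List[float], dilation: int) -> List[float]:
    """Same-length 1D convolution over time (centered taps, zero padding).

    `seq` stands in for one feature channel sampled once per video frame.
    """
    half = (len(kernel) - 1) // 2
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` frames apart around frame t.
            idx = t + (j - half) * dilation
            if 0 <= idx < len(seq):
                acc += w * seq[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, num_layers: int) -> int:
    """Frames covered by one output after layers with dilations 1, 2, 4, ...

    Each layer with dilation d widens the view by (kernel_size - 1) * d frames.
    This doubling schedule is an assumption, not taken from the paper.
    """
    return 1 + (kernel_size - 1) * sum(2 ** i for i in range(num_layers))


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    # With dilation 2, a single frame influences frames two steps away.
    print(dilated_conv1d(impulse, [1, 1, 1], dilation=2))
    print(receptive_field(kernel_size=3, num_layers=4))
```

Stacking a few such layers lets each per-frame prediction draw on a long temporal window without a correspondingly large kernel, which is the usual motivation for dilated layers in sequence labelling tasks such as step recognition.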

Original language	English
Title of host publication	SOICT 2023 - 12th International Symposium on Information and Communication Technology
Publisher	Association for Computing Machinery
Pages	599-606
Number of pages	8
ISBN (Electronic)	9798400708916
DOIs	https://doi.org/10.1145/3628797.3628999
Publication status	Published - 7 Dec 2023
Externally published	Yes
Event	12th International Symposium on Information and Communication Technology, SOICT 2023 - Ho Chi Minh City, Viet Nam; Duration: 7 Dec 2023 → 8 Dec 2023

Publication series

Name	ACM International Conference Proceeding Series

Conference

Conference	12th International Symposium on Information and Communication Technology, SOICT 2023
Country/Territory	Viet Nam
City	Ho Chi Minh City
Period	7/12/23 → 8/12/23

Keywords

Automation
Intelligent Operating Room
Multi-stage Model
Step Recognition
Surgical Robotics
Temporal Convolutional Networks

ASJC Scopus subject areas

Human-Computer Interaction
Computer Networks and Communications
Computer Vision and Pattern Recognition
Software

Access to Document

10.1145/3628797.3628999

Cite this

Chng, C. B., Lin, W., Hu, Y., Hu, Y., Liu, J., & Chui, C. K. (2023). Automatic Step Recognition with Video and Kinematic Data for Intelligent Operating Room and Beyond. In SOICT 2023 - 12th International Symposium on Information and Communication Technology (pp. 599-606). (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/3628797.3628999

@inproceedings{2701f9545c514e409a38aea4e4dece82,

title = "Automatic Step Recognition with Video and Kinematic Data for Intelligent Operating Room and Beyond",

abstract = "With the continuous development of intelligent operating room systems, the segmentation and automatic recognition of surgical workflow have become challenging research fields. In recent years, an increasing number of models have been proposed to address this challenge, with deep learning becoming the mainstream approach. In this paper, we propose a multi-stage network for surgical step recognition by using surgical video and kinematic data. Firstly, a convolutional neural network (ResNet34) is used to extract visual features from video frames. Next, since surgical videos are a form of sequential data, a Temporal Convolutional Network (TCN) is employed as a temporal extractor to process temporal information between video frames for classification. Finally, a multi-stage TCN network, consisting of Encoder-Decoder TCN and Dilated TCN architectures, is used to refine the result. The proposed network is compared against an LSTM network from our prior work and is evaluated on a surgical dataset named MISAW in two modes - video data with and without kinematic data. Experimental results indicate that kinematic data is crucial for robot motion control in the operating rooms of the future. The technology will also find application in robotic labs for the development and optimization of chemical manufacturing processes.",

keywords = "Automation, Intelligent Operating Room, Multi-stage Model, Step Recognition, Surgical Robotics, Temporal Convolutional Networks",

author = "Chng, {Chin Boon} and Wenjun Lin and Yaxin Hu and Yan Hu and Jiang Liu and Chui, {Chee Kong}",

note = "Publisher Copyright: {\textcopyright} 2023 Owner/Author.; 12th International Symposium on Information and Communication Technology, SOICT 2023 ; Conference date: 07-12-2023 Through 08-12-2023",

year = "2023",

month = dec,

day = "7",

doi = "10.1145/3628797.3628999",

language = "English",

series = "ACM International Conference Proceeding Series",

publisher = "Association for Computing Machinery",

pages = "599--606",

booktitle = "SOICT 2023 - 12th International Symposium on Information and Communication Technology",

}

Chng, CB, Lin, W, Hu, Y, Hu, Y, Liu, J & Chui, CK 2023, Automatic Step Recognition with Video and Kinematic Data for Intelligent Operating Room and Beyond. in SOICT 2023 - 12th International Symposium on Information and Communication Technology. ACM International Conference Proceeding Series, Association for Computing Machinery, pp. 599-606, 12th International Symposium on Information and Communication Technology, SOICT 2023, Ho Chi Minh City, Viet Nam, 7/12/23. https://doi.org/10.1145/3628797.3628999

Automatic Step Recognition with Video and Kinematic Data for Intelligent Operating Room and Beyond. / Chng, Chin Boon; Lin, Wenjun; Hu, Yaxin et al.
SOICT 2023 - 12th International Symposium on Information and Communication Technology. Association for Computing Machinery, 2023. p. 599-606 (ACM International Conference Proceeding Series).


TY - GEN

T1 - Automatic Step Recognition with Video and Kinematic Data for Intelligent Operating Room and Beyond

AU - Chng, Chin Boon

AU - Lin, Wenjun

AU - Hu, Yaxin

AU - Hu, Yan

AU - Liu, Jiang

AU - Chui, Chee Kong

PY - 2023/12/7

Y1 - 2023/12/7

AB - With the continuous development of intelligent operating room systems, the segmentation and automatic recognition of surgical workflow have become challenging research fields. In recent years, an increasing number of models have been proposed to address this challenge, with deep learning becoming the mainstream approach. In this paper, we propose a multi-stage network for surgical step recognition by using surgical video and kinematic data. Firstly, a convolutional neural network (ResNet34) is used to extract visual features from video frames. Next, since surgical videos are a form of sequential data, a Temporal Convolutional Network (TCN) is employed as a temporal extractor to process temporal information between video frames for classification. Finally, a multi-stage TCN network, consisting of Encoder-Decoder TCN and Dilated TCN architectures, is used to refine the result. The proposed network is compared against an LSTM network from our prior work and is evaluated on a surgical dataset named MISAW in two modes - video data with and without kinematic data. Experimental results indicate that kinematic data is crucial for robot motion control in the operating rooms of the future. The technology will also find application in robotic labs for the development and optimization of chemical manufacturing processes.

KW - Automation

KW - Intelligent Operating Room

KW - Multi-stage Model

KW - Step Recognition

KW - Surgical Robotics

KW - Temporal Convolutional Networks

UR - http://www.scopus.com/inward/record.url?scp=85180550841&partnerID=8YFLogxK

U2 - 10.1145/3628797.3628999

DO - 10.1145/3628797.3628999

M3 - Conference contribution

AN - SCOPUS:85180550841

T3 - ACM International Conference Proceeding Series

SP - 599

EP - 606

BT - SOICT 2023 - 12th International Symposium on Information and Communication Technology

PB - Association for Computing Machinery

T2 - 12th International Symposium on Information and Communication Technology, SOICT 2023

Y2 - 7 December 2023 through 8 December 2023

ER -
