Automatic Step Recognition with Video and Kinematic Data for Intelligent Operating Room and Beyond

Chin Boon Chng, Wenjun Lin, Yaxin Hu, Yan Hu, Jiang Liu, Chee Kong Chui

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review


With the continuous development of intelligent operating room systems, the segmentation and automatic recognition of surgical workflow have become challenging research fields. In recent years, an increasing number of models have been proposed to address this challenge, with deep learning becoming the mainstream approach. In this paper, we propose a multi-stage network for surgical step recognition using surgical video and kinematic data. First, a convolutional neural network (ResNet34) extracts visual features from video frames. Next, since surgical videos are a form of sequential data, a Temporal Convolutional Network (TCN) is employed as a temporal extractor to process temporal information between video frames for classification. Finally, a multi-stage TCN, consisting of Encoder-Decoder TCN and Dilated TCN architectures, refines the result. The proposed network is compared against an LSTM network from our prior work and is evaluated on a surgical dataset named MISAW in two modes: video data with kinematic data and video data without kinematic data. Experimental results indicate that kinematic data is crucial for robot motion control in the operating rooms of the future. The technology will also find application in robotic labs for the development and optimization of chemical manufacturing processes.
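The temporal stage described in the abstract relies on dilated convolutions over the frame sequence. The following is a minimal pure-Python sketch of that idea, not the authors' implementation: the function names, the centered zero-padded convolution, the kernel weights, and the doubling dilation schedule (1, 2, 4, ...) are all assumptions chosen for illustration, since the abstract does not specify the layer configuration.

```python
from typing import List


def dilated_conv1d(seq: List[float], kernel: List[float], dilation: int) -> List[float]:
    """Same-length temporal convolution with zero padding.

    Hypothetical helper: each output frame t combines input frames spaced
    `dilation` steps apart, centered on t. Frames outside the sequence
    contribute zero.
    """
    k = len(kernel)
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - k // 2) * dilation
            if 0 <= idx < len(seq):
                acc += w * seq[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, num_layers: int) -> int:
    """Frames seen by one output of a stack whose dilation doubles per layer.

    Assumes dilations 1, 2, 4, ..., 2**(num_layers - 1); the paper's actual
    schedule may differ.
    """
    rf = 1
    for layer in range(num_layers):
        rf += (kernel_size - 1) * (2 ** layer)
    return rf


# A single "spike" frame shows which neighbours each dilation reaches.
spike = [0.0, 0.0, 1.0, 0.0, 0.0]
near = dilated_conv1d(spike, [1.0, 1.0, 1.0], dilation=1)
far = dilated_conv1d(spike, [1.0, 1.0, 1.0], dilation=2)
```

Under these assumptions, stacking a few such layers lets each per-frame prediction draw on a long temporal window without pooling, which is why dilated stages are a common choice for refining frame-wise step labels.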

Original language: English
Title of host publication: SOICT 2023 - 12th International Symposium on Information and Communication Technology
Publisher: Association for Computing Machinery
Number of pages: 8
ISBN (Electronic): 9798400708916
Publication status: Published - 7 Dec 2023
Externally published: Yes
Event: 12th International Symposium on Information and Communication Technology, SOICT 2023 - Ho Chi Minh City, Viet Nam
Duration: 7 Dec 2023 to 8 Dec 2023

Publication series

Name: ACM International Conference Proceeding Series


Conference: 12th International Symposium on Information and Communication Technology, SOICT 2023
Country/Territory: Viet Nam
City: Ho Chi Minh City


  • Automation
  • Intelligent Operating Room
  • Multi-stage Model
  • Step Recognition
  • Surgical Robotics
  • Temporal Convolutional Networks

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

