Abstract
Three-dimensional (3D) instance segmentation, an effective method for scene object recognition, can substantially enhance workshop intelligence. However, existing 3D instance segmentation networks are difficult to apply in workshop scenes because they require large numbers of 3D instance segmentation labels and dedicated 3D information acquisition devices. In this paper, a monocular 3D instance segmentation network is proposed that achieves satisfactory results using only a monocular RGB camera and no 3D instance segmentation labels. The proposed method has two stages. In the first stage, a double-snake multitask network is proposed to compensate for the lack of 3D information acquisition devices: it performs depth estimation and instance segmentation simultaneously and uses features obtained from the depth estimation branch to guide the instance segmentation task. In the second stage, an adaptive point cloud filtering algorithm, which adaptively filters noise from the point clouds of multi-scale objects based on two-dimensional prior information, is proposed to address the lack of 3D labels. In addition, color information is introduced into the filtering process to further improve filtering accuracy. Experiments on the Cityscapes and SOP datasets demonstrate the competitive performance of the proposed method: with the IoU threshold set to 0.35, the mean average precision (mAP) is 50.41. The approach is also deployed in an actual production workshop to verify its feasibility.
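The second-stage adaptive filtering idea might be sketched as below: a statistical outlier filter whose rejection threshold relaxes for objects with a larger 2D footprint, since large instances back-project into sparser point clouds. This is a minimal illustrative sketch only; the neighbour count, the log-scaled tolerance, and the use of 2D bounding-box area as the prior are assumptions, not the paper's actual formulation (which also incorporates color information).

```python
import numpy as np

def adaptive_outlier_filter(points, bbox_area, k=5, base_std=1.0):
    """Remove noisy points from a per-instance point cloud.

    Hypothetical sketch of scale-adaptive statistical outlier removal:
    points whose mean distance to their k nearest neighbours exceeds an
    object-size-dependent threshold are discarded.  `bbox_area` stands in
    for the 2D prior (an assumed choice for this illustration).
    """
    # Pairwise distances; acceptable for the small per-instance clouds
    # assumed here (O(n^2) memory).
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Mean distance to the k nearest neighbours, excluding self (column 0).
    knn = np.sort(d, axis=1)[:, 1:k + 1].mean(axis=1)
    # Larger 2D footprint -> more tolerant threshold.  The log1p scaling
    # is an assumption made for this sketch.
    alpha = base_std * (1.0 + 0.1 * np.log1p(bbox_area))
    thresh = knn.mean() + alpha * knn.std()
    return points[knn <= thresh]

# Usage: a tight synthetic cluster plus one far-away noise point.
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(0.0, 0.01, size=(50, 3)),
                   [[5.0, 5.0, 5.0]]])
filtered = adaptive_outlier_filter(cloud, bbox_area=100.0)
```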
| Original language | English |
| --- | --- |
| Pages (from-to) | 3273-3289 |
| Number of pages | 17 |
| Journal | Journal of Intelligent Manufacturing |
| Volume | 35 |
| Issue number | 7 |
| DOIs | |
| Publication status | Published - Oct 2024 |
| Externally published | Yes |
Keywords
- 3D instance segmentation
- Depth estimation
- Multitask learning
- Production workshop
ASJC Scopus subject areas
- Software
- Industrial and Manufacturing Engineering
- Artificial Intelligence