Pixels, Regions, and Objects: Multiple Enhancement for Salient Object Detection

Yi Wang; Ruili Wang; Xin Fan; Tianzhu Wang; Xiangjian He

doi:10.1109/CVPR52729.2023.00967

School of Computer Science

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

52 Citations (Scopus)

Abstract

Salient object detection (SOD) aims to mimic the human visual system (HVS) and cognition mechanisms to identify and segment salient objects. However, due to the complexity of these mechanisms, current methods are not perfect. Accuracy and robustness need to be further improved, particularly in complex scenes with multiple objects and background clutter. To address this issue, we propose a novel approach called Multiple Enhancement Network (MENet) that adopts the boundary sensibility, content integrity, iterative refinement, and frequency decomposition mechanisms of HVS. A multi-level hybrid loss is firstly designed to guide the network to learn pixel-level, region-level, and object-level features. A flexible multiscale feature enhancement module (ME-Module) is then designed to gradually aggregate and refine global or detailed features by changing the size order of the input feature sequence. An iterative training strategy is used to enhance boundary features and adaptive features in the dual-branch decoder of MENet. Comprehensive evaluations on six challenging benchmark datasets show that MENet achieves state-of-the-art results. Both the codes and results are publicly available at https://github.com/yiwangtz/MENet.
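As a rough illustration of what a loss that supervises pixels, regions, and objects at once could look like, the sketch below sums three terms computed on a predicted saliency map and its ground-truth mask. The abstract does not state which terms MENet uses, so every choice here is an assumption made for illustration: binary cross-entropy stands in for the pixel level, an L1 difference of block means for the region level, and a soft IoU for the object level. The function names, the block size k, and the weights are hypothetical and are not taken from the paper or its repository.

```python
import math

def pixel_bce(pred, gt, eps=1e-7):
    """Pixel level (assumed stand-in): mean binary cross-entropy."""
    total, n = 0.0, 0
    for prow, grow in zip(pred, gt):
        for p, g in zip(prow, grow):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(g * math.log(p) + (1.0 - g) * math.log(1.0 - p))
            n += 1
    return total / n

def block_means(img, k):
    """Average-pool a 2D list over non-overlapping k x k blocks."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(0, h - k + 1, k):
        row = []
        for j in range(0, w - k + 1, k):
            s = sum(img[i + di][j + dj] for di in range(k) for dj in range(k))
            row.append(s / (k * k))
        out.append(row)
    return out

def region_l1(pred, gt, k=2):
    """Region level (assumed stand-in): mean |difference| of block means."""
    pm, gm = block_means(pred, k), block_means(gt, k)
    diffs = [abs(p - g) for pr, gr in zip(pm, gm) for p, g in zip(pr, gr)]
    return sum(diffs) / len(diffs)

def object_soft_iou(pred, gt, eps=1e-7):
    """Object level (assumed stand-in): 1 - soft IoU over the whole map."""
    inter = sum(p * g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    total = sum(p + g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    union = total - inter
    return 1.0 - (inter + eps) / (union + eps)

def hybrid_loss(pred, gt, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three levels; the weights are hypothetical."""
    wp, wr, wo = weights
    return (wp * pixel_bce(pred, gt)
            + wr * region_l1(pred, gt)
            + wo * object_soft_iou(pred, gt))

# Toy 4x4 example: a 2x2 salient object in the top-left corner.
gt = [[1.0, 1.0, 0.0, 0.0],
      [1.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 0.0, 0.0],
      [0.0, 0.0, 0.0, 0.0]]
uncertain = [[0.5] * 4 for _ in range(4)]
print(hybrid_loss(gt, gt) < hybrid_loss(uncertain, gt))
```

In this toy case a prediction identical to the mask scores close to zero on all three terms, while a uniformly uncertain map is penalized at every level. In a real training setup the same structure would be written with differentiable tensor operations rather than Python lists.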

Original language	English
Title of host publication	Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
Publisher	IEEE Computer Society
Pages	10031-10040
Number of pages	10
ISBN (Electronic)	9798350301298
DOIs	https://doi.org/10.1109/CVPR52729.2023.00967
Publication status	Published - 2023
Event	2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023 - Vancouver, Canada
Duration	18 Jun 2023 → 22 Jun 2023

Publication series

Name	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume	2023-June
ISSN (Print)	1063-6919

Conference

Conference	2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
Country/Territory	Canada
City	Vancouver
Period	18/06/23 → 22/06/23

Keywords

Recognition: Categorization
detection
retrieval

ASJC Scopus subject areas

Software
Computer Vision and Pattern Recognition

Access to Document

10.1109/CVPR52729.2023.00967

Cite this

Wang, Y., Wang, R., Fan, X., Wang, T., & He, X. (2023). Pixels, Regions, and Objects: Multiple Enhancement for Salient Object Detection. In Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023 (pp. 10031-10040). (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2023-June). IEEE Computer Society. https://doi.org/10.1109/CVPR52729.2023.00967

@inproceedings{55a4d26d92fa4a27a40a4925bdcbfa04,
title = "Pixels, Regions, and Objects: Multiple Enhancement for Salient Object Detection",
abstract = "Salient object detection (SOD) aims to mimic the human visual system (HVS) and cognition mechanisms to identify and segment salient objects. However, due to the complexity of these mechanisms, current methods are not perfect. Accuracy and robustness need to be further improved, particularly in complex scenes with multiple objects and background clutter. To address this issue, we propose a novel approach called Multiple Enhancement Network (MENet) that adopts the boundary sensibility, content integrity, iterative refinement, and frequency decomposition mechanisms of HVS. A multi-level hybrid loss is firstly designed to guide the network to learn pixel-level, region-level, and object-level features. A flexible multiscale feature enhancement module (ME-Module) is then designed to gradually aggregate and refine global or detailed features by changing the size order of the input feature sequence. An iterative training strategy is used to enhance boundary features and adaptive features in the dual-branch decoder of MENet. Comprehensive evaluations on six challenging benchmark datasets show that MENet achieves state-of-the-art results. Both the codes and results are publicly available at https://github.com/yiwangtz/MENet.",
keywords = "Recognition: Categorization, detection, retrieval",
author = "Yi Wang and Ruili Wang and Xin Fan and Tianzhu Wang and Xiangjian He",
note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023 ; Conference date: 18-06-2023 Through 22-06-2023",
year = "2023",
doi = "10.1109/CVPR52729.2023.00967",
language = "English",
series = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",
publisher = "IEEE Computer Society",
pages = "10031--10040",
booktitle = "Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023",
address = "United States",
}

Wang, Y, Wang, R, Fan, X, Wang, T & He, X 2023, Pixels, Regions, and Objects: Multiple Enhancement for Salient Object Detection. in Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2023-June, IEEE Computer Society, pp. 10031-10040, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023, Vancouver, Canada, 18/06/23. https://doi.org/10.1109/CVPR52729.2023.00967

Pixels, Regions, and Objects: Multiple Enhancement for Salient Object Detection. / Wang, Yi; Wang, Ruili; Fan, Xin et al.
Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023. IEEE Computer Society, 2023. p. 10031-10040 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2023-June).

TY - GEN
T1 - Pixels, Regions, and Objects: Multiple Enhancement for Salient Object Detection
T2 - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
AU - Wang, Yi
AU - Wang, Ruili
AU - Fan, Xin
AU - Wang, Tianzhu
AU - He, Xiangjian
PY - 2023
Y1 - 2023
AB - Salient object detection (SOD) aims to mimic the human visual system (HVS) and cognition mechanisms to identify and segment salient objects. However, due to the complexity of these mechanisms, current methods are not perfect. Accuracy and robustness need to be further improved, particularly in complex scenes with multiple objects and background clutter. To address this issue, we propose a novel approach called Multiple Enhancement Network (MENet) that adopts the boundary sensibility, content integrity, iterative refinement, and frequency decomposition mechanisms of HVS. A multi-level hybrid loss is firstly designed to guide the network to learn pixel-level, region-level, and object-level features. A flexible multiscale feature enhancement module (ME-Module) is then designed to gradually aggregate and refine global or detailed features by changing the size order of the input feature sequence. An iterative training strategy is used to enhance boundary features and adaptive features in the dual-branch decoder of MENet. Comprehensive evaluations on six challenging benchmark datasets show that MENet achieves state-of-the-art results. Both the codes and results are publicly available at https://github.com/yiwangtz/MENet.
KW - Recognition: Categorization
KW - detection
KW - retrieval
UR - http://www.scopus.com/inward/record.url?scp=85169018976&partnerID=8YFLogxK
U2 - 10.1109/CVPR52729.2023.00967
DO - 10.1109/CVPR52729.2023.00967
M3 - Conference contribution
AN - SCOPUS:85169018976
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 10031
EP - 10040
BT - Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
PB - IEEE Computer Society
Y2 - 18 June 2023 through 22 June 2023
ER -

Wang Y, Wang R, Fan X, Wang T, He X. Pixels, Regions, and Objects: Multiple Enhancement for Salient Object Detection. In Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023. IEEE Computer Society. 2023. p. 10031-10040. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). doi: 10.1109/CVPR52729.2023.00967
