Data augmentation by morphological mixup for solving Raven’s progressive matrices

Research output: Journal Publication › Article › peer-review

Abstract

Raven’s progressive matrix (RPM) is a kind of visual abstract reasoning task that tests the ability to extract reasoning rules from limited samples and apply them to an unknown setting. It is frequently used to evaluate human intelligence. Recent advances in RPM-like datasets and solution models partially address the challenges of visually understanding RPM questions and logically reasoning about the missing answers. This paper tackles the poor generalization performance caused by insufficient samples in RPM datasets. To address the lack of data for precise relational reasoning in RPMs, we propose an effective scheme, namely candidate answer morphological mixup (CAM-Mix). CAM-Mix serves as a data augmentation strategy based on gray-scale image morphological mixup, which regularizes various solution methods and overcomes the model overfitting problem. Compared with existing methods, it can define a more accurate decision boundary by creating new negative candidate answers that are semantically similar to the correct answers. Experimental results show that applying the proposed data augmentation method to state-of-the-art RPM solution models provides significant and consistent performance improvements on various RPM-like datasets compared with state-of-the-art solution models and other data augmentation strategies.
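
The idea of mixing a wrong candidate with the correct answer to obtain a harder negative can be illustrated with a minimal sketch. The abstract does not state which morphological operators CAM-Mix uses, so the sketch assumes the standard gray-scale analogues of set union and intersection (pixelwise maximum and minimum). The function names `morphological_mixup` and `augment_candidates` are hypothetical, and panels are plain nested lists so the example has no dependencies. It should not be read as the authors' implementation.

```python
import random


def morphological_mixup(panel_a, panel_b, mode="union"):
    """Pixelwise max ("union") or min ("intersection") of two gray-scale panels.

    Assumed operators: the abstract does not specify which morphological
    operations CAM-Mix applies.
    """
    if mode not in ("union", "intersection"):
        raise ValueError(f"unknown mode: {mode}")
    same_shape = len(panel_a) == len(panel_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(panel_a, panel_b)
    )
    if not same_shape:
        raise ValueError("panels must have the same shape")
    op = max if mode == "union" else min
    return [
        [op(pa, pb) for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(panel_a, panel_b)
    ]


def augment_candidates(candidates, answer_idx, rng, mode="union"):
    """Hypothetical helper: swap one wrong candidate for a mix of it and the answer.

    Returns (new_candidates, replaced_idx). If the mix turns out identical to
    the correct answer it is discarded and the original candidate is kept,
    because it would no longer be a valid negative.
    """
    wrong = [i for i in range(len(candidates)) if i != answer_idx]
    j = rng.choice(wrong)
    mixed = morphological_mixup(candidates[answer_idx], candidates[j], mode)
    out = list(candidates)
    if mixed != candidates[answer_idx]:
        out[j] = mixed
    return out, j


# Usage: two 2x2 panels, the second one being the correct answer.
correct = [[0, 255], [255, 0]]
distractor = [[255, 255], [0, 0]]
new_candidates, replaced = augment_candidates(
    [distractor, correct], answer_idx=1, rng=random.Random(0)
)
```

The mixed panel shares pixels with the correct answer, which is what makes it a harder negative than an unrelated distractor. A real pipeline would operate on image tensors rather than nested lists.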

Original language: English
Pages (from-to): 2457-2470
Number of pages: 14
Journal: Visual Computer
Volume: 40
Issue number: 4
DOIs
Publication status: Published - Apr 2024

Keywords

  • Data augmentation
  • Image mixup
  • Raven’s progressive matrices
  • Visual analogical reasoning

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Computer Graphics and Computer-Aided Design
