Abstract
The deep-metric-based triplet loss has been widely used to enhance the inter-class separability and intra-class compactness of network features. However, in current approaches the margin parameters of the triplet loss are usually fixed and do not adapt to the variations among different expression pairs. Meanwhile, outlier samples such as faces with confusing expressions, occlusion, or large head poses may be introduced during hard-triplet selection, which can degrade the generalization of the learned features on normal testing samples. In this work, a new triplet loss based on class-pair margins and multistage outlier suppression is proposed for facial expression recognition (FER). In this approach, each expression pair is assigned either one order-insensitive or two order-aware adaptive margin parameters. Expression samples with large head poses or occlusion are first detected and excluded; abnormal hard triplets are then discarded if their feature distances do not fit a model of the normal feature-distance distribution. Extensive experiments on seven public benchmark expression databases show that a network trained with the proposed loss achieves much better accuracy than the same network trained with the original triplet loss or without the proposed strategies, and delivers the most balanced performance among state-of-the-art algorithms in the literature.
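To make the two ideas in the abstract concrete, here is a minimal PyTorch-style sketch of a triplet loss with per-class-pair margins and a distance-based gate on abnormal hard triplets. The function name, the margin lookup table, and the z-score gate (standing in for the paper's model of the normal feature-distance distribution) are illustrative assumptions, not the authors' implementation; an order-insensitive variant would simply use a symmetric margin table.

```python
import torch
import torch.nn.functional as F

def class_pair_triplet_loss(anchor, positive, negative,
                            anchor_labels, negative_labels,
                            margins, z_thresh=3.0):
    """Triplet loss with an adaptive per-class-pair margin and a simple
    filter that discards abnormal hard triplets (illustrative sketch).

    margins: (C, C) tensor; margins[i, j] is the margin used when the
    anchor belongs to class i and the negative to class j. A symmetric
    table gives the order-insensitive variant; an asymmetric table
    gives two order-aware margins per expression pair.
    """
    d_ap = F.pairwise_distance(anchor, positive)  # anchor-positive distances
    d_an = F.pairwise_distance(anchor, negative)  # anchor-negative distances

    # Outlier suppression: drop triplets whose anchor-positive distance
    # falls far outside the batch's normal distance distribution.
    # (A z-score gate is assumed here in place of the paper's model.)
    z = (d_ap - d_ap.mean()) / (d_ap.std() + 1e-8)
    keep = z.abs() < z_thresh

    # Look up the adaptive margin for each (anchor, negative) class pair.
    m = margins[anchor_labels, negative_labels]

    loss = F.relu(d_ap - d_an + m)
    return loss[keep].mean() if keep.any() else loss.sum() * 0.0

# Hypothetical usage with 7 expression classes and 128-D features:
feats = [torch.randn(32, 128) for _ in range(3)]
labels = [torch.randint(0, 7, (32,)) for _ in range(2)]
margins = torch.full((7, 7), 0.3)  # assumed initial per-pair margins
loss = class_pair_triplet_loss(*feats, *labels, margins)
```

In the paper the margins are adaptive rather than fixed, so the table above would itself be learned or updated per expression pair during training; the sketch only shows how such a table plugs into the triplet objective.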
| Original language | English |
| --- | --- |
| Pages (from-to) | 690-703 |
| Number of pages | 14 |
| Journal | IEEE Transactions on Circuits and Systems for Video Technology |
| Volume | 32 |
| Issue number | 2 |
| DOIs | |
| Publication status | Published - 1 Feb 2022 |
| Externally published | Yes |
Keywords
- Adaptive class-pair margin
- Deep metric learning
- Facial expression recognition
- Hard triplet selection
- Multistage outlier suppression
ASJC Scopus subject areas
- Media Technology
- Electrical and Electronic Engineering