TY - GEN
T1 - Noise Invariant Frame Selection
T2 - 2018 International Joint Conference on Neural Networks, IJCNN 2018
AU - Song, Siyang
AU - Zhang, Shuimei
AU - Schuller, Bjorn W.
AU - Shen, Linlin
AU - Valstar, Michel
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/10/10
Y1 - 2018/10/10
N2 - The performance of speaker-related systems usually degrades heavily in practical applications largely due to the presence of background noise. To improve the robustness of such systems in unknown noisy environments, this paper proposes a simple pre-processing method called Noise Invariant Frame Selection (NIFS). Based on several noisy constraints, it selects noise invariant frames from utterances to represent speakers. Experiments conducted on the TIMIT database showed that the NIFS can significantly improve the performance of Vector Quantization (VQ), Gaussian Mixture Model-Universal Background Model (GMM-UBM) and i-vector-based speaker verification systems in different unknown noisy environments with different SNRs, in comparison to their baselines. Meanwhile, the proposed NIFS-based speaker verification systems achieves similar performance when we change the constraints (hyper-parameters) or features, which indicates that it is robust and easy to reproduce. Since NIFS is designed as a general algorithm, it could be further applied to other similar tasks.
AB - The performance of speaker-related systems usually degrades heavily in practical applications largely due to the presence of background noise. To improve the robustness of such systems in unknown noisy environments, this paper proposes a simple pre-processing method called Noise Invariant Frame Selection (NIFS). Based on several noisy constraints, it selects noise invariant frames from utterances to represent speakers. Experiments conducted on the TIMIT database showed that the NIFS can significantly improve the performance of Vector Quantization (VQ), Gaussian Mixture Model-Universal Background Model (GMM-UBM) and i-vector-based speaker verification systems in different unknown noisy environments with different SNRs, in comparison to their baselines. Meanwhile, the proposed NIFS-based speaker verification systems achieves similar performance when we change the constraints (hyper-parameters) or features, which indicates that it is robust and easy to reproduce. Since NIFS is designed as a general algorithm, it could be further applied to other similar tasks.
UR - http://www.scopus.com/inward/record.url?scp=85056529382&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2018.8489497
DO - 10.1109/IJCNN.2018.8489497
M3 - Conference contribution
AN - SCOPUS:85056529382
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2018 International Joint Conference on Neural Networks, IJCNN 2018 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 8 July 2018 through 13 July 2018
ER -