TY - GEN
T1 - Hidden Markov model for hard-drive failure detection
AU - Teoh, Teik Toe
AU - Cho, Siu Yeung
AU - Nguwi, Yok Yen
N1 - Copyright 2012 Elsevier B.V., All rights reserved.
PY - 2012
Y1 - 2012
AB - This paper illustrates the use of a Hidden Markov Model (HMM) to model hard-disk failure. We use an HMM because it provides a formal foundation for building probabilistic models of linear sequence 'labeling' problems. We use the dataset provided by the University of California, San Diego to detect hard-drive failure. We selected 24 attributes and obtained an accuracy of about 90%. We compare machine-learning methods applied to a difficult real-world problem: predicting computer hard-drive failure using attributes monitored internally by individual drives. The problem is one of detecting rare events in a time series of noisy and non-parametrically distributed data. We develop a new HMM-based algorithm that is specifically designed for the low false-alarm case and shows promising performance. Other methods compared are support vector machines (SVMs), unsupervised clustering, and non-parametric statistical tests (rank-sum and reverse arrangements). The failure-prediction performance of the SVM, rank-sum, and mi-NB algorithms is considerably better than that of the threshold method currently implemented in drives, while maintaining low false-alarm rates [13]. Our results suggest that non-parametric statistical tests should be considered for learning problems that involve detecting rare events.
KW - detection
KW - hard disk
KW - hidden Markov model
UR - http://www.scopus.com/inward/record.url?scp=84868156530&partnerID=8YFLogxK
U2 - 10.1109/ICCSE.2012.6295014
DO - 10.1109/ICCSE.2012.6295014
M3 - Conference contribution
AN - SCOPUS:84868156530
SN - 9781467302425
T3 - ICCSE 2012 - Proceedings of 2012 7th International Conference on Computer Science and Education
SP - 3
EP - 8
BT - ICCSE 2012 - Proceedings of 2012 7th International Conference on Computer Science and Education
T2 - 2012 7th International Conference on Computer Science and Education, ICCSE 2012
Y2 - 14 July 2012 through 17 July 2012
ER -