Abstract
Hand gestures provide a means for humans to interact through a series of movements. Besides playing a significant role in human–computer interaction, hand gesture recognition also breaks down the communication barrier and simplifies the communication process between the general public and the hearing-impaired community. This paper outlines a convolutional neural network (CNN) integrated with spatial pyramid pooling (SPP), dubbed CNN–SPP, for vision-based hand gesture recognition. SPP mitigates a limitation of conventional pooling by stacking multiple pooling levels together to enrich the features fed into the fully connected layer; given inputs of varying sizes, it also yields a fixed-length feature representation. Extensive experiments were conducted to scrutinize CNN–SPP's performance on two well-known American Sign Language (ASL) datasets and the NUS hand gesture dataset. Our empirical results show that CNN–SPP outperforms other deep learning-driven approaches.
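The abstract's key property, that SPP produces a fixed-length vector from feature maps of varying spatial size, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; it assumes a generic pyramid of max-pooled grids (levels 1×1, 2×2, 4×4 are an arbitrary choice) applied to an `(H, W, C)` feature map:

```python
import numpy as np

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (H, W, C) feature map over an n x n grid for each
    pyramid level and concatenate the results, giving a vector of
    length C * sum(n * n for n in levels) regardless of H and W."""
    h, w, c = feature_map.shape
    pooled = []
    for n in levels:
        # Bin edges that cover the whole map even when H or W
        # is not divisible by n.
        ys = np.linspace(0, h, n + 1, dtype=int)
        xs = np.linspace(0, w, n + 1, dtype=int)
        for i in range(n):
            for j in range(n):
                region = feature_map[ys[i]:ys[i + 1], xs[j]:xs[j + 1], :]
                pooled.append(region.max(axis=(0, 1)))
    return np.concatenate(pooled)

# Inputs of different spatial sizes map to the same output length:
a = spatial_pyramid_pool(np.random.rand(13, 9, 64))
b = spatial_pyramid_pool(np.random.rand(32, 32, 64))
assert a.shape == b.shape == (64 * (1 + 4 + 16),)  # 1344 features each
```

Because the output length depends only on the channel count and the pyramid levels, the vector can feed a fully connected layer without resizing or cropping the input image.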
| Original language | English |
|---|---|
| Pages (from-to) | 5339-5351 |
| Number of pages | 13 |
| Journal | Neural Computing and Applications |
| Volume | 33 |
| Issue number | 10 |
| DOIs | |
| Publication status | Published - May 2021 |
| Externally published | Yes |
Keywords
- Convolutional neural network (CNN)
- Hand gesture recognition
- Sign language recognition
- Spatial pyramid pooling (SPP)
ASJC Scopus subject areas
- Software
- Artificial Intelligence