TY - GEN
T1 - DeepText
T2 - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
AU - Wang, Qingqing
AU - Jia, Wenjing
AU - He, Xiangjian
AU - Lu, Yue
AU - Blumenstein, Michael
AU - Huang, Ye
AU - Lyu, Shujing
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/9
Y1 - 2019/9
AB - This paper addresses scene text detection by direct regression, adapting an effective semantic segmentation model, DeepLab v3+ [1], to the task. To handle text of arbitrary orientation and size and to improve the recall of small text, we extract multi-scale features by inserting multiple Atrous Spatial Pyramid Pooling (ASPP) layers into DeepLab after the feature maps of different resolutions. We then apply multiple auxiliary IoU losses at the decoding stage and add auxiliary connections from the intermediate encoding layers to the decoder, which assist network training and enhance the discriminative ability of the lower encoding layers. Experiments on the benchmark scene text dataset ICDAR2015 show that the proposed network, named DeepText, outperforms state-of-the-art approaches.
KW - Auxiliary IoU losses
KW - Auxiliary connections
KW - DeepLab
KW - Multiple ASPP
KW - Scene text detection
UR - http://www.scopus.com/inward/record.url?scp=85079890524&partnerID=8YFLogxK
U2 - 10.1109/ICDAR.2019.00042
DO - 10.1109/ICDAR.2019.00042
M3 - Conference contribution
AN - SCOPUS:85079890524
T3 - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
SP - 208
EP - 213
BT - Proceedings - 15th IAPR International Conference on Document Analysis and Recognition, ICDAR 2019
PB - IEEE Computer Society
Y2 - 20 September 2019 through 25 September 2019
ER -