Natural Scene Text Detection Combined with Bounding Box Calibration
Fang Chengzhi (方承志); Huo Xinglong (火兴龙); Cheng Youcheng (程宥铖)
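Abstract:
To detect multi-oriented text in natural scenes, this paper proposes a deep-learning-based text detection method. When designing anchor boxes, the method strips away the orientation attribute of the anchors but retains their aspect-ratio attribute. Fewer anchors are therefore needed to cover the same range of aspect ratios, which mitigates the imbalance between positive and negative samples that arises under dense sampling. In the post-processing stage, a bounding box calibration algorithm is proposed: it uses maximally stable extremal regions (MSER) to obtain character edge information and, through rule-based logical judgments, shrinks or expands each bounding box to calibrate it. Tests and comparisons on the public ICDAR2015 dataset verify the effectiveness of the proposed bounding box calibration algorithm.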
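The abstract does not spell out the calibration rules, so the following is only a minimal sketch of the shrink-or-expand idea under stated assumptions. It assumes axis-aligned boxes (the paper targets multi-oriented text), assumes the character regions have already been extracted by an MSER detector and converted to rectangles, and uses a hypothetical membership rule: a region belongs to a text box if at least `min_overlap` of its area lies inside the box. The names `overlap_ratio`, `calibrate_box`, `min_overlap`, and `margin` are illustrative and do not come from the paper.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max); axis-aligned for simplicity (assumption).
Box = Tuple[float, float, float, float]


def overlap_ratio(region: Box, box: Box) -> float:
    """Return the fraction of the region's area that lies inside the box."""
    inter_w = max(0.0, min(region[2], box[2]) - max(region[0], box[0]))
    inter_h = max(0.0, min(region[3], box[3]) - max(region[1], box[1]))
    area = (region[2] - region[0]) * (region[3] - region[1])
    return (inter_w * inter_h) / area if area > 0 else 0.0


def calibrate_box(box: Box, regions: List[Box],
                  min_overlap: float = 0.5, margin: float = 1.0) -> Box:
    """Shrink or expand a detected box so that it tightly encloses its characters.

    Hypothetical rule: regions mostly inside the box are treated as its
    characters; each side of the box then moves to the extreme edge of those
    regions plus a small margin. A side shrinks when the box was loose and
    expands when a character sticks out of it.
    """
    members = [r for r in regions if overlap_ratio(r, box) >= min_overlap]
    if not members:
        # No character evidence: leave the detector's box unchanged.
        return box
    return (
        min(r[0] for r in members) - margin,
        min(r[1] for r in members) - margin,
        max(r[2] for r in members) + margin,
        max(r[3] for r in members) + margin,
    )


if __name__ == "__main__":
    detected = (10, 10, 100, 40)
    char_regions = [
        (20, 15, 30, 35),      # inside the box: left and top sides can shrink
        (35, 15, 45, 35),      # inside the box
        (90, 14, 104, 36),     # mostly inside, sticks out on the right: expand
        (200, 200, 210, 210),  # unrelated region elsewhere in the image
    ]
    print(calibrate_box(detected, char_regions))
```

In this sketch the left, top, and bottom sides move inward while the right side moves outward to cover the protruding character. A faithful implementation would need the paper's actual rule set and support for rotated boxes.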
Keywords: text detection; natural scene; class imbalance; bounding box calibration
Foundation: General Program of the National Natural Science Foundation of China (61271334, 61073115)