Research on Improved Mask R-CNN Network Model for Human Keypoint Detection
Song Ling (宋玲); Xia Zhimin (夏智敏)
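Abstract:
In existing deep-learning solutions to human keypoint detection, the mask region-based convolutional neural network (Mask R-CNN) has a large number of parameters, which makes computation expensive, and needs many iterations, which makes training slow. To address these problems, an improved Mask R-CNN network model based on the channel shuffle network ShuffleNet is proposed. The method introduces the ShuffleNet network structure and makes Mask R-CNN lightweight by using grouped pointwise convolution and channel shuffle operations together with the jointly computed results of bounding-box regression and mask segmentation. For human keypoint detection in single-person or multi-person scenes, the improved model runs faster and reduces detection time while preserving accuracy.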
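The two ShuffleNet operations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `channel_shuffle` and `pointwise_conv_params` are hypothetical, channels are represented as a plain Python list rather than a tensor, and the channel count 240 with 3 groups is an arbitrary example. The sketch shows how channel shuffle interleaves channels across groups, and why a grouped 1x1 (pointwise) convolution needs fewer weights than an ungrouped one.

```python
def channel_shuffle(channels, groups):
    """Reorder channels so each output group mixes channels from every input group.

    Equivalent to reshaping the channel axis to (groups, n // groups),
    transposing it, and flattening it back.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Output position j * groups + g takes input channel g * per_group + j.
    return [channels[g * per_group + j]
            for j in range(per_group)
            for g in range(groups)]


def pointwise_conv_params(c_in, c_out, groups=1):
    """Weight count of a 1x1 convolution (bias ignored) split into `groups` groups."""
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("channel counts must be divisible by groups")
    # Each group maps c_in/groups input channels to c_out/groups output channels.
    return (c_in // groups) * (c_out // groups) * groups


# Six channels in three groups: [0, 1 | 2, 3 | 4, 5]
shuffled = channel_shuffle(list(range(6)), groups=3)

# Illustrative sizes only (assumed, not taken from the paper).
dense_weights = pointwise_conv_params(240, 240, groups=1)
grouped_weights = pointwise_conv_params(240, 240, groups=3)
```

With `g` groups, the pointwise convolution uses `1/g` of the ungrouped weight count; the shuffle step is what lets information cross group boundaries in the following layer.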
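Keywords: deep learning; convolutional neural network (CNN); mask region-based convolutional neural network (Mask R-CNN); channel shuffle network; human keypoint detection
Foundation: National Natural Science Foundation of China (61762030); Guangxi Innovation-Driven Development Major Special Project (桂科AA17204017); Guangxi Key Research and Development Program (桂科AB19110050, 桂科AB16380237)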