Yolo輕量級(jí)網(wǎng)絡(luò),超輕算法在各硬件可實(shí)現(xiàn)工業(yè)級(jí)檢測(cè)效果(附源代碼)
目標(biāo)檢測(cè)是現(xiàn)在最熱門(mén)的研究課題,也一直是工業(yè)界重點(diǎn)研究的對(duì)象,最近幾年內(nèi),也出現(xiàn)了各種各樣的檢測(cè)框架,所屬于YOLO系列是最經(jīng)典也是目前被大家認(rèn)可使用的檢測(cè)框架。不論是PyTorch,還是Tensorflow,又或者是Keras和Caffe,可以說(shuō)是全平臺(tái)通用。
Yolo-Fastest開(kāi)源代碼:https://github.com/dog-qiuqiu/Yolo-Fastest
1 前言&背景
目標(biāo)檢測(cè)是現(xiàn)在最熱門(mén)的研究課題,也一直是工業(yè)界重點(diǎn)研究的對(duì)象,最近幾年內(nèi),也出現(xiàn)了各種各樣的檢測(cè)框架,所屬于YOLO系列是最經(jīng)典也是目前被大家認(rèn)可使用的檢測(cè)框架。
今天說(shuō)的這個(gè)系列模型,模型非常小、目前最快的YOLO算法——大小只有1.4MB,單核每秒148幀,在一些移動(dòng)設(shè)備上部署特別容易。具體測(cè)試效果如下:
2 框架介紹
簡(jiǎn)單使用了下Yolo-Fastest,感覺(jué)不是很習(xí)慣使用了,可能好就不用darknet框架,但是上手還是比較容易,github也有簡(jiǎn)單教程:
測(cè)試Demo的方式也有:
Demo on image input
# *Note: change .data , .cfg , .weights and input image file in image_yolov3.sh for Yolo-Fastest-x1, Yolov3 and Yolov4 sh image_yolov3.sh
Demo on video input
# *Note: Use any input video and place in the data folder or use 0 in the video_yolov3.sh for webcam # *Note: change .data , .cfg , .weights and input video file in video_yolov3.sh for Yolo-Fastest-x1, Yolov3 and Yolov4 sh video_yolov3.sh
中文介紹:https://zhuanlan.zhihu.com/p/234506503
與AlexeyAB/darknet相比,此版本darknet修復(fù)了一些老架構(gòu)GPU中分組卷積推理異常耗時(shí)的問(wèn)題(例如1050ti:40ms->4ms加速10倍),強(qiáng)烈推薦使用這個(gè) 訓(xùn)練模型的倉(cāng)庫(kù)框架
Darknet CPU推理效率優(yōu)化不好,不推薦使用Darknet作為CPU端推理框架,推薦使用ncnn
自己嘗試使用了下,確實(shí)darknet不是很友好,下次我試試NCNN的效果。下面是網(wǎng)絡(luò)的參數(shù):
Input data 0 1 data -23330=4,3,320,320,3 0=320 1=320 2=3 Convolution 0_22 1 1 data 0_22_bn_leaky -23330=4,3,160,160,8 0=8 1=3 3=2 4=1 5=1 6=216 9=2 -23310=1,1.000000e-01 Convolution 1_31 1 1 0_22_bn_leaky 1_31_bn_leaky -23330=4,3,160,160,8 0=8 1=1 5=1 6=64 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 2_39 1 1 1_31_bn_leaky 2_39_bn_leaky -23330=4,3,160,160,8 0=8 1=3 4=1 5=1 6=72 7=8 9=2 -23310=1,1.000000e-01 Convolution 3_48 1 1 2_39_bn_leaky 3_48_bn -23330=4,3,160,160,4 0=4 1=1 5=1 6=32 Split 3_48_bn_split 1 2 3_48_bn 3_48_bn_split_0 3_48_bn_split_1 -23330=8,3,160,160,4,3,160,160,4 Convolution 4_57 1 1 3_48_bn_split_0 4_57_bn_leaky -23330=4,3,160,160,8 0=8 1=1 5=1 6=32 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 5_65 1 1 4_57_bn_leaky 5_65_bn_leaky -23330=4,3,160,160,8 0=8 1=3 4=1 5=1 6=72 7=8 9=2 -23310=1,1.000000e-01 Convolution 6_74 1 1 5_65_bn_leaky 6_74_bn -23330=4,3,160,160,4 0=4 1=1 5=1 6=32 Eltwise 8_86 2 1 6_74_bn 3_48_bn_split_1 8_86 -23330=4,3,160,160,4 0=1 Convolution 9_90 1 1 8_86 9_90_bn_leaky -23330=4,3,160,160,24 0=24 1=1 5=1 6=96 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 10_98 1 1 9_90_bn_leaky 10_98_bn_leaky -23330=4,3,80,80,24 0=24 1=3 3=2 4=1 5=1 6=216 7=24 9=2 -23310=1,1.000000e-01 Convolution 11_107 1 1 10_98_bn_leaky 11_107_bn -23330=4,3,80,80,8 0=8 1=1 5=1 6=192 Split 11_107_bn_split 1 2 11_107_bn 11_107_bn_split_0 11_107_bn_split_1 -23330=8,3,80,80,8,3,80,80,8 Convolution 12_116 1 1 11_107_bn_split_0 12_116_bn_leaky -23330=4,3,80,80,32 0=32 1=1 5=1 6=256 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 13_124 1 1 12_116_bn_leaky 13_124_bn_leaky -23330=4,3,80,80,32 0=32 1=3 4=1 5=1 6=288 7=32 9=2 -23310=1,1.000000e-01 Convolution 14_133 1 1 13_124_bn_leaky 14_133_bn -23330=4,3,80,80,8 0=8 1=1 5=1 6=256 Eltwise 16_145 2 1 14_133_bn 11_107_bn_split_1 16_145 -23330=4,3,80,80,8 0=1 Split 16_145_split 1 2 16_145 16_145_split_0 16_145_split_1 -23330=8,3,80,80,8,3,80,80,8 Convolution 17_149 1 1 16_145_split_0 17_149_bn_leaky -23330=4,3,80,80,32 0=32 1=1 5=1 6=256 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 18_157 1 1 17_149_bn_leaky 18_157_bn_leaky -23330=4,3,80,80,32 0=32 1=3 4=1 5=1 6=288 7=32 9=2 -23310=1,1.000000e-01 Convolution 19_166 1 1 18_157_bn_leaky 19_166_bn -23330=4,3,80,80,8 0=8 1=1 5=1 6=256 Eltwise 21_179 2 1 19_166_bn 16_145_split_1 21_179 -23330=4,3,80,80,8 0=1 Convolution 22_183 1 1 21_179 22_183_bn_leaky -23330=4,3,80,80,32 0=32 1=1 5=1 6=256 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 23_191 1 1 22_183_bn_leaky 23_191_bn_leaky -23330=4,3,40,40,32 0=32 1=3 3=2 4=1 5=1 6=288 7=32 9=2 -23310=1,1.000000e-01 Convolution 24_200 1 1 23_191_bn_leaky 24_200_bn -23330=4,3,40,40,8 0=8 1=1 5=1 6=256 Split 24_200_bn_split 1 2 24_200_bn 24_200_bn_split_0 24_200_bn_split_1 -23330=8,3,40,40,8,3,40,40,8 Convolution 25_209 1 1 24_200_bn_split_0 25_209_bn_leaky -23330=4,3,40,40,48 0=48 1=1 5=1 6=384 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 26_217 1 1 25_209_bn_leaky 26_217_bn_leaky -23330=4,3,40,40,48 0=48 1=3 4=1 5=1 6=432 7=48 9=2 -23310=1,1.000000e-01 Convolution 27_226 1 1 26_217_bn_leaky 27_226_bn -23330=4,3,40,40,8 0=8 1=1 5=1 6=384 Eltwise 29_238 2 1 27_226_bn 24_200_bn_split_1 29_238 -23330=4,3,40,40,8 0=1 Split 29_238_split 1 2 29_238 29_238_split_0 29_238_split_1 -23330=8,3,40,40,8,3,40,40,8 Convolution 30_242 1 1 29_238_split_0 30_242_bn_leaky -23330=4,3,40,40,48 0=48 1=1 5=1 6=384 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 31_250 1 1 30_242_bn_leaky 31_250_bn_leaky -23330=4,3,40,40,48 0=48 1=3 4=1 5=1 6=432 7=48 9=2 -23310=1,1.000000e-01 Convolution 32_259 1 1 31_250_bn_leaky 32_259_bn -23330=4,3,40,40,8 0=8 1=1 5=1 6=384 Eltwise 34_273 2 1 32_259_bn 29_238_split_1 34_273 -23330=4,3,40,40,8 0=1 Convolution 35_277 1 1 34_273 35_277_bn_leaky -23330=4,3,40,40,48 0=48 1=1 5=1 6=384 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 36_285 1 1 35_277_bn_leaky 36_285_bn_leaky -23330=4,3,40,40,48 0=48 1=3 4=1 5=1 6=432 7=48 9=2 -23310=1,1.000000e-01 Convolution 37_294 1 1 36_285_bn_leaky 37_294_bn -23330=4,3,40,40,16 0=16 1=1 5=1 6=768 Split 37_294_bn_split 1 2 37_294_bn 37_294_bn_split_0 37_294_bn_split_1 -23330=8,3,40,40,16,3,40,40,16 Convolution 38_303 1 1 37_294_bn_split_0 38_303_bn_leaky -23330=4,3,40,40,96 0=96 1=1 5=1 6=1536 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 39_311 1 1 38_303_bn_leaky 39_311_bn_leaky -23330=4,3,40,40,96 0=96 1=3 4=1 5=1 6=864 7=96 9=2 -23310=1,1.000000e-01 Convolution 40_320 1 1 39_311_bn_leaky 40_320_bn -23330=4,3,40,40,16 0=16 1=1 5=1 6=1536 Eltwise 42_332 2 1 40_320_bn 37_294_bn_split_1 42_332 -23330=4,3,40,40,16 0=1 Split 42_332_split 1 2 42_332 42_332_split_0 42_332_split_1 -23330=8,3,40,40,16,3,40,40,16 Convolution 43_336 1 1 42_332_split_0 43_336_bn_leaky -23330=4,3,40,40,96 0=96 1=1 5=1 6=1536 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 44_344 1 1 43_336_bn_leaky 44_344_bn_leaky -23330=4,3,40,40,96 0=96 1=3 4=1 5=1 6=864 7=96 9=2 -23310=1,1.000000e-01 Convolution 45_353 1 1 44_344_bn_leaky 45_353_bn -23330=4,3,40,40,16 0=16 1=1 5=1 6=1536 Eltwise 47_365 2 1 45_353_bn 42_332_split_1 47_365 -23330=4,3,40,40,16 0=1 Split 47_365_split 1 2 47_365 47_365_split_0 47_365_split_1 -23330=8,3,40,40,16,3,40,40,16 Convolution 48_369 1 1 47_365_split_0 48_369_bn_leaky -23330=4,3,40,40,96 0=96 1=1 5=1 6=1536 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 49_377 1 1 48_369_bn_leaky 49_377_bn_leaky -23330=4,3,40,40,96 0=96 1=3 4=1 5=1 6=864 7=96 9=2 -23310=1,1.000000e-01 Convolution 50_386 1 1 49_377_bn_leaky 50_386_bn -23330=4,3,40,40,16 0=16 1=1 5=1 6=1536 Eltwise 52_399 2 1 50_386_bn 47_365_split_1 52_399 -23330=4,3,40,40,16 0=1 Split 52_399_split 1 2 52_399 52_399_split_0 52_399_split_1 -23330=8,3,40,40,16,3,40,40,16 Convolution 53_403 1 1 52_399_split_0 53_403_bn_leaky -23330=4,3,40,40,96 0=96 1=1 5=1 6=1536 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 54_411 1 1 53_403_bn_leaky 54_411_bn_leaky -23330=4,3,40,40,96 0=96 1=3 4=1 5=1 6=864 7=96 9=2 -23310=1,1.000000e-01 Convolution 55_420 1 1 54_411_bn_leaky 55_420_bn -23330=4,3,40,40,16 0=16 1=1 5=1 6=1536 Eltwise 57_433 2 1 55_420_bn 52_399_split_1 57_433 -23330=4,3,40,40,16 0=1 Convolution 58_437 1 1 57_433 58_437_bn_leaky -23330=4,3,40,40,96 0=96 1=1 5=1 6=1536 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 59_445 1 1 58_437_bn_leaky 59_445_bn_leaky -23330=4,3,20,20,96 0=96 1=3 3=2 4=1 5=1 6=864 7=96 9=2 -23310=1,1.000000e-01 Convolution 60_454 1 1 59_445_bn_leaky 60_454_bn -23330=4,3,20,20,24 0=24 1=1 5=1 6=2304 Split 60_454_bn_split 1 2 60_454_bn 60_454_bn_split_0 60_454_bn_split_1 -23330=8,3,20,20,24,3,20,20,24 Convolution 61_463 1 1 60_454_bn_split_0 61_463_bn_leaky -23330=4,3,20,20,136 0=136 1=1 5=1 6=3264 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 62_471 1 1 61_463_bn_leaky 62_471_bn_leaky -23330=4,3,20,20,136 0=136 1=3 4=1 5=1 6=1224 7=136 9=2 -23310=1,1.000000e-01 Convolution 63_480 1 1 62_471_bn_leaky 63_480_bn -23330=4,3,20,20,24 0=24 1=1 5=1 6=3264 Eltwise 65_492 2 1 63_480_bn 60_454_bn_split_1 65_492 -23330=4,3,20,20,24 0=1 Split 65_492_split 1 2 65_492 65_492_split_0 65_492_split_1 -23330=8,3,20,20,24,3,20,20,24 Convolution 66_496 1 1 65_492_split_0 66_496_bn_leaky -23330=4,3,20,20,136 0=136 1=1 5=1 6=3264 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 67_504 1 1 66_496_bn_leaky 67_504_bn_leaky -23330=4,3,20,20,136 0=136 1=3 4=1 5=1 6=1224 7=136 9=2 -23310=1,1.000000e-01 Convolution 68_513 1 1 67_504_bn_leaky 68_513_bn -23330=4,3,20,20,24 0=24 1=1 5=1 6=3264 Eltwise 70_526 2 1 68_513_bn 65_492_split_1 70_526 -23330=4,3,20,20,24 0=1 Split 70_526_split 1 2 70_526 70_526_split_0 70_526_split_1 -23330=8,3,20,20,24,3,20,20,24 Convolution 71_530 1 1 70_526_split_0 71_530_bn_leaky -23330=4,3,20,20,136 0=136 1=1 5=1 6=3264 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 72_538 1 1 71_530_bn_leaky 72_538_bn_leaky -23330=4,3,20,20,136 0=136 1=3 4=1 5=1 6=1224 7=136 9=2 -23310=1,1.000000e-01 Convolution 73_547 1 1 72_538_bn_leaky 73_547_bn -23330=4,3,20,20,24 0=24 1=1 5=1 6=3264 Eltwise 75_559 2 1 73_547_bn 70_526_split_1 75_559 -23330=4,3,20,20,24 0=1 Split 75_559_split 1 2 75_559 75_559_split_0 75_559_split_1 -23330=8,3,20,20,24,3,20,20,24 Convolution 76_563 1 1 75_559_split_0 76_563_bn_leaky -23330=4,3,20,20,136 0=136 1=1 5=1 6=3264 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 77_571 1 1 76_563_bn_leaky 77_571_bn_leaky -23330=4,3,20,20,136 0=136 1=3 4=1 5=1 6=1224 7=136 9=2 -23310=1,1.000000e-01 Convolution 78_580 1 1 77_571_bn_leaky 78_580_bn -23330=4,3,20,20,24 0=24 1=1 5=1 6=3264 Eltwise 80_593 2 1 78_580_bn 75_559_split_1 80_593 -23330=4,3,20,20,24 0=1 Split 80_593_split 1 2 80_593 80_593_split_0 80_593_split_1 -23330=8,3,20,20,24,3,20,20,24 Convolution 81_597 1 1 80_593_split_0 81_597_bn_leaky -23330=4,3,20,20,136 0=136 1=1 5=1 6=3264 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 82_605 1 1 81_597_bn_leaky 82_605_bn_leaky -23330=4,3,10,10,136 0=136 1=3 3=2 4=1 5=1 6=1224 7=136 9=2 -23310=1,1.000000e-01 Convolution 83_615 1 1 82_605_bn_leaky 83_615_bn -23330=4,3,10,10,48 0=48 1=1 5=1 6=6528 Split 83_615_bn_split 1 2 83_615_bn 83_615_bn_split_0 83_615_bn_split_1 -23330=8,3,10,10,48,3,10,10,48 Convolution 84_624 1 1 83_615_bn_split_0 84_624_bn_leaky -23330=4,3,10,10,224 0=224 1=1 5=1 6=10752 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 85_632 1 1 84_624_bn_leaky 85_632_bn_leaky -23330=4,3,10,10,224 0=224 1=3 4=1 5=1 6=2016 7=224 9=2 -23310=1,1.000000e-01 Convolution 86_641 1 1 85_632_bn_leaky 86_641_bn -23330=4,3,10,10,48 0=48 1=1 5=1 6=10752 Eltwise 88_653 2 1 86_641_bn 83_615_bn_split_1 88_653 -23330=4,3,10,10,48 0=1 Split 88_653_split 1 2 88_653 88_653_split_0 88_653_split_1 -23330=8,3,10,10,48,3,10,10,48 Convolution 89_657 1 1 88_653_split_0 89_657_bn_leaky -23330=4,3,10,10,224 0=224 1=1 5=1 6=10752 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 90_665 1 1 89_657_bn_leaky 90_665_bn_leaky -23330=4,3,10,10,224 0=224 1=3 4=1 5=1 6=2016 7=224 9=2 -23310=1,1.000000e-01 Convolution 91_674 1 1 90_665_bn_leaky 91_674_bn -23330=4,3,10,10,48 0=48 1=1 5=1 6=10752 Eltwise 93_686 2 1 91_674_bn 88_653_split_1 93_686 -23330=4,3,10,10,48 0=1 Split 93_686_split 1 2 93_686 93_686_split_0 93_686_split_1 -23330=8,3,10,10,48,3,10,10,48 Convolution 94_690 1 1 93_686_split_0 94_690_bn_leaky -23330=4,3,10,10,224 0=224 1=1 5=1 6=10752 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 95_698 1 1 94_690_bn_leaky 95_698_bn_leaky -23330=4,3,10,10,224 0=224 1=3 4=1 5=1 6=2016 7=224 9=2 -23310=1,1.000000e-01 Convolution 96_707 1 1 95_698_bn_leaky 96_707_bn -23330=4,3,10,10,48 0=48 1=1 5=1 6=10752 Eltwise 98_719 2 1 96_707_bn 93_686_split_1 98_719 -23330=4,3,10,10,48 0=1 Split 98_719_split 1 2 98_719 98_719_split_0 98_719_split_1 -23330=8,3,10,10,48,3,10,10,48 Convolution 99_723 1 1 98_719_split_0 99_723_bn_leaky -23330=4,3,10,10,224 0=224 1=1 5=1 6=10752 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 100_731 1 1 99_723_bn_leaky 100_731_bn_leaky -23330=4,3,10,10,224 0=224 1=3 4=1 5=1 6=2016 7=224 9=2 -23310=1,1.000000e-01 Convolution 101_740 1 1 100_731_bn_leaky 101_740_bn -23330=4,3,10,10,48 0=48 1=1 5=1 6=10752 Eltwise 103_752 2 1 101_740_bn 98_719_split_1 103_752 -23330=4,3,10,10,48 0=1 Split 103_752_split 1 2 103_752 103_752_split_0 103_752_split_1 -23330=8,3,10,10,48,3,10,10,48 Convolution 104_756 1 1 103_752_split_0 104_756_bn_leaky -23330=4,3,10,10,224 0=224 1=1 5=1 6=10752 9=2 -23310=1,1.000000e-01 ConvolutionDepthWise 105_764 1 1 104_756_bn_leaky 105_764_bn_leaky -23330=4,3,10,10,224 0=224 1=3 4=1 5=1 6=2016 7=224 9=2 -23310=1,1.000000e-01 Convolution 106_773 1 1 105_764_bn_leaky 106_773_bn -23330=4,3,10,10,48 0=48 1=1 5=1 6=10752 Eltwise 108_784 2 1 106_773_bn 103_752_split_1 108_784 -23330=4,3,10,10,48 0=1 Split 108_784_split 1 4 108_784 108_784_split_0 108_784_split_1 108_784_split_2 108_784_split_3 -23330=16,3,10,10,48,3,10,10,48,3,10,10,48,3,10,10,48 Pooling 109_788 1 1 108_784_split_0 109_788 -23330=4,3,10,10,48 1=3 3=1 5=1 Pooling 111_795 1 1 108_784_split_1 111_795 -23330=4,3,10,10,48 1=5 3=2 5=1 Pooling 113_802 1 1 108_784_split_2 113_802 -23330=4,3,10,10,48 1=9 3=4 5=1 Concat 114_806 4 1 113_802 111_795 109_788 108_784_split_3 114_806 -23330=4,3,10,10,192 Convolution 115_811 1 1 114_806 115_811_bn_leaky -23330=4,3,10,10,96 0=96 1=1 5=1 6=18432 9=2 -23310=1,1.000000e-01 Split 115_811_bn_leaky_split 1 2 115_811_bn_leaky 115_811_bn_leaky_split_0 115_811_bn_leaky_split_1 -23330=8,3,10,10,96,3,10,10,96 ConvolutionDepthWise 116_819 1 1 115_811_bn_leaky_split_0 116_819_bn_leaky -23330=4,3,10,10,96 0=96 1=5 4=2 5=1 6=2400 7=96 9=2 -23310=1,1.000000e-01 Convolution 117_828 1 1 116_819_bn_leaky 117_828_bn -23330=4,3,10,10,96 0=96 1=1 5=1 6=9216 ConvolutionDepthWise 118_836 1 1 117_828_bn 118_836_bn_leaky -23330=4,3,10,10,96 0=96 1=5 4=2 5=1 6=2400 7=96 9=2 -23310=1,1.000000e-01 Convolution 119_845 1 1 118_836_bn_leaky 119_845_bn -23330=4,3,10,10,96 0=96 1=1 5=1 6=9216 Convolution 120_854 1 1 119_845_bn 120_854 -23330=4,3,10,10,18 0=18 1=1 5=1 6=1728 Interp 123_882 1 1 115_811_bn_leaky_split_1 123_882 -23330=4,3,20,20,96 0=1 1=2.000000e+00 2=2.000000e+00 Concat 124_885 2 1 123_882 80_593_split_1 124_885 -23330=4,3,20,20,120 ConvolutionDepthWise 125_888 1 1 124_885 125_888_bn_leaky -23330=4,3,20,20,120 0=120 1=5 4=2 5=1 6=3000 7=120 9=2 -23310=1,1.000000e-01 Convolution 126_897 1 1 125_888_bn_leaky 126_897_bn -23330=4,3,20,20,120 0=120 1=1 5=1 6=14400 ConvolutionDepthWise 127_905 1 1 126_897_bn 127_905_bn_leaky -23330=4,3,20,20,120 0=120 1=5 4=2 5=1 6=3000 7=120 9=2 -23310=1,1.000000e-01 Convolution 128_914 1 1 127_905_bn_leaky 128_914_bn -23330=4,3,20,20,120 0=120 1=1 5=1 6=14400 Convolution 129_922 1 1 128_914_bn 129_922 -23330=4,3,20,20,18 0=18 1=1 5=1 6=2160 Yolov3DetectionOutput detection_out 2 1 120_854 129_922 output -23330=4,2,6,631,1 0=1 1=3 2=6.500000e-01 -23304=12,7.000000e+00,1.700000e+01,2.000000e+01,5.000000e+01,4.500000e+01,9.900000e+01,6.400000e+01,1.870000e+02,1.230000e+02,2.110000e+02,2.270000e+02,2.640000e+02 -23305=6,1077936128,1082130432,1084227584,0,1065353216,1073741824 -23306=2,3.200000e+01,1.600000e+01
檢測(cè)效果杠杠的,而且與其他網(wǎng)絡(luò)做了對(duì)比,結(jié)果如下:
3 升級(jí)版:Yolo-FastestV2
Yolo-Fastest注重的就是單核的實(shí)時(shí)推理性能,在滿(mǎn)足實(shí)時(shí)的條件下的低CPU占用,不單單只是能在手機(jī)移動(dòng)端達(dá)到實(shí)時(shí),還要在RK3399,樹(shù)莓派4以及多種Cortex-A53低成本低功耗設(shè)備上滿(mǎn)足一定實(shí)時(shí)性,畢竟這些嵌入式的設(shè)備相比與移動(dòng)端手機(jī)要弱很多,但是使用更加廣泛,成本更加低廉。
總結(jié)下新框架的特性:
· 簡(jiǎn)單、快速、緊湊、易于移植
· 資源占用少,單核性能優(yōu)異,功耗更低
· 更快更?。阂?.3%的精度損失換取30%的推理速度提升,減少25%的參數(shù)量
· 訓(xùn)練速度快,算力要求低,訓(xùn)練只需要3GB顯存,gtx1660ti訓(xùn)練COCO 1 epoch僅需4分鐘
主要摘自于《https://zhuanlan.zhihu.com/p/400474142 》
首先模型的backbone替換為了shufflenetV2,相比原先的backbone,訪存減少了一些,更加輕量,其次Anchor的匹配機(jī)制,參考的YOLOV5;其次是檢測(cè)頭的解耦合,這個(gè)也是參考YoloX的,將檢測(cè)框的回歸,前景背景的分類(lèi)以及檢測(cè)類(lèi)別的分類(lèi)有yolo的一個(gè)特征圖解耦成3個(gè)不同的特征圖,其中前景背景的分類(lèi)以及檢測(cè)類(lèi)別的分類(lèi)采用同一網(wǎng)絡(luò)分支參數(shù)共享。最后將檢測(cè)類(lèi)別分類(lèi)的loss由sigmoid替換為softmax。
Yolo-FastestV2還是只有輸出11x11和22x22兩個(gè)尺度的檢測(cè)頭,因?yàn)榘l(fā)現(xiàn)在coco上三個(gè)檢測(cè)頭(11x11,22x22,44x44)和兩個(gè)檢測(cè)頭(11x11,22x22)的精度無(wú)太大差異,個(gè)人感覺(jué)原因如下:
backbone對(duì)應(yīng)44x44分辨率的特征圖太少
正負(fù)anchor的嚴(yán)重不平衡
小物體屬于難樣本對(duì)于模型學(xué)習(xí)能力要求高
大家還是關(guān)心最終的實(shí)驗(yàn)結(jié)果:
測(cè)試平臺(tái):Mate 30 Kirin 990 CPU,NCNN
與yolox和nanoDet的對(duì)比,精度肯定比不過(guò), 不過(guò)速度會(huì)****倍,那體積只有 1.3M 的 PP-YOLO Tiny,用int8的量化后體積和yolo-fastest的fp32的體積比,YOLO-FastestV2 int8可是僅僅只有250kb,雖然沒(méi)跑過(guò)PP-YOLO Tiny,但是應(yīng)該還是比他快。所以,模型的選擇還是看大家需求。RK3399和樹(shù)莓派4搭配ncnn bf16s,YOLO-FastestV2 是可以實(shí)時(shí)的。
*博客內(nèi)容為網(wǎng)友個(gè)人發(fā)布,僅代表博主個(gè)人觀點(diǎn),如有侵權(quán)請(qǐng)聯(lián)系工作人員刪除。