CUHK Multimedia Laboratory (MMLab) | An Open-Source Video Object Detection & Tracking Platform
Since 2020, most people working on object detection have probably heard of the MMDetection framework. The platform introduced today is likewise a contribution of the CUHK Multimedia Laboratory. We first review MMDetection, then take a detailed look at MMTracking, the lab's unified video perception platform.
Since the release of MMDetection v1.0, the framework has been well received by users; along the way the team collected many valuable suggestions and merged code from outside contributors, and on May 6, 2020, MMDetection v2.0 was released.
By refactoring and optimizing every model component, v2.0 comprehensively improved MMDetection's speed and accuracy, reaching the best level among existing detection frameworks. Its finer-grained modular design greatly strengthened the framework's extensibility to new tasks, making it a base platform for detection-related projects. The documentation and tutorials were also polished to improve the user experience.
MMDetection implements networks and frameworks such as RPN, Fast R-CNN, Faster R-CNN, and Mask R-CNN. Let us start with a brief comparison against Detectron:
slightly higher performance
slightly faster training
slightly lower GPU memory usage
More importantly, in usability there is a generation gap between the PyTorch-based code and the Caffe2-based code: in the time it takes to successfully install Detectron, you could probably set up a dozen copies of mmdetection.
Of course, Detectron has its own clear advantages. As the first comprehensive detection codebase, backed by FAIR's brand, it also releases a fairly complete set of models. The authors are working to expand mmdetection's model zoo, but with a sizable gap in manpower and compute, that will take time.
On the three points above in more detail. First, performance: the ResNet structure in the official PyTorch model zoo differs slightly from the one used by Detectron (in mmdetection this can be selected through the backbone's style parameter), which changes how quickly models converge, so experiments were run with both variants. In general, under a 1x learning-rate schedule the Detectron-style results are higher, while under a 2x schedule the PyTorch-style structure scores higher.
On speed, the gap is large for Mask R-CNN and small for the rest. With the same settings, Detectron needs 0.89 s per iteration while mmdetection needs only 0.69 s. Fast R-CNN is the exception, running slightly slower than Detectron. Also, on the authors' own servers Detectron runs about 20% slower than its officially reported speed, presumably because FB's Big Basin servers outperform theirs.
The advantage in GPU memory is more pronounced, about 30% lower, though this is partly due to the underlying framework rather than codebase optimization alone. One result that surprised the authors: with the current codebase, a ResNet-50 Mask R-CNN fits 4 images per 12 GB GPU, a considerably smaller footprint than what they had at competition time.
MMTracking
MMDetection is a PyTorch-based deep learning object detection toolbox open-sourced by SenseTime (winner of the 2018 COCO object detection challenge) and CUHK.
In early 2021, the CUHK Multimedia Laboratory (MMLab), under OpenMMLab, contributed another new platform: MMTracking, a unified video object perception toolbox. The framework is built on PyTorch, supports single object tracking, multiple object tracking, and video object detection, and is already open source. Let's look at it in detail.
Key features:
The first unified video perception platform
MMTracking is the first open-source toolbox that unifies multiple video perception tasks, including video object detection, single object tracking, and multiple object tracking.
Modular design
MMTracking decomposes the video perception framework into different components, so customized methods can easily be built by combining different modules.
Simple, Fast and Strong
Simple: MMTracking interoperates with other OpenMMLab projects. It is built on top of MMDetection, and many methods can be used simply by modifying configuration files (a minimal inference sketch follows this list).
Fast: all operations run on GPUs, and both training and inference are faster than other implementations.
Strong: performance surpasses state-of-the-art models, and some of the models even outperform the official implementations.
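To make the unified interface concrete, here is a minimal single-video MOT inference sketch. It assumes the mmtrack.apis module (init_model and inference_mot, as used in MMTracking's demo scripts); the config, checkpoint, and video paths are hypothetical placeholders.
import cv2
from mmtrack.apis import inference_mot, init_model

# Hypothetical paths -- substitute a real config under configs/mot/ and a downloaded checkpoint.
config_file = 'configs/mot/my_mot_config.py'
checkpoint_file = 'checkpoints/my_mot_checkpoint.pth'

# Build the model from the config and load the weights onto the first GPU.
model = init_model(config_file, checkpoint_file, device='cuda:0')

# Feed frames one by one; frame_id gives the tracker the temporal order.
cap = cv2.VideoCapture('my_video.mp4')
frame_id = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = inference_mot(model, frame, frame_id=frame_id)  # dict of detection and track results
    frame_id += 1
cap.release()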
How to use:
1. Create a conda virtual environment and activate it.
conda create -n open-mmlab python=3.7 -y
conda activate open-mmlab
2. Install PyTorch and torchvision following the official instructions, e.g.,
conda install pytorch torchvision -c pytorch
Note: Make sure that your compilation CUDA version and runtime CUDA version match. You can check the supported CUDA version for precompiled packages on the PyTorch website. (A quick check snippet is given after this step's examples.)
E.g. 1 If you have CUDA 10.1 installed under /usr/local/cuda and would like to install PyTorch 1.5, you need to install the prebuilt PyTorch with CUDA 10.1.
conda install pytorch cudatoolkit=10.1 torchvision -c pytorch
E.g. 2 If you have CUDA 9.2 installed under /usr/local/cuda and would like to install PyTorch 1.3.1, you need to install the prebuilt PyTorch with CUDA 9.2.
conda install pytorch=1.3.1 cudatoolkit=9.2 torchvision=0.4.2 -c pytorch
If you build PyTorch from source instead of installing the prebuilt package, you can use more CUDA versions such as 9.0.
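Before moving on, it helps to verify that the compilation and runtime CUDA versions actually match. A minimal check, using nothing beyond PyTorch itself:
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA version this PyTorch build was compiled with
print(torch.cuda.is_available())  # True if the runtime CUDA setup can see a GPU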
3. Install mmcv-full. We recommend installing the pre-built package, as below.
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html
See here for the versions of MMCV compatible with different PyTorch and CUDA versions. Optionally, you can compile mmcv from source with the following commands:
git clone https://github.com/open-mmlab/mmcv.git
cd mmcv
MMCV_WITH_OPS=1 pip install -e . # package mmcv-full will be installed after this step
cd ..
Or directly run
pip install mmcv-full
4. Install MMDetection.
pip install mmdet
Optionally, you can also build MMDetection from source in case you want to modify the code:
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e . # or "python setup.py develop"
5. Clone the MMTracking repository.
git clone https://github.com/open-mmlab/mmtracking.git
cd mmtracking
6. Install build requirements and then install MMTracking.
pip install -r requirements/build.txt
pip install -v -e . # or "python setup.py develop"
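At this point the whole stack can be sanity-checked. A minimal sketch, assuming each package exposes the usual __version__ attribute:
import mmcv
import mmdet
import mmtrack

# If all three imports succeed and the versions print, the installation is complete.
print('mmcv:', mmcv.__version__)
print('mmdet:', mmdet.__version__)
print('mmtrack:', mmtrack.__version__)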
Testing with the platform:
This section will show how to test existing models on supported datasets. The following testing environments are supported:
single GPU
single node multiple GPU
multiple nodes
During testing, different tasks share the same API and we only support samples_per_gpu = 1.
You can use the following commands for testing (a concrete example follows the argument list below):
# single-gpu testing
python tools/test.py ${CONFIG_FILE} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${GPU_NUM} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
Optional arguments:
CHECKPOINT_FILE: Filename of the checkpoint. For some MOT methods you do not need to define it; instead, specify the checkpoints in the config.
RESULT_FILE: Filename of the output results in pickle format. If not specified, the results will not be saved to a file.
EVAL_METRICS: Items to be evaluated on the results. Allowed values depend on the dataset, e.g., bbox is available for ImageNet VID, track is available for LaSOT, bbox and track are both suitable for MOT17.
--cfg-options: If specified, the key-value pairs will be merged into the config file.
--eval-options: If specified, the key-value pairs will be passed as kwargs to the dataset.evaluate() function; this is for evaluation only.
--format-only: If specified, the results will be formatted to the official format.
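Putting the pieces together, a concrete invocation might look as follows; the config path is a hypothetical placeholder, so substitute a real file from configs/:
# single-GPU test of a MOT config, evaluating tracking metrics
python tools/test.py configs/mot/my_mot_config.py --eval track
# the same config on 8 GPUs, also saving raw results to a pickle file
./tools/dist_test.sh configs/mot/my_mot_config.py 8 --out results.pkl --eval track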
THE END