csiro-funml/PPAL
Run on HPC

Background

  • PPAL only supports distributed GPU training and does not run on CPU.
  • However, PPAL is based on an old version of MMDetection, which in turn depends on old versions of OpenMMLab libraries and PyTorch (1.12).
  • These dependencies are not compatible with the Nvidia H100 GPUs and the CUDA version available on HPC.
  • Luckily, we can build a compatible PyTorch 1.12 from source.
  • This Confluence page gives a step-by-step guide to building PyTorch 1.12 from source on HPC (a rough outline of the build is sketched after this list).
  • Furthermore, to run PPAL on HPC, we need to modify the local PyTorch and MMDetection code.
  • To avoid going through this lengthy and painful process, we provide a self-contained Conda environment on HPC, which is ready to run PPAL.
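  • For reference, here is a minimal sketch of what the from-source build might look like, assuming the CUDA/cuDNN modules used below and the upstream PyTorch repository; the authoritative steps are on the Confluence page.
        # Hypothetical outline of building PyTorch v1.12 from source for H100 GPUs
        module load cuda/11.8.0
        module load cudnn/8.7.0.84-cu11
        git clone --recursive --branch v1.12.1 https://github.com/pytorch/pytorch.git
        cd pytorch
        pip install -r requirements.txt
        # H100 GPUs need the sm_90 compute capability compiled in
        export TORCH_CUDA_ARCH_LIST="9.0"
        python setup.py install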

Setup HPC environment

  • We have a shared project folder /scratch3/projects/runway_safety where the Conda environment, PPAL code and datasets are stored.
  • You can set up the environment by
        # Load Miniconda
        module load miniconda3/
        # Load Cuda
        module load cuda/11.8.0
        module load cudnn/8.7.0.84-cu11
        # Activate Conda environment
        source /scratch3/projects/runway_safety/ppal_env/bin/activate
  • If you want to deactivate the environment, you can run
        source /scratch3/projects/runway_safety/ppal_env/bin/deactivate
  • If you want to make changes to the environment, please create a local clone of the environment and make changes to the local clone.
        cp -r /scratch3/projects/runway_safety/ppal_env /path/to/local/ppal_env
        source /path/to/local/ppal_env/bin/activate
  • If you want to make changes to the PPAL code, please commit your changes under /scratch3/projects/runway_safety/git/PPAL and create a pull request to the main branch (an example workflow is sketched below).
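  • For example, a typical change workflow might look like the following (the branch name is just a placeholder):
        cd /scratch3/projects/runway_safety/git/PPAL
        git checkout -b my-feature           # hypothetical branch name
        git add <changed-files>
        git commit -m "Describe your change"
        git push origin my-feature           # then open a pull request against main on GitHub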

Run PPAL on Slurm

  • PPAL is designed for distributed training on multiple GPUs, so we use Slurm to manage the training jobs.
  • We provide a Slurm script /scratch3/projects/runway_safety/slurm_train.sh to run PPAL on Slurm. Please create a local copy of the script and modify it to run your own experiments (a sketch of a typical script is shown after this list).
  • The script can be run as follows
    sbatch /scratch3/projects/runway_safety/slurm_train.sh
  • It takes ~6 hours to finish, and the results will be saved in /scratch3/projects/runway_safety/work_dirs.
  • For reasons we have not yet identified, PPAL only runs on 2 GPUs; using more GPUs causes a runtime error.
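  • For reference, here is a minimal sketch of what slurm_train.sh might contain, assuming the shared environment described above; the resource limits and the training command are placeholders to adapt to your experiment.
        #!/bin/bash
        #SBATCH --job-name=ppal_train
        #SBATCH --nodes=1
        #SBATCH --gres=gpu:2                 # PPAL currently only runs on 2 GPUs
        #SBATCH --cpus-per-task=8
        #SBATCH --time=08:00:00              # a full run takes roughly 6 hours
        #SBATCH --output=%x_%j.out

        module load miniconda3/
        module load cuda/11.8.0
        module load cudnn/8.7.0.84-cu11
        source /scratch3/projects/runway_safety/ppal_env/bin/activate

        cd /scratch3/projects/runway_safety/git/PPAL
        # Hypothetical training command; point --config at your own experiment
        python tools/run_al_coco.py --config al_configs/coco/ppal_retinanet_coco.py --model retinanet
  • You can monitor the job with squeue -u $USER; results appear under /scratch3/projects/runway_safety/work_dirs when it finishes.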

Below is the original README from PPAL


Plug and Play Active Learning for Object Detection

PyTorch implementation of our paper: Plug and Play Active Learning for Object Detection

Requirements

  • Our codebase is built on top of MMDetection, which can be installed following the official instructions.

Usage

Installation

python setup.py install
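
The command above assumes the MMDetection prerequisites are already in place. If you are not using a pre-built environment, a rough, unpinned sketch of installing them first might look like the following (exact package versions depend on the MMDetection release PPAL targets and are not pinned here):

# Hypothetical prerequisite install before building PPAL itself
pip install -U openmim
mim install mmcv-full
python setup.py install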

Setup dataset

  • Place your dataset in the following structure (only vital files are shown). It should be easy because it is the default MMDetection data placement.
PPAL
|
`-- data
    |
    |--coco
    |   |
    |   |--train2017
    |   |--val2017
    |   `--annotations
    |      |
    |      |--instances_train2017.json
    |      `--instances_val2017.json
    `-- VOCdevkit
        |
        |--VOC2007
        |  |
        |  |--ImageSets
        |  |--JPEGImages
        |  `--Annotations
        `--VOC2012
           |--ImageSets
           |--JPEGImages
           `--Annotations
  • For convenience, we use COCO-style annotations for Pascal VOC active learning. Please download trainval_0712.json.
  • Set up active learning datasets
zsh tools/al_data/data_setup.sh /path/to/trainval_0712.json
  • The above command will set up a new Pascal VOC data folder. It will also generate three different active learning initial annotations for both datasets, where the COCO initial sets contain 2% of the original annotated images and the Pascal VOC initial sets contain 5% of the original annotated images.
  • The resulting file structure is as follows
PPAL
|
`-- data
    |
    |--coco
    |   |
    |   |--train2017
    |   |--val2017
    |   `--annotations
    |      |
    |      |--instances_train2017.json
    |      `--instances_val2017.json
    |--VOCdevkit
    |   |
    |   |--VOC2007
    |   |  |
    |   |  |--ImageSets
    |   |  |--JPEGImages
    |   |  `--Annotations
    |   `--VOC2012
    |      |--ImageSets
    |      |--JPEGImages
    |      `--Annotations
    |--VOC0712
    |  |
    |  |--images
    |  |--annotations
    |     |
    |     `--trainval_0712.json
    `--active_learning
       |
       |--coco
       |  |
       |  |--coco_2365_labeled_1.json
       |  |--coco_2365_unlabeled_1.json
       |  |--coco_2365_labeled_2.json
       |  |--coco_2365_unlabeled_2.json
       |  |--coco_2365_labeled_3.json
       |  `--coco_2365_unlabeled_3.json
       `--voc
          |
          |--voc_827_labeled_1.json
          |--voc_827_unlabeled_1.json
          |--voc_827_labeled_2.json
          |--voc_827_unlabeled_2.json
          |--voc_827_labeled_3.json
          `--voc_827_unlabeled_3.json

Run active learning

  • You can run active learning using a single command with a config file. For example, you can run COCO and Pascal VOC RetinaNet experiments by
python tools/run_al_coco.py --config al_configs/coco/ppal_retinanet_coco.py --model retinanet
python tools/run_al_voc.py --config al_configs/voc/ppal_retinanet_voc.py --model retinanet
  • Please check the config file to set up the data paths and environment settings before running the experiments.
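  • As a quick sanity check before launching, you can confirm that the paths referenced by the config exist; the keys data_root and ann_file below are standard MMDetection config fields and may differ in your config.
grep -n "data_root\|ann_file" al_configs/coco/ppal_retinanet_coco.py
ls data/coco/annotations data/active_learning/coco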

Citation

@InProceedings{yang2024ppal,
    author    = {Yang, Chenhongyi and Huang, Lichao and Crowley, Elliot J.},
    title     = {{Plug and Play Active Learning for Object Detection}},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2024}
}
