03. Train classifier or detector with HPO using GluonCV Auto task

The previous image classification example shows the basic usage of train/evaluate/predict with estimators provided by gluoncv.auto.estimators. Similarly, you can train object detectors using SSDEstimator, YOLOv3Estimator, CenterNetEstimator, or FasterRCNNEstimator.
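
As a refresher, a single estimator can also be trained directly without any search. The sketch below assumes the config-dict interface from the previous tutorial and the train/val splits created in the next step; the nested config keys are illustrative and may differ between GluonCV versions.

from gluoncv.auto.estimators import SSDEstimator

# Minimal sketch: train one SSD detector directly, no hyper-parameter search.
# `train` and `val` are the dataset splits created in the next step; the
# nested config keys here are illustrative and may vary across versions.
ssd = SSDEstimator({'train': {'epochs': 2, 'batch_size': 8}})
ssd.fit(train, val)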

In this tutorial, we move a step further, into the hyper-parameter tuning space! We will show you how to offload an experiment to the backend HPO searcher, which can produce better results when abundant computational resources are available.

HPO with GluonCV auto tasks

from gluoncv.auto.tasks.image_classification import ImageClassification
from gluoncv.auto.tasks.object_detection import ObjectDetection
import autogluon.core as ag

In this tutorial, we use a small sample dataset for object detection. For image classification, please refer to 02. Train Image Classification with Auto Estimator.

train = ObjectDetection.Dataset.from_voc(
    'https://autogluon.s3.amazonaws.com/datasets/tiny_motorbike.zip')
train, val, test = train.random_split(val_size=0.1, test_size=0.1)

Out:

tiny_motorbike/
├── Annotations/
├── ImageSets/
└── JPEGImages/
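
The same loader also works for your own dataset in VOC layout, i.e. the Annotations/ImageSets/JPEGImages structure printed above. A minimal sketch; the local path below is a placeholder:

# Hypothetical local path, shown only for illustration.
my_train = ObjectDetection.Dataset.from_voc('./my_voc_dataset')
print(my_train.head())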

Define search space

We show a minimal example of running HPO for object detection. For image classification, the code change is negligible: just swap the dataset and replace ObjectDetection with ImageClassification.

We use a very conservative search space to reduce CI runtime, and we cap the number of trials at 1 to reduce build time. Feel free to adjust num_trials, time_limits, and epochs, and to tune other search spaces with ag.Categorical, ag.Int, and ag.Real in order to achieve better results.

time_limits = 60 * 60  # 1hr
search_args = {'lr': ag.Categorical(1e-3, 1e-2),
               'num_trials': 1,
               'epochs': 2,
               'num_workers': 16,
               'batch_size': ag.Categorical(4, 8),
               'ngpus_per_trial': 1,
               'search_strategy': 'random',
               'time_limits': time_limits}
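
When more compute is available, the same dictionary can carry wider search spaces. Here is a sketch using ag.Real, ag.Int, and ag.Categorical; the ranges below are illustrative examples, not tuned recommendations:

# Illustrative wider search space; values are examples, not recommendations.
wider_search_args = {'lr': ag.Real(1e-4, 1e-2, log=True),   # log-uniform learning rate
                     'epochs': ag.Int(5, 20),               # search over training length
                     'batch_size': ag.Categorical(4, 8, 16),
                     'num_trials': 8,                        # let the searcher try more configs
                     'ngpus_per_trial': 1,
                     'search_strategy': 'random',
                     'time_limits': time_limits}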

Construct an object detection task based on the config.

task = ObjectDetection(search_args)

Automatically fit a model.

detector = task.fit(train, val)

Out:

/usr/local/lib/python3.6/dist-packages/mxnet/gluon/block.py:1512: UserWarning: Cannot decide type for the following arguments. Consider providing them as input:
        data: None
  input_sym_arg_type = in_param.infer_type()[0]
Downloading /root/.mxnet/models/ssd_512_resnet50_v1_coco-c4835162.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/ssd_512_resnet50_v1_coco-c4835162.zip...

181189KB [00:03, 55855.12KB/s]
/usr/local/lib/python3.6/dist-packages/mxnet/gluon/block.py:1512: UserWarning: Cannot decide type for the following arguments. Consider providing them as input:
        data: None
  input_sym_arg_type = in_param.infer_type()[0]

Evaluate the final model on the test set.

test_map = detector.evaluate(test)
print("mAP on test dataset:")
for category, score in zip(*test_map):
    print(category, score)

Out:

mAP on test dataset:
dog 0.33333333333333326
bus 0.7727272727272727
pottedplant 0.0
boat 1.0000000000000002
car 0.6219794152569272
bicycle 0.2922077922077922
chair 0.0
motorbike 0.7450342739643517
cow 0.6363636363636365
person 0.510118200657477
mAP 0.49117639245107914
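
If you prefer a table over the raw loop above, the (names, scores) pair returned by evaluate can be dropped into a DataFrame. A presentation-only sketch, assuming test_map is the same two-sequence tuple iterated above:

import pandas as pd

# Presentation only: tabulate the per-class AP values returned by evaluate().
ap_table = pd.DataFrame({'class': test_map[0], 'AP': test_map[1]})
print(ap_table.sort_values('AP', ascending=False).to_string(index=False))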

Save our final model.

detector.save('detector.pkl')

Load the model and run some predictions.

detector2 = ObjectDetection.load('detector.pkl')
pred = detector2.predict(test.iloc[0]['image'])
print(pred)

Out:

/usr/local/lib/python3.6/dist-packages/mxnet/gluon/block.py:1512: UserWarning: Cannot decide type for the following arguments. Consider providing them as input:
        data: None
  input_sym_arg_type = in_param.infer_type()[0]
   predict_class  ...                                       predict_rois
0      motorbike  ...  {'xmin': 0.2201734036207199, 'ymin': 0.4476159...
1         person  ...  {'xmin': 0.7222769856452942, 'ymin': 0.0945754...
2         person  ...  {'xmin': 0.6901475787162781, 'ymin': 0.1245557...
3         person  ...  {'xmin': 0.07806681096553802, 'ymin': 0.503342...
4      motorbike  ...  {'xmin': 0.7261181473731995, 'ymin': 0.3339981...
..           ...  ...                                                ...
95        person  ...  {'xmin': 0.8658402562141418, 'ymin': 0.8196606...
96        person  ...  {'xmin': 0.4964284598827362, 'ymin': 0.4723062...
97        person  ...  {'xmin': 0.11569126695394516, 'ymin': 0.536165...
98     motorbike  ...  {'xmin': 0.11499068886041641, 'ymin': 0.515896...
99        person  ...  {'xmin': 0.9769891500473022, 'ymin': 0.3428815...

[100 rows x 3 columns]
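
To keep only confident detections and convert the boxes to pixel coordinates, you can post-process the returned DataFrame. A sketch with two assumptions: the elided middle column is a confidence score named predict_score, and the rois are stored as fractions of the image size (the values above all fall in [0, 1]); adjust the names if your version differs.

from PIL import Image

# Assumptions: the hidden middle column is 'predict_score' and the rois are
# normalized to [0, 1]; adjust if your GluonCV version differs.
img_path = test.iloc[0]['image']
width, height = Image.open(img_path).size

confident = pred[pred['predict_score'] > 0.5]
for _, row in confident.iterrows():
    roi = row['predict_rois']
    box = (roi['xmin'] * width, roi['ymin'] * height,
           roi['xmax'] * width, roi['ymax'] * height)
    print(row['predict_class'], [round(v, 1) for v in box])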

Pin the object detector algorithms

By default, the scheduler chooses algorithms from SSDEstimator, YOLOv3Estimator, CenterNetEstimator, and FasterRCNNEstimator. If you want to train a specific algorithm during HPO, you may pin the estimator.

For example, this tells the searcher to use YOLOv3 models only:

search_args = {'lr': ag.Categorical(1e-3, 1e-2),
               'num_trials': 1,
               'estimator': 'yolo3',
               'epochs': 2,
               'num_workers': 16,
               'batch_size': ag.Categorical(4, 8),
               'ngpus_per_trial': 1,
               'search_strategy': 'random',
               'time_limits': time_limits}
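
You may also let the searcher pick between several detector families instead of pinning a single one. A sketch; whether plain strings like 'ssd' and 'yolo3' or the estimator classes themselves are accepted here may depend on your GluonCV version:

# Sketch: search over two detector families rather than pinning one.
# Accepted values for 'estimator' may vary across GluonCV versions.
search_args = {'lr': ag.Categorical(1e-3, 1e-2),
               'num_trials': 2,
               'estimator': ag.Categorical('ssd', 'yolo3'),
               'epochs': 2,
               'num_workers': 16,
               'batch_size': ag.Categorical(4, 8),
               'ngpus_per_trial': 1,
               'search_strategy': 'random',
               'time_limits': time_limits}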

Specify models for transfer learning

Allowing transfer learning from previously trained models is the key to fast convergence and high-performance models with very limited training data. By default, we enable transfer learning from object detection networks trained on the COCO dataset, downloaded automatically from the GluonCV model zoo.

The pre-trained models selected by default are 'ssd_512_resnet50_v1_coco', 'yolo3_darknet53_coco', 'faster_rcnn_resnet50_v1b_coco', and 'center_net_resnet50_v1b_coco'.

You may continue to use them, or replace them with your favorite models, e.g.,

search_args = {'lr': ag.Categorical(1e-3, 1e-2),
               'num_trials': 1,
               'transfer': ag.Categorical('ssd_512_mobilenet1.0_coco',
                                          'yolo3_mobilenet1.0_coco'),
               'epochs': 10,
               'num_workers': 16,
               'batch_size': ag.Categorical(4, 8),
               'ngpus_per_trial': 1,
               'search_strategy': 'random',
               'time_limits': time_limits}
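
To see which COCO-pretrained detectors are available as transfer sources, you can query the model zoo by name. A small sketch using gluoncv.model_zoo.get_model_list; the filtering is plain string matching on the zoo names:

from gluoncv.model_zoo import get_model_list

# List COCO-pretrained detection models that could serve as `transfer` values.
families = ('ssd', 'yolo3', 'faster_rcnn', 'center_net')
candidates = [name for name in get_model_list()
              if name.endswith('_coco') and name.startswith(families)]
print(candidates)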

Now you are ready to use the HPO feature in gluoncv.auto! Stay tuned as we roll out more features and tutorials for faster and more convenient training/test/inference experiences.

Total running time of the script: ( 1 minutes 10.238 seconds)

Gallery generated by Sphinx-Gallery