Classification¶
Downloads: train_imagenet.py, train_cifar10.py, train_mixup_cifar10.py
MXNet¶
Inference throughput vs. validation accuracy of ImageNet pre-trained models is illustrated in the following graph. Throughput is measured on a single V100 GPU with batch size 64.

How To Use Pretrained Models¶
The following example requires GluonCV>=0.4 and MXNet>=1.4.0. Please follow our installation guide to install or upgrade GluonCV and MXNet if necessary.

Prepare an image by yourself or use our sample image. Save the image as classification-demo.png in your working directory, or change the filename in the source code if you use another name.

Use a pre-trained model. A model is specified by its name.
Let’s try it out!
import mxnet as mx
import gluoncv
# you can change it to your image filename
filename = 'classification-demo.png'
# you may modify it to switch to another model. The name is case-insensitive
model_name = 'ResNet50_v1d'
# download and load the pre-trained model
net = gluoncv.model_zoo.get_model(model_name, pretrained=True)
# load image
img = mx.image.imread(filename)
# apply default data preprocessing
transformed_img = gluoncv.data.transforms.presets.imagenet.transform_eval(img)
# run forward pass to obtain the predicted score for each class
pred = net(transformed_img)
# map predicted values to probability by softmax
prob = mx.nd.softmax(pred)[0].asnumpy()
# find the 5 class indices with the highest score
ind = mx.nd.topk(pred, k=5)[0].astype('int').asnumpy().tolist()
# print the class name and predicted probability
print('The input picture is classified to be')
for i in range(5):
    print('- [%s], with probability %.3f.' % (net.classes[ind[i]], prob[ind[i]]))
The output from our sample image is expected to be
The input picture is classified to be
- [Welsh springer spaniel], with probability 0.899.
- [Irish setter], with probability 0.005.
- [Brittany spaniel], with probability 0.003.
- [cocker spaniel], with probability 0.002.
- [Blenheim spaniel], with probability 0.002.
Remember, you can try different models by replacing the value of model_name.
Read further for model names and their performances in the tables.
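If you are unsure which names are valid, the zoo can enumerate its registered models. A minimal sketch, assuming gluoncv.model_zoo.get_model_list() is available in your installed GluonCV version:

import gluoncv

# enumerate every model name registered in the GluonCV model zoo
# (get_model_list() is assumed to exist in your GluonCV version)
names = gluoncv.model_zoo.get_model_list()
# e.g. narrow the list down to ResNet variants
print(sorted(n for n in names if n.lower().startswith('resnet')))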
ImageNet¶
Hint
Training commands in the tables below work with the train_imagenet.py script (download link at the top of this page).
A model can have several sets of trained parameters, each identified by a hashtag. Parameters shown with a grey name can be downloaded by passing the corresponding hashtag.

Download the default pretrained weights:

net = get_model('ResNet50_v1d', pretrained=True)

Download weights given a hashtag:

net = get_model('ResNet50_v1d', pretrained='117a384e')

ResNet50_v1_int8 and MobileNet1.0_int8 are quantized models calibrated on the ImageNet dataset.
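A quantized variant is fetched by name like any other zoo model. A minimal sketch (note that efficient int8 inference is assumed to require an MXNet build with MKL-DNN/oneDNN support):

import gluoncv

# load the calibrated int8 model by name, just like a float32 model;
# running it efficiently assumes an MXNet build with MKL-DNN/oneDNN enabled
net_int8 = gluoncv.model_zoo.get_model('ResNet50_v1_int8', pretrained=True)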
ResNet¶
Hint

- ResNet50_v1_int8 is a quantized model for ResNet50_v1.
- ResNet_v1b modifies ResNet_v1 by applying the stride at the 3x3 layer of the bottleneck block (instead of the first 1x1 layer).
- ResNet_v1c modifies ResNet_v1b by replacing the 7x7 conv layer with three 3x3 conv layers.
- ResNet_v1d modifies ResNet_v1c by adding a 2x2 avgpool layer with stride 2 on the downsampling (residual) path, so the feature map is downsampled while preserving more information.
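To make the v1b vs. v1d difference concrete, below is a minimal Gluon sketch of the two downsampling shortcuts. It is an illustration of the idea, not GluonCV's actual ResNet code, and the channel count is arbitrary:

from mxnet.gluon import nn

# v1b-style shortcut: a strided 1x1 conv skips 3 out of 4 input positions
shortcut_v1b = nn.HybridSequential()
shortcut_v1b.add(nn.Conv2D(channels=512, kernel_size=1, strides=2, use_bias=False),
                 nn.BatchNorm())

# v1d-style shortcut: 2x2 average pooling with stride 2 first, then a
# stride-1 1x1 conv, so every input position contributes to the output
shortcut_v1d = nn.HybridSequential()
shortcut_v1d.add(nn.AvgPool2D(pool_size=2, strides=2),
                 nn.Conv2D(channels=512, kernel_size=1, strides=1, use_bias=False),
                 nn.BatchNorm())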
| Model | Top-1 | Top-5 | Hashtag | Training Command | Training Log |
|---|---|---|---|---|---|
| ResNet18_v1 [1] | 70.93 | 89.92 | a0666292 | | |
| ResNet34_v1 [1] | 74.37 | 91.87 | 48216ba9 | | |
| ResNet50_v1 [1] | 77.36 | 93.57 | cc729d95 | | |
| ResNet50_v1_int8 [1] | 76.86 | 93.46 | cc729d95 | | |
| ResNet101_v1 [1] | 78.34 | 94.01 | d988c13d | | |
| ResNet152_v1 [1] | 79.22 | 94.64 | acfd0970 | | |
| ResNet18_v1b [1] | 70.94 | 89.83 | 2d9d980c | | |
| ResNet34_v1b [1] | 74.65 | 92.08 | 8e16b848 | | |
| ResNet50_v1b [1] | 77.67 | 93.82 | 0ecdba34 | | |
| ResNet50_v1b_gn [1] | 77.36 | 93.59 | 0ecdba34 | | |
| ResNet101_v1b [1] | 79.20 | 94.61 | a455932a | | |
| ResNet152_v1b [1] | 79.69 | 94.74 | a5a61ee1 | | |
| ResNet50_v1c [1] | 78.03 | 94.09 | 2a4e0708 | | |
| ResNet101_v1c [1] | 79.60 | 94.75 | 064858f2 | | |
| ResNet152_v1c [1] | 80.01 | 94.96 | 75babab6 | | |
| ResNet50_v1d [1] | 79.15 | 94.58 | 117a384e | | |
| ResNet50_v1d [1] | 78.48 | 94.20 | 00319ddc | | |
| ResNet101_v1d [1] | 80.51 | 95.12 | 1b2b825f | | |
| ResNet101_v1d [1] | 79.78 | 94.80 | 8659a9d6 | | |
| ResNet152_v1d [1] | 80.61 | 95.34 | cddbc86f | | |
| ResNet152_v1d [1] | 80.26 | 95.00 | cfe0220d | | |
| ResNet18_v2 [2] | 71.00 | 89.92 | a81db45f | | |
| ResNet34_v2 [2] | 74.40 | 92.08 | 9d6b80bb | | |
| ResNet50_v2 [2] | 77.11 | 93.43 | ecdde353 | | |
| ResNet101_v2 [2] | 78.53 | 94.17 | 18e93e4f | | |
| ResNet152_v2 [2] | 79.21 | 94.31 | f2695542 | | |
ResNext¶
| Model | Top-1 | Top-5 | Hashtag | Training Command | Training Log |
|---|---|---|---|---|---|
| ResNext50_32x4d [12] | 79.32 | 94.53 | 4ecf62e2 | | |
| ResNext101_32x4d [12] | 80.37 | 95.06 | 8654ca5d | | |
| ResNext101_64x4d_v1 [12] | 80.69 | 95.17 | 2f0d1c9d | | |
| SE_ResNext50_32x4d [12] [14] | 79.95 | 94.93 | 7906e0e1 | | |
| SE_ResNext101_32x4d [12] [14] | 80.91 | 95.39 | 688e2389 | | |
| SE_ResNext101_64x4d [12] [14] | 81.01 | 95.32 | 11c50114 | | |
ResNeSt¶
| Model | Top-1 | Top-5 | Hashtag | Training Command | Training Log |
|---|---|---|---|---|---|
| ResNeSt14 [17] | 75.75 | 92.70 | 7e0b0cae | | |
| ResNeSt26 [17] | 78.68 | 94.38 | 36459074 | | |
| ResNeSt50 [17] | 81.04 | 95.42 | bcfefe1d | | |
| ResNeSt101 [17] | 82.83 | 96.42 | 5da943b3 | | |
| ResNeSt200 [17] | 83.86 | 96.86 | 0c5d117d | | |
| ResNeSt269 [17] | 84.53 | 96.98 | 11ae7f5d | | |
MobileNet¶
Hint

MobileNet1.0_int8 is a quantized model for MobileNet1.0.
| Model | Top-1 | Top-5 | Hashtag | Training Command | Training Log |
|---|---|---|---|---|---|
| MobileNet1.0 [4] | 73.28 | 91.30 | efbb2ca3 | | |
| MobileNet1.0_int8 [4] | 72.85 | 90.99 | efbb2ca3 | | |
| MobileNet1.0 [4] | 72.93 | 91.14 | cce75496 | | |
| MobileNet0.75 [4] | 70.25 | 89.49 | 84c801e2 | | |
| MobileNet0.5 [4] | 65.20 | 86.34 | 0130d2aa | | |
| MobileNet0.25 [4] | 52.91 | 76.94 | f0046a3d | | |
| MobileNetV2_1.0 [5] | 72.04 | 90.57 | f9952bcd | | |
| MobileNetV2_0.75 [5] | 69.36 | 88.50 | b56e3d1c | | |
| MobileNetV2_0.5 [5] | 64.43 | 85.31 | 08038185 | | |
| MobileNetV2_0.25 [5] | 51.76 | 74.89 | 9b1d2cc3 | | |
| MobileNetV3_Large [15] | 75.32 | 92.30 | eaa44578 | | |
| MobileNetV3_Small [15] | 67.72 | 87.51 | 33c100a7 | | |
VGG¶
| Model | Top-1 | Top-5 | Hashtag | Training Command | Training Log |
|---|---|---|---|---|---|
| VGG11 [9] | 66.62 | 87.34 | dd221b16 | | |
| VGG13 [9] | 67.74 | 88.11 | 6bc5de58 | | |
| VGG16 [9] | 73.23 | 91.31 | e660d456 | | |
| VGG19 [9] | 74.11 | 91.35 | ad2f660d | | |
| VGG11_bn [9] | 68.59 | 88.72 | ee79a809 | | |
| VGG13_bn [9] | 68.84 | 88.82 | 7d97a06c | | |
| VGG16_bn [9] | 73.10 | 91.76 | 7f01cf05 | | |
| VGG19_bn [9] | 74.33 | 91.85 | f360b758 | | |
SqueezeNet¶
| Model | Top-1 | Top-5 | Hashtag | Training Command | Training Log |
|---|---|---|---|---|---|
| SqueezeNet1.0 [10] | 56.11 | 79.09 | 264ba497 | | |
| SqueezeNet1.1 [10] | 54.96 | 78.17 | 33ba0f93 | | |
DenseNet¶
| Model | Top-1 | Top-5 | Hashtag | Training Command | Training Log |
|---|---|---|---|---|---|
| DenseNet121 [7] | 74.97 | 92.25 | f27dbf2d | | |
| DenseNet161 [7] | 77.70 | 93.80 | b6c8a957 | | |
| DenseNet169 [7] | 76.17 | 93.17 | 2603f878 | | |
| DenseNet201 [7] | 77.32 | 93.62 | 1cdbc116 | | |
Pruned ResNet¶
| Model | Top-1 | Top-5 | Hashtag | Speedup (vs. original ResNet) |
|---|---|---|---|---|
| resnet18_v1b_0.89 | 67.2 | 87.45 | 54f7742b | 2x |
| resnet50_v1d_0.86 | 78.02 | 93.82 | a230c33f | 1.68x |
| resnet50_v1d_0.48 | 74.66 | 92.34 | 0d3e69bb | 3.3x |
| resnet50_v1d_0.37 | 70.71 | 89.74 | 9982ae49 | 5.01x |
| resnet50_v1d_0.11 | 63.22 | 84.79 | 6a25eece | 8.78x |
| resnet101_v1d_0.76 | 79.46 | 94.69 | a872796b | 1.8x |
| resnet101_v1d_0.73 | 78.89 | 94.48 | 712fccb1 | 2.02x |
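The pruned variants are loaded by the lower-case names in the table above. A minimal sketch, assuming these names are registered in the zoo like the other models:

import gluoncv

# pruned models trade top-1 accuracy for inference speed; pick a variant
# from the table above by its (lower-case) name
net_pruned = gluoncv.model_zoo.get_model('resnet50_v1d_0.37', pretrained=True)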
Others¶
Hint

InceptionV3 is evaluated with an input size of 299x299.
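For models evaluated at 299x299, the default preprocessing from the earlier example must be adjusted. A minimal sketch, assuming transform_eval accepts resize_short and crop_size keyword arguments (the resize value 342 is our choice, keeping the usual resize/crop ratio):

import mxnet as mx
from gluoncv.data.transforms.presets.imagenet import transform_eval

img = mx.image.imread('classification-demo.png')
# resize the short edge to 342, then center-crop to 299x299 for InceptionV3
# (the defaults are resize_short=256, crop_size=224)
transformed_img = transform_eval(img, resize_short=342, crop_size=299)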
| Model | Top-1 | Top-5 | Hashtag | Training Command | Training Log |
|---|---|---|---|---|---|
| AlexNet [6] | 54.92 | 78.03 | 44335d1f | | |
| darknet53 [3] | 78.56 | 94.43 | 2189ea49 | | |
| darknet53 [3] | 78.13 | 93.86 | 95975047 | | |
| InceptionV3 [8] | 78.77 | 94.39 | a5050dbc | | |
| GoogLeNet [16] | 72.87 | 91.17 | c7c89366 | | |
| Xception [8] | 79.56 | 94.77 | 37c1c90b | | |
| InceptionV3 [8] | 78.41 | 94.13 | e132adf2 | | |
| SENet_154 [14] | 81.26 | 95.51 | b5538ef1 | | |
CIFAR10¶
The following table lists pre-trained models trained on CIFAR10.
Hint
Our pre-trained models reproduce results from “Mix-Up” [13]. Please check the reference paper for further information.

Training commands in the table work with the following scripts:

- For vanilla training: train_cifar10.py
- For mix-up training: train_mixup_cifar10.py
| Model | Acc (Vanilla / Mix-Up [13]) | Training Command | Training Log |
|---|---|---|---|
| CIFAR_ResNet20_v1 [1] | 92.1 / 92.9 | | |
| CIFAR_ResNet56_v1 [1] | 93.6 / 94.2 | | |
| CIFAR_ResNet110_v1 [1] | 93.0 / 95.2 | | |
| CIFAR_ResNet20_v2 [2] | 92.1 / 92.7 | | |
| CIFAR_ResNet56_v2 [2] | 93.7 / 94.6 | | |
| CIFAR_ResNet110_v2 [2] | 94.3 / 95.5 | | |
| CIFAR_WideResNet16_10 [11] | 95.1 / 96.7 | | |
| CIFAR_WideResNet28_10 [11] | 95.6 / 97.2 | | |
| CIFAR_WideResNet40_8 [11] | 95.9 / 97.3 | | |
| CIFAR_ResNeXt29_16x64d [12] | 96.3 / 97.3 | | |
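For intuition, mix-up [13] trains on convex combinations of example pairs and their labels. A minimal sketch of the idea, with a hypothetical mixup_batch helper of our own; this is an illustration, not the actual train_mixup_cifar10.py implementation:

import numpy as np
import mxnet as mx

def mixup_batch(data, label, alpha=1.0):
    # sample the mixing coefficient from Beta(alpha, alpha), as in [13]
    lam = np.random.beta(alpha, alpha)
    # pair each example with a randomly chosen partner from the same batch
    index = mx.nd.array(np.random.permutation(data.shape[0]))
    mixed = lam * data + (1 - lam) * data.take(index)
    # the training loss is blended the same way:
    #   loss = lam * L(net(mixed), label) + (1 - lam) * L(net(mixed), label.take(index))
    return mixed, label, label.take(index), lam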
Reference¶
- [1] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep residual learning for image recognition.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778. 2016.
- [2] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Identity mappings in deep residual networks.” In European Conference on Computer Vision, pp. 630-645. Springer, Cham, 2016.
- [3] Redmon, Joseph, and Ali Farhadi. “YOLOv3: An incremental improvement.” arXiv preprint arXiv:1804.02767 (2018).
- [4] Howard, Andrew G., Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. “MobileNets: Efficient convolutional neural networks for mobile vision applications.” arXiv preprint arXiv:1704.04861 (2017).
- [5] Sandler, Mark, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. “Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation.” arXiv preprint arXiv:1801.04381 (2018).
- [6] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “ImageNet classification with deep convolutional neural networks.” In Advances in Neural Information Processing Systems, pp. 1097-1105. 2012.
- [7] Huang, Gao, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. “Densely Connected Convolutional Networks.” In CVPR, vol. 1, no. 2, p. 3. 2017.
- [8] Szegedy, Christian, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. “Rethinking the inception architecture for computer vision.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818-2826. 2016.
- [9] Simonyan, Karen, and Andrew Zisserman. “Very Deep Convolutional Networks for Large-Scale Image Recognition.” arXiv preprint arXiv:1409.1556 (2014).
- [10] Iandola, Forrest N., Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, and Kurt Keutzer. “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size.” arXiv preprint arXiv:1602.07360 (2016).
- [11] Zagoruyko, Sergey, and Nikos Komodakis. “Wide residual networks.” arXiv preprint arXiv:1605.07146 (2016).
- [12] Xie, Saining, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. “Aggregated residual transformations for deep neural networks.” In Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on, pp. 5987-5995. IEEE, 2017.
- [13] Zhang, Hongyi, Moustapha Cisse, Yann N. Dauphin, and David Lopez-Paz. “mixup: Beyond empirical risk minimization.” arXiv preprint arXiv:1710.09412 (2017).
- [14] Hu, Jie, Li Shen, and Gang Sun. “Squeeze-and-excitation networks.” arXiv preprint arXiv:1709.01507 (2017).
- [15] Howard, Andrew, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang et al. “Searching for MobileNetV3.” arXiv preprint arXiv:1905.02244 (2019).
- [16] Szegedy, Christian, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. “Going Deeper with Convolutions.” arXiv preprint arXiv:1409.4842 (2014).
- [17] Zhang, Hang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Zhi Zhang, Haibin Lin, Yue Sun, Tong He, Jonas Mueller, R. Manmatha, Mu Li, and Alex Smola. “ResNeSt: Split-Attention Networks.” arXiv preprint (2020).