7. Test with ICNet Pre-trained Models for Multi-Human Parsing

This is a quick demo of using the GluonCV ICNet model for multi-human parsing on real-world images. If you have not yet installed MXNet and GluonCV, please follow the installation guide first.

import mxnet as mx
from mxnet import image
from mxnet.gluon.data.vision import transforms
import gluoncv
# using cpu
ctx = mx.cpu(0)

Prepare the image

Let’s first download the example image:

url = 'https://raw.githubusercontent.com/dmlc/web-data/master/gluoncv/segmentation/mhpv1_examples/1.jpg'
filename = 'mhp_v1_example.jpg'
# overwrite=True re-downloads the file even if it already exists locally
gluoncv.utils.download(url, filename, overwrite=True)

Out:

Downloading mhp_v1_example.jpg from https://raw.githubusercontent.com/dmlc/web-data/master/gluoncv/segmentation/mhpv1_examples/1.jpg...

  0%|          | 0/84 [00:00<?, ?KB/s]
85KB [00:00, 13981.56KB/s]

Then we load the image and visualize it:

img = image.imread(filename)

from matplotlib import pyplot as plt
plt.imshow(img.asnumpy())
plt.show()

We then normalize the image using the dataset mean and standard deviation:

from gluoncv.data.transforms.presets.segmentation import test_transform
img = test_transform(img, ctx)
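
As a rough illustration of what this preprocessing amounts to, the NumPy sketch below scales an H x W x 3 uint8 image to [0, 1], normalizes each channel, moves the channels first, and adds a batch axis. The mean and standard deviation are the commonly used ImageNet statistics and are an assumption here, and `normalize_image` is a hypothetical helper, not part of GluonCV.

```python
import numpy as np

# Assumed ImageNet channel statistics (RGB order); check the GluonCV source
# for the exact values used by test_transform.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_image(img_hwc):
    """Hypothetical helper: HxWx3 uint8 image -> 1x3xHxW float32 array."""
    x = img_hwc.astype(np.float32) / 255.0   # scale pixel values to [0, 1]
    x = (x - MEAN) / STD                     # per-channel normalization (broadcast over H, W)
    x = x.transpose(2, 0, 1)                 # HWC -> CHW
    return x[np.newaxis, ...]                # add batch axis -> NCHW

dummy = np.full((4, 6, 3), 255, dtype=np.uint8)  # a tiny all-white image
batch = normalize_image(dummy)
print(batch.shape)
```

The result has the batch-first, channel-first layout that the network expects as input.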

Load the pre-trained model and make predictions

Next, we get a pre-trained model from our model zoo:

model = gluoncv.model_zoo.get_model('icnet_resnet50_mhpv1', pretrained=True)

Out:

Downloading /root/.mxnet/models/icnet_resnet50_mhpv1-873d381a.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/icnet_resnet50_mhpv1-873d381a.zip...

100%|##########| 185766/185766 [00:05<00:00, 36539.85KB/s]

We then run the model on the image and take the per-pixel argmax over the class dimension to obtain the predicted label map:

output = model.predict(img)
predict = mx.nd.squeeze(mx.nd.argmax(output, 1)).asnumpy()
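
To make the shapes in this step concrete, here is a minimal NumPy sketch of the same argmax-then-squeeze pattern. The array `scores` is a random stand-in for the network output, and the class count and image size are hypothetical values chosen for illustration.

```python
import numpy as np

num_classes, height, width = 5, 2, 3   # hypothetical sizes for illustration
rng = np.random.default_rng(0)

# Stand-in for the network output with layout (N, C, H, W)
scores = rng.normal(size=(1, num_classes, height, width))
scores[0, 3, 1, 2] = 100.0             # make class 3 the clear winner at pixel (1, 2)

label_map = np.argmax(scores, axis=1)  # pick the best class per pixel -> (1, H, W)
label_map = np.squeeze(label_map)      # drop the batch axis -> (H, W)
print(label_map.shape)
print(label_map[1, 2])
```

Each entry of `label_map` is the index of the highest-scoring class at that pixel.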

Finally, we apply a color palette to visualize the predicted mask:

from gluoncv.utils.viz import get_color_pallete
import matplotlib.image as mpimg
mask = get_color_pallete(predict, 'mhpv1')
mask.save('output.png')
mmask = mpimg.imread('output.png')
plt.imshow(mmask)
plt.show()
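
Conceptually, coloring a mask is a lookup from each class index to an RGB triple. The sketch below shows that idea with NumPy indexing. The four-color `palette` is made up for illustration and is not the actual 'mhpv1' palette shipped with GluonCV.

```python
import numpy as np

# Hypothetical palette: one RGB triple per class index (row i colors class i)
palette = np.array(
    [[0, 0, 0], [128, 0, 0], [0, 128, 0], [0, 0, 128]],
    dtype=np.uint8,
)

# A tiny label map of shape (H, W) holding class indices
labels = np.array([[0, 1, 1],
                   [2, 3, 0]])

rgb = palette[labels]   # integer-array indexing: (H, W) -> (H, W, 3)
print(rgb.shape)
```

The resulting `rgb` array can be passed directly to `plt.imshow` in the same way as the saved mask above.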

Total running time of the script: ( 0 minutes 7.662 seconds)
