Algorithm
Image Classification-ResNet_v1_50
ModelArts
33 months ago
418MB 350 47
  • Category
    PapersWithCode
  • Assets Id 7f0b4d7d-0329-4812-8b8b-35f457cab344

Description

ResNet_v1_50 (Image Classification/TensorFlow)

OBS raw dataset format

Unsplit dataset

your_local_path/flowers
    |- classA   ------------- Class A
        |- A_0.jpg
        |- A_1.jpg
        |- ...
        |- A_n.jpg
    |- classB    ------------ Class B
        |- B_0.jpg
        |- B_1.jpg
        |- ...
        |- B_n.jpg
    |- classC --------------- Class C
        |- C_0.jpg
        |- C_1.jpg
        |- ...
        |- C_n.jpg
    |- ...    --------------- Other classes follow the same layout

Split dataset

your_local_path/flowers
    |- train --------------- Training set
        |- classA   ------------- Class A
            |- A_0.jpg
            |- A_1.jpg
            |- ...
            |- A_n.jpg
        |- classB    ------------ Class B
            |- B_0.jpg
            |- B_1.jpg
            |- ...
            |- B_n.jpg
        |- classC --------------- Class C
            |- C_0.jpg
            |- C_1.jpg
            |- ...
            |- C_n.jpg
        |- ...    --------------- Other classes follow the same layout
    |- eval --------------- Test set
        |- classA   ------------- Class A
            |- A_0.jpg
            |- A_1.jpg
            |- ...
            |- A_n.jpg
        |- classB    ------------ Class B
            |- B_0.jpg
            |- B_1.jpg
            |- ...
            |- B_n.jpg
        |- classC --------------- Class C
            |- C_0.jpg
            |- C_1.jpg
            |- ...
            |- C_n.jpg
        |- ...    --------------- Other classes follow the same layout
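
A split layout can be produced from an unsplit one with a short script. The following is only a minimal sketch and is not part of the algorithm: the helper name split_dataset, the default ratio of 0.8, and the fixed random seed are illustrative assumptions.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src, dst, train_ratio=0.8, seed=0):
    """Copy a class-per-folder dataset into dst/train and dst/eval.

    Hypothetical helper: the name, defaults, and seed are assumptions.
    Returns {class_name: (n_train, n_eval)}.
    """
    src, dst = Path(src), Path(dst)
    rng = random.Random(seed)
    counts = {}
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        images = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(images)  # shuffle within each class before cutting
        n_train = int(len(images) * train_ratio)
        subsets = (("train", images[:n_train]), ("eval", images[n_train:]))
        for subset, files in subsets:
            out_dir = dst / subset / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy2(f, out_dir / f.name)
        counts[class_dir.name] = (n_train, len(images) - n_train)
    return counts
```

Calling split_dataset("your_local_path/flowers", "your_local_path/flowers_split") would leave the source directory untouched and write the train and eval folders to the second path.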

1. Overview

This is a ResNet model with 50 layers, implemented according to the architecture proposed in the paper Deep Residual Learning for Image Recognition (https://arxiv.org/abs/1512.03385). It can be used for image classification tasks such as cat and dog classification or flower classification. You provide a labeled dataset, and the algorithm loads a model pre-trained on ImageNet-1000 (http://www.image-net.org/) and performs transfer learning on your data. The trained model can be deployed directly as a real-time service on the ModelArts platform, using either CPU or GPU flavors for inference. You can also convert the model into the Ascend format and deploy it on Ascend 310 for inference.

2. Training

2.1. Basic Algorithm Information

  • Task type: image classification
  • Supported engine: TensorFlow 1.13.1-Python 3.6-horovod
  • Test card: NVIDIA V100
  • Performance:
    batch_size (images/batch)    FPS (images/second)
    64                           978
    128                          1104
    256                          1220
  • Algorithm input:
    • An image classification dataset published on the ModelArts data management platform. A training/validation split ratio must be set; 0.8 or 0.9 is recommended.
    • A pre-trained ImageNet model with a top-1 accuracy of 74.6% and a top-5 accuracy of 92.0%.
  • Algorithm output:
    • A saved_model for TF-Serving inference. Inference on P4 takes 11 ms/image.
    • A frozen.pb model for conversion to the Ascend format. Inference on Ascend 310 takes 3.2 ms/image.
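
The throughput and latency figures above can be turned into rough time estimates. The sketch below is plain arithmetic on the published numbers; the function names are illustrative, and the estimate ignores validation, checkpointing, and data-loading overhead.

```python
# Training throughput on one NVIDIA V100, copied from the performance table.
TRAIN_FPS = {64: 978, 128: 1104, 256: 1220}


def seconds_per_epoch(num_images, batch_size):
    """Approximate time for one pass over num_images training images."""
    return num_images / TRAIN_FPS[batch_size]


def images_per_second(latency_ms):
    """Convert a per-image latency in milliseconds to images per second."""
    return 1000.0 / latency_ms
```

For example, seconds_per_epoch(9780, 64) gives 10 seconds, and images_per_second(11) gives roughly 91 images per second for the P4 figure.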

2.2. Training Parameters

Parameter | Default Value | Type | Mandatory | Description
task_type | image_classification_v2 | String | Yes | Task type, which cannot be changed.
model_name | resnet_v1_50 | String | Yes | Model name, which cannot be changed.
do_train | True | Bool | Yes | Whether to train. Defaults to True and cannot be changed.
do_eval_along_train | True | Bool | Yes | Whether to perform validation along with training. Defaults to True and cannot be changed.
variable_update | horovod | String | Yes | Parameter update method. Defaults to horovod and cannot be changed.
learning_rate_strategy | 0.002 | String | Yes | Learning rate strategy for training. For example, 10:0.01,20:0.001 sets the learning rate to 0.01 for the first 10 epochs and 0.001 for the next 10 epochs. If no epoch is specified, the learning rate is adjusted based on the validation precision, and training stops once the precision no longer improves significantly. The value can be changed.
batch_size | 64 | int | Yes | Number of images trained in each batch (on a single card). The value can be changed.
eval_batch_size | 64 | int | Yes | Number of images validated in each batch (on a single card). The value can be changed.
evaluate_every_n_epochs | 1.0 | float | Yes | Validation is performed every n epochs. The value can be changed.
save_model_secs | 60 | int | Yes | Model saving interval, in seconds. The value can be changed.
save_summary_steps | 10 | int | Yes | Summary saving interval, in steps. The value can be changed.
log_every_n_steps | 10 | int | Yes | Log printing interval, in steps. The value can be changed.
do_data_cleaning | True | Bool | No | Whether to perform data cleaning. Incorrectly formatted data causes training failures, so enable this function for training stability. If the data volume is very large, data cleaning may take a long time; in that case you can clean the data offline. Supported formats include BMP, JPEG, and PNG with RGB channels; JPEG is recommended. Defaults to True. Delete this parameter to disable it.
use_fp16 | True | Bool | No | Whether to use mixed precision. Mixed precision accelerates training but causes some precision loss. Enable it unless precision is strictly required. Defaults to True. Delete this parameter to disable it.
xla_compile | True | Bool | No | Whether to use XLA to accelerate training. Defaults to True. Delete this parameter to disable it.
data_format | NCHW | String | No | Input data format. NHWC means channels last, and NCHW means channels first.
best_model | True | Bool | No | Whether to save and use the model with the highest precision rather than the latest model from training. Defaults to True, meaning the optimal model is saved. Among models whose precision falls within a certain error range of the highest, the latest one is saved as the optimal model.
jpeg_preprocess | True | Bool | No | Whether to use the JPEG preprocessing acceleration operator (JPEG data only) to speed up data reading and improve performance. Defaults to True. If the data is not in JPEG format, enable the data cleaning function. Delete this parameter to disable it.
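
The staged form of learning_rate_strategy can be read as a list of (end epoch, learning rate) pairs. The sketch below only illustrates that reading of the 10:0.01,20:0.001 example; it is not the algorithm's own parser, the function names are hypothetical, and it does not cover the plain-value form such as 0.002.

```python
def parse_lr_strategy(strategy):
    """Parse '10:0.01,20:0.001' into [(10, 0.01), (20, 0.001)]."""
    schedule = []
    for item in strategy.split(","):
        end_epoch, lr = item.split(":")
        schedule.append((int(end_epoch), float(lr)))
    return sorted(schedule)


def lr_for_epoch(schedule, epoch):
    """Return the learning rate for a 0-based epoch, or None past the last stage."""
    for end_epoch, lr in schedule:
        if epoch < end_epoch:
            return lr
    return None
```

Under this reading, epochs 0 to 9 use 0.01, epochs 10 to 19 use 0.001, and no rate is defined from epoch 20 onward.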

2.3. Training Output File

After training is complete, the output directory is structured as follows:

Training output directory
  |- om
    |- model
      |- index
      |- customize_service_d310.py
  |- model
    |- variables
      |- variables.data-00000-of-00001
      |- variables.index
    |- customize_service.py
    |- index
    |- config.json
    |- saved_model.pb
  |- frozen_graph
    |- insert_op_conf.cfg
    |- model.pb
  |- checkpoint
  |- model.ckpt-xxx
  |- ...
  |- best_checkpoint
  |- best_model.ckpt-xxx
  |- ...
  |- events...
  |- graph.pbtxt
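
Before converting or importing the model, it can help to confirm that the files named in the listing above are present. The following is a minimal sketch, assuming the training output directory is reachable as a local path (for example, downloaded or mounted); the helper name missing_outputs is hypothetical.

```python
from pathlib import Path

# Files named explicitly in the training output listing (relative paths).
EXPECTED_FILES = [
    "model/saved_model.pb",
    "model/config.json",
    "model/customize_service.py",
    "model/variables/variables.index",
    "frozen_graph/model.pb",
    "frozen_graph/insert_op_conf.cfg",
]


def missing_outputs(output_dir):
    """Return the expected files that are absent from output_dir."""
    root = Path(output_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

An empty list means every file in EXPECTED_FILES was found.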

3. Ascend 310 Inference

3.1. Model Conversion Parameters

  • Conversion Template: Select TF-FrozenGraph-To-Ascend-C32.
  • Conversion Input Path: Select frozen_graph in the training output path.
  • Conversion Output Path: Select om/model in the training output path.
  • input_shape: Enter images:1,224,224,3.
  • input_format: Select NHWC.
  • out_nodes: Enter logits:0.

Retain the default values of other parameters.
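
The input_shape value combines the input node name with its dimensions, and with the NHWC input format those dimensions are batch, height, width, and channels. The sketch below simply unpacks the string to make that explicit; the function names are hypothetical and not part of the conversion tool.

```python
def parse_input_shape(spec):
    """Split 'images:1,224,224,3' into ('images', (1, 224, 224, 3))."""
    name, dims = spec.split(":")
    return name, tuple(int(d) for d in dims.split(","))


def describe_nhwc(shape):
    """Label the four dimensions of an NHWC shape."""
    batch, height, width, channels = shape
    return {"batch": batch, "height": height, "width": width, "channels": channels}
```

For images:1,224,224,3 this yields a single 224 x 224 image with 3 channels on the node named images.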

3.2. Model Import Parameters

  • Template: Select ARM-Ascend template.
  • Model Directory: Select om/model in the training output path.
  • Input and Output Mode: Select Built-in image processing.

4. GPU/CPU Inference

To deploy an inference service on CPUs or GPUs, set Meta Model Source to Training job and select a training job and version.

Note: In the inference configuration file model/config.json, the runtime field defaults to the CPU inference image tf1.xx-python3.x-cpu.
To use GPU inference, change the runtime field to tf1.xx-python3.x-gpu in model/config.json before importing the model.
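
The runtime change can also be scripted. This is a minimal sketch, assuming config.json is available as a local file and that runtime is a top-level key whose value ends in -cpu; the helper name use_gpu_runtime is hypothetical.

```python
import json
from pathlib import Path


def use_gpu_runtime(config_path):
    """Switch the runtime field of config.json from a -cpu to a -gpu image.

    Assumes "runtime" is a top-level key; other fields are left unchanged.
    Returns the runtime value after the update.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    runtime = config["runtime"]
    if runtime.endswith("-cpu"):
        config["runtime"] = runtime[: -len("-cpu")] + "-gpu"
        path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config["runtime"]
```

The function only replaces the suffix, so the TensorFlow and Python version parts of the runtime string stay exactly as they were generated by the training job.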

Publish

HUAWEI CLOUD ModelArts

CN-Hong Kong, AP-Singapore, ME-Riyadh, af-north-1, sa-brazil-1

You can use assets on the HUAWEI CLOUD ModelArts management console but cannot download them.

Restrictions

Public

Free

Free Duration

5 years

Auto-renew

Version

Version | Version ID | Published At | Status | Description | Constraints
10.0.0 | I3DkY4 | 2022-06-10 06:33 | Done | -- |
