Unsplit dataset
|- classA ------------- Class A
|- A_0.jpg
|- A_1.jpg
|- ...
|- A_n.jpg
|- classB ------------ Class B
|- B_0.jpg
|- B_1.jpg
|- ...
|- B_n.jpg
|- classC --------------- Class C
|- C_0.jpg
|- C_1.jpg
|- ...
|- C_n.jpg
The others follow the same rule.
Split dataset
|- train --------------- Training set
|- classA ------------- Class A
|- A_0.jpg
|- A_1.jpg
|- ...
|- A_n.jpg
|- classB ------------ Class B
|- B_0.jpg
|- B_1.jpg
|- ...
|- B_n.jpg
|- classC --------------- Class C
|- C_0.jpg
|- C_1.jpg
|- ...
|- C_n.jpg
The others follow the same rule.
|- eval --------------- Test dataset
|- classA ------------- Class A
|- A_0.jpg
|- A_1.jpg
|- ...
|- A_n.jpg
|- classB ------------ Class B
|- B_0.jpg
|- B_1.jpg
|- ...
|- B_n.jpg
|- classC --------------- Class C
|- C_0.jpg
|- C_1.jpg
|- ...
|- C_n.jpg
The others follow the same rule.
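The two layouts above differ only in the extra train/eval level, so an unsplit dataset can be converted with a short script. The following is a minimal sketch, not part of the algorithm itself: the function name split_dataset, the 20% evaluation ratio, and the fixed seed are illustrative assumptions.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src_dir, dst_dir, eval_ratio=0.2, seed=0):
    """Copy an unsplit class-folder dataset into train/ and eval/ subfolders.

    eval_ratio and seed are illustrative choices, not values from the algorithm.
    Returns {class_name: (num_train, num_eval)}.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    rng = random.Random(seed)
    counts = {}
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        images = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(images)
        n_eval = int(len(images) * eval_ratio)
        # Keep at least one evaluation image when a class has more than one image.
        if n_eval == 0 and len(images) > 1:
            n_eval = 1
        subsets = {"eval": images[:n_eval], "train": images[n_eval:]}
        for subset, files in subsets.items():
            out_dir = dst / subset / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy2(f, out_dir / f.name)
        counts[class_dir.name] = (len(images) - n_eval, n_eval)
    return counts
```

Calling split_dataset("raw_data", "split_data") copies the files and leaves the source folder unchanged, so the split can be repeated with a different ratio if needed.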
This is a ResNet model with 50 layers, implemented based on the model structure proposed in the paper Deep Residual Learning for Image Recognition. It can be used for image classification, such as cat-and-dog classification and flower classification. You provide a labeled dataset, and the algorithm loads a model pre-trained on ImageNet-1000 for transfer learning on your data. The trained model can be deployed directly as a real-time service on the ModelArts platform, with CPU or GPU flavors for inference. You can also convert the model to the Ascend format and deploy it on Ascend 310 for inference.
batch_size (images/batch) | FPS (images/second) |
64 | 978 |
128 | 1104 |
256 | 1220 |
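The throughput figures above can be turned into a rough estimate of how long one epoch takes. This is a back-of-the-envelope sketch only: the hardware behind the figures is not stated here, the helper name estimate_epoch is hypothetical, and real training time also includes validation and model saving.

```python
import math

# Throughput figures copied from the table above: batch_size -> images/second.
FPS_BY_BATCH_SIZE = {64: 978, 128: 1104, 256: 1220}


def estimate_epoch(num_images, batch_size):
    """Return (steps_per_epoch, seconds_per_epoch) for one pass over num_images."""
    fps = FPS_BY_BATCH_SIZE[batch_size]
    steps = math.ceil(num_images / batch_size)
    return steps, num_images / fps
```

For example, estimate_epoch(97800, 64) gives 1529 steps and about 100 seconds per epoch at the listed rate of 978 images/second.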
Parameter | Default Value | Type | Mandatory | Description |
task_type | image_classification_v2 | String | Yes | Task type, which cannot be changed |
model_name | resnet_v1_50 | String | Yes | Model name, which cannot be changed |
do_train | True | Bool | Yes | Whether to train, which defaults to True and cannot be changed |
do_eval_along_train | True | Bool | Yes | Whether to perform validation along with training, which defaults to True and cannot be changed |
variable_update | horovod | String | Yes | Parameter update mode, which defaults to horovod and cannot be changed |
learning_rate_strategy | 0.002 | String | Yes | Learning rate strategy for training. A staged value such as 10:0.01,20:0.001 sets the learning rate to 0.01 for the first 10 epochs and to 0.001 for the next 10 epochs. If no epochs are specified, as in the default 0.002, the learning rate is adjusted based on the validation precision, and training stops when the precision no longer improves significantly. The value can be changed. |
batch_size | 64 | int | Yes | Number of images trained in each batch (on a single card), which can be changed |
eval_batch_size | 64 | int | Yes | Number of images validated in each batch (on a single card), which can be changed |
evaluate_every_n_epochs | 1.0 | float | Yes | Validation is performed every n epochs. The value can be changed. |
save_model_secs | 60 | int | Yes | Model saving frequency (unit: s), which can be changed |
save_summary_steps | 10 | int | Yes | Summary saving frequency (unit: step), which can be changed |
log_every_n_steps | 10 | int | Yes | Log printing frequency (unit: step), which can be changed |
do_data_cleaning | True | Bool | No | Whether to perform data cleaning. Incorrectly formatted data causes training to fail, so enable this function for training stability. Cleaning a very large dataset may take a long time; in that case, you can clean the data offline. BMP, JPEG, and PNG images with RGB channels are supported, and JPEG is recommended. This parameter is set to True by default. Delete this parameter to disable it. |
use_fp16 | True | Bool | No | Whether to use mixed precision. Mixed precision accelerates training but may cause some precision loss. Keep it enabled unless strict precision is required. This parameter is set to True by default. Delete this parameter to disable it. |
xla_compile | True | Bool | No | Whether to use XLA for accelerated training. This parameter is set to True by default. Delete this parameter to disable it. |
data_format | NCHW | String | No | Input data format. NHWC indicates channel last, and NCHW indicates channel first. |
best_model | True | Bool | No | Whether to save and use the model with the highest precision instead of the latest model during training. The default value is True, indicating that the optimal model is saved. Among models whose precision is within a certain error range of the highest, the most recent one is saved as the optimal model. |
jpeg_preprocess | True | Bool | No | Whether to use the JPEG preprocessing acceleration operator (only JPEG data is supported) to accelerate data reading and improve performance. This parameter is set to True by default. If the data format is not JPEG, enable the data cleaning function. Delete this parameter to disable it. |
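The staged form of learning_rate_strategy (for example 10:0.01,20:0.001) can be read as a list of cumulative epoch boundaries. The sketch below illustrates that reading only; it is not the algorithm's own parser, and the function names parse_lr_strategy and lr_for_epoch are hypothetical.

```python
def parse_lr_strategy(strategy):
    """Parse 'end_epoch:lr,...' into [(end_epoch, lr), ...]; a bare rate returns a float."""
    text = str(strategy).strip()
    if ":" not in text:
        # No epochs given, e.g. the default "0.002".
        return float(text)
    schedule = []
    for part in text.split(","):
        end_epoch, lr = part.split(":")
        schedule.append((int(end_epoch), float(lr)))
    ends = [end for end, _ in schedule]
    if ends != sorted(set(ends)):
        raise ValueError("epoch boundaries must be strictly increasing")
    return schedule


def lr_for_epoch(schedule, epoch):
    """Return the learning rate for a 0-based epoch, or None once the schedule has ended."""
    for end_epoch, lr in schedule:
        if epoch < end_epoch:
            return lr
    return None
```

With the schedule parsed from "10:0.01,20:0.001", epochs 0 to 9 map to 0.01, epochs 10 to 19 map to 0.001, and epoch 20 falls outside the schedule.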
After training is complete, the output directory is structured as follows:
Training output directory
|- om
|- model
|- index
|- model
|- variables
|- variables.index
|- index
|- config.json
|- saved_model.pb
|- frozen_graph
|- insert_op_conf.cfg
|- model.pb
|- checkpoint
|- model.ckpt-xxx
|- ...
|- best_checkpoint
|- best_model.ckpt-xxx
|- ...
|- events...
|- graph.pbtxt
Retain the default values of other parameters.
To deploy an inference service on CPUs or GPUs, set Meta Model Source to Training job and select a training job and version.
Note: In the inference configuration file model/config.json, the CPU inference image runtime:tf1.xx-python3.x-cpu is used by default.
To use GPU inference, change the runtime field to tf1.xx-python3.x-gpu in the model/config.json file before importing the model.
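The runtime change can also be scripted before importing the model. This is a minimal sketch that assumes runtime is a top-level key in config.json and that the CPU and GPU image names differ only in the -cpu/-gpu suffix, as in the names above; the function name use_gpu_runtime is hypothetical.

```python
import json
from pathlib import Path


def use_gpu_runtime(config_path):
    """Rewrite the runtime field of a config.json from the -cpu image to the -gpu image."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    runtime = config["runtime"]  # assumed to be a top-level key
    if runtime.endswith("-cpu"):
        config["runtime"] = runtime[: -len("-cpu")] + "-gpu"
        path.write_text(
            json.dumps(config, indent=4, ensure_ascii=False), encoding="utf-8"
        )
    return config["runtime"]
```

Run it against the model/config.json file in the training output; a file whose runtime does not end in -cpu is left unchanged.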
You can use assets on the HUAWEI CLOUD ModelArts management console but cannot download them.
Version | Version ID | Published At | Status | Description | Constraints |
10.0.0 | I3DkY4 | 2022-06-10 06:33 | Done | -- |