• [Technical Article] Deploying Qwen2.5-VL-7B on Ascend 310P for Smoking Action Recognition
    Deploying Qwen2.5-VL-7B on Ascend 310P for Smoking Action Recognition
    OrangePi AI Studio Pro is a new-generation high-performance inference card built around two Ascend 310P processors. It provides base general-purpose compute plus strong AI compute and integrates the complete low-level software stack for both training and inference, so one device handles both. Its FP16 half-precision AI throughput is roughly 176 TFLOPS, and its Int8 integer throughput reaches 352 TOPS. This article walks through deploying the Qwen2.5-VL-7B multimodal understanding model on the Ascend 310P to recognize smoking actions.
    1. Environment Setup
    We deploy MindIE in a Docker container on the OrangePi AI Studio:
    docker pull swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.1.RC1-300I-Duo-py311-openeuler24.03-lts
    root@orangepi:~# docker images
    REPOSITORY TAG IMAGE ID CREATED SIZE
    swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie 2.1.RC1-300I-Duo-py311-openeuler24.03-lts 0574b8d4403f 3 months ago 20.4GB
    langgenius/dify-web 1.0.1 b2b7363571c2 8 months ago 475MB
    langgenius/dify-api 1.0.1 3dd892f50a2d 8 months ago 2.14GB
    langgenius/dify-plugin-daemon 0.0.4-local 3f180f39bfbe 8 months ago 1.35GB
    ubuntu/squid latest dae40da440fe 8 months ago 243MB
    postgres 15-alpine afbf3abf6aeb 8 months ago 273MB
    nginx latest b52e0b094bc0 9 months ago 192MB
    swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie 1.0.0-300I-Duo-py311-openeuler24.03-lts 74a5b9615370 10 months ago 17.5GB
    redis 6-alpine 6dd588768b9b 10 months ago 30.2MB
    langgenius/dify-sandbox 0.2.10 4328059557e8 13 months ago 567MB
    semitechnologies/weaviate 1.19.0 8ec9f084ab23 2 years ago 52.5MB
    Next, create a startup script named start-docker.sh with the following content:
    NAME=$1
    if [ $# -ne 1 ]; then
        echo "warning: need input container name.Use default: mindie"
        NAME=mindie
    fi
    docker run --name ${NAME} -it -d --net=host --shm-size=500g \
        --privileged=true \
        -w /usr/local/Ascend/atb-models \
        --device=/dev/davinci_manager \
        --device=/dev/hisi_hdc \
        --device=/dev/devmm_svm \
        --entrypoint=bash \
        -v /models:/models \
        -v /data:/data \
        -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
        -v /usr/local/dcmi:/usr/local/dcmi \
        -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
        -v /usr/local/sbin:/usr/local/sbin \
        -v /home:/home \
        -v /tmp:/tmp \
        -v /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime \
        -e http_proxy=$http_proxy \
        -e https_proxy=$https_proxy \
        -e "PATH=/usr/local/python3.11.6/bin:$PATH" \
        swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.1.RC1-300I-Duo-py311-openeuler24.03-lts
    bash start-docker.sh
    After the container starts, we need to replace a few files and install the Ascend-cann-nnal package:
    root@orangepi:~# docker exec -it mindie bash
    Welcome to 5.15.0-126-generic
    System information as of time: Sat Nov 15 22:06:48 CST 2025
    System load: 1.87 Memory used: 6.3% Swap used: 0.0% Usage On: 33% Users online: 0
    [root@orangepi atb-models]# cd /usr/local/Ascend/ascend-toolkit/8.2.RC1/lib64/
    [root@orangepi lib64]# ls /data/fix_openeuler_docker/fixhccl/8.2hccl/
    libhccl.so libhccl_alg.so libhccl_heterog.so libhccl_plf.so
    [root@orangepi lib64]# cp /data/fix_openeuler_docker/fixhccl/8.2hccl/* ./
    cp: overwrite './libhccl.so'? cp: overwrite './libhccl_alg.so'? cp: overwrite './libhccl_heterog.so'? cp: overwrite './libhccl_plf.so'?
    [root@orangepi lib64]# source /usr/local/Ascend/ascend-toolkit/set_env.sh
    [root@orangepi lib64]# chmod +x /data/fix_openeuler_docker/Ascend-cann-nnal/Ascend-cann-nnal_8.3.RC1_linux-x86_64.run
    [root@orangepi lib64]# /data/fix_openeuler_docker/Ascend-cann-nnal/Ascend-cann-nnal_8.3.RC1_linux-x86_64.run --install --quiet
    [NNAL] [20251115-22:41:45] [INFO] LogFile:/var/log/ascend_seclog/ascend_nnal_install.log
    [NNAL] [20251115-22:41:45] [INFO] Ascend-cann-atb_8.3.RC1_linux-x86_64.run --install --install-path=/usr/local/Ascend/nnal --install-for-all --quiet --nox11 start
    WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
    [NNAL] [20251115-22:41:58] [INFO] Ascend-cann-atb_8.3.RC1_linux-x86_64.run --install --install-path=/usr/local/Ascend/nnal --install-for-all --quiet --nox11 install success
    [NNAL] [20251115-22:41:58] [INFO] Ascend-cann-SIP_8.3.RC1_linux-x86_64.run --install --install-path=/usr/local/Ascend/nnal --install-for-all --quiet --nox11 start
    [NNAL] [20251115-22:41:59] [INFO] Ascend-cann-SIP_8.3.RC1_linux-x86_64.run --install --install-path=/usr/local/Ascend/nnal --install-for-all --quiet --nox11 install success
    [NNAL] [20251115-22:41:59] [INFO] Ascend-cann-nnal_8.3.RC1_linux-x86_64.run install success
    Warning!!! If the environment variables of atb and asdsip are set at the same time, unexpected consequences will occur. Import the corresponding environment variables based on the usage scenarios: atb for large model scenarios, asdsip for embedded scenarios. Please make sure that the environment variables have been configured.
    If you want to use atb module: - To take effect for current user, you can exec command below: source /usr/local/Ascend/nnal/atb/set_env.sh or add "source /usr/local/Ascend/nnal/atb/set_env.sh" to ~/.bashrc.
    If you want to use asdsip module: - To take effect for current user, you can exec command below: source /usr/local/Ascend/nnal/asdsip/set_env.sh or add "source /usr/local/Ascend/nnal/asdsip/set_env.sh" to ~/.bashrc.
    [root@orangepi lib64]# cat /usr/local/Ascend/nnal/atb/latest/version.info
    Ascend-cann-atb : 8.3.RC1
    Ascend-cann-atb Version : 8.3.RC1.B106
    Platform : x86_64
    branch : 8.3.rc1-0702
    commit id : 16004f23040e0dcdd3cf0c64ecf36622487038ba
    Set the logical NPU cores used for inference to 0 and 1, then test the multimodal understanding model Qwen2.5-VL-7B-Instruct. The results show that Qwen2.5-VL-7B-Instruct averages about 20 output tokens per second when running on 2 x Ascend 310P, while correctly identifying the people and the actions in the image.
    [root@orangepi atb-models]# bash examples/models/qwen2_vl/run_pa.sh --model_path /models/Qwen2.5-VL-7B-Instruct/ --input_image /root/pic/test.jpg
    [2025-11-15 22:12:49,663] torch.distributed.run: [WARNING]
    [2025-11-15 22:12:49,663] torch.distributed.run: [WARNING] *****************************************
    [2025-11-15 22:12:49,663] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
    [2025-11-15 22:12:49,663] torch.distributed.run: [WARNING] *****************************************
    /usr/local/lib64/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source? warn(
    /usr/local/lib64/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn( 2025-11-15 22:12:53.250 7934 LLM log default format: [yyyy-mm-dd hh:mm:ss.uuuuuu] [processid] [threadid] [llmmodels] [loglevel] [file:line] [status code] msg 2025-11-15 22:12:53.250 7933 LLM log default format: [yyyy-mm-dd hh:mm:ss.uuuuuu] [processid] [threadid] [llmmodels] [loglevel] [file:line] [status code] msg [2025-11-15 22:12:53.250] [7934] [139886327420160] [llmmodels] [WARN] [model_factory.cpp:28] deepseekV2_DecoderModel model already exists, but the duplication doesn't matter. [2025-11-15 22:12:53.250] [7933] [139649439929600] [llmmodels] [WARN] [model_factory.cpp:28] deepseekV2_DecoderModel model already exists, but the duplication doesn't matter. [2025-11-15 22:12:53.250] [7934] [139886327420160] [llmmodels] [WARN] [model_factory.cpp:28] deepseekV2_DecoderModel model already exists, but the duplication doesn't matter. [2025-11-15 22:12:53.250] [7933] [139649439929600] [llmmodels] [WARN] [model_factory.cpp:28] deepseekV2_DecoderModel model already exists, but the duplication doesn't matter. [2025-11-15 22:12:53.250] [7934] [139886327420160] [llmmodels] [WARN] [model_factory.cpp:28] llama_LlamaDecoderModel model already exists, but the duplication doesn't matter. [2025-11-15 22:12:53.250] [7933] [139649439929600] [llmmodels] [WARN] [model_factory.cpp:28] llama_LlamaDecoderModel model already exists, but the duplication doesn't matter. [2025-11-15 22:12:55,335] [7934] [139886327420160] [llmmodels] [INFO] [cpu_binding.py-254] : rank_id: 1, device_id: 1, numa_id: 0, shard_devices: [0, 1], cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] [2025-11-15 22:12:55,336] [7934] [139886327420160] [llmmodels] [INFO] [cpu_binding.py-280] : process 7934, new_affinity is [8, 9, 10, 11, 12, 13, 14, 15], cpu count 8 [2025-11-15 22:12:55,356] [7933] [139649439929600] [llmmodels] [INFO] [cpu_binding.py-254] : rank_id: 0, device_id: 0, numa_id: 0, shard_devices: [0, 1], cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] [2025-11-15 22:12:55,357] [7933] [139649439929600] [llmmodels] [INFO] [cpu_binding.py-280] : process 7933, new_affinity is [0, 1, 2, 3, 4, 5, 6, 7], cpu count 8 [2025-11-15 22:12:56,032] [7933] [139649439929600] [llmmodels] [INFO] [model_runner.py-156] : model_runner.quantize: None, model_runner.kv_quant_type: None, model_runner.fa_quant_type: None, model_runner.dtype: torch.float16 [2025-11-15 22:13:01,826] [7933] [139649439929600] [llmmodels] [INFO] [dist.py-81] : initialize_distributed has been Set [2025-11-15 22:13:01,827] [7933] [139649439929600] [llmmodels] [INFO] [model_runner.py-187] : init tokenizer done Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`. [2025-11-15 22:13:02,070] [7934] [139886327420160] [llmmodels] [INFO] [dist.py-81] : initialize_distributed has been Set Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`. [W InferFormat.cpp:62] Warning: Cannot create tensor with NZ format while dim < 2, tensor will be created with ND format. 
(function operator()) [W InferFormat.cpp:62] Warning: Cannot create tensor with NZ format while dim < 2, tensor will be created with ND format. (function operator()) [2025-11-15 22:13:08,435] [7933] [139649439929600] [llmmodels] [INFO] [flash_causal_qwen2.py-153] : >>>> qwen_QwenDecoderModel is called. [2025-11-15 22:13:08,526] [7934] [139886327420160] [llmmodels] [INFO] [flash_causal_qwen2.py-153] : >>>> qwen_QwenDecoderModel is called. [2025-11-15 22:13:16.666] [7933] [139649439929600] [llmmodels] [WARN] [operation_factory.cpp:42] OperationName: TransdataOperation not find in operation factory map [2025-11-15 22:13:16.698] [7934] [139886327420160] [llmmodels] [WARN] [operation_factory.cpp:42] OperationName: TransdataOperation not find in operation factory map [2025-11-15 22:13:22,379] [7933] [139649439929600] [llmmodels] [INFO] [model_runner.py-282] : model: FlashQwen2vlForCausalLM( (rotary_embedding): PositionRotaryEmbedding() (attn_mask): AttentionMask() (vision_tower): Qwen25VisionTransformerPretrainedModelATB( (encoder): Qwen25VLVisionEncoderATB( (layers): ModuleList( (0-31): 32 x Qwen25VLVisionLayerATB( (attn): VisionAttention( (qkv): TensorParallelColumnLinear( (linear): FastLinear() ) (proj): TensorParallelRowLinear( (linear): FastLinear() ) ) (mlp): VisionMlp( (gate_up_proj): TensorParallelColumnLinear( (linear): FastLinear() ) (down_proj): TensorParallelRowLinear( (linear): FastLinear() ) ) (norm1): BaseRMSNorm() (norm2): BaseRMSNorm() ) ) (patch_embed): FastPatchEmbed( (proj): TensorReplicatedLinear( (linear): FastLinear() ) ) (patch_merger): PatchMerger( (patch_merger_mlp_0): TensorParallelColumnLinear( (linear): FastLinear() ) (patch_merger_mlp_2): TensorParallelRowLinear( (linear): FastLinear() ) (patch_merger_ln_q): BaseRMSNorm() ) ) (rotary_pos_emb): VisionRotaryEmbedding() ) (language_model): FlashQwen2UsingMROPEForCausalLM( (rotary_embedding): PositionRotaryEmbedding() (attn_mask): AttentionMask() (transformer): FlashQwenModel( (wte): TensorEmbeddingWithoutChecking() (h): ModuleList( (0-27): 28 x FlashQwenLayer( (attn): FlashQwenAttention( (rotary_emb): PositionRotaryEmbedding() (c_attn): TensorParallelColumnLinear( (linear): FastLinear() ) (c_proj): TensorParallelRowLinear( (linear): FastLinear() ) ) (mlp): QwenMLP( (act): SiLU() (w2_w1): TensorParallelColumnLinear( (linear): FastLinear() ) (c_proj): TensorParallelRowLinear( (linear): FastLinear() ) ) (ln_1): QwenRMSNorm() (ln_2): QwenRMSNorm() ) ) (ln_f): QwenRMSNorm() ) (lm_head): TensorParallelHead( (linear): FastLinear() ) ) ) [2025-11-15 22:13:24,268] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-134] : hbm_capacity(GB): 87.5078125, init_memory(GB): 11.376015624962747 [2025-11-15 22:13:24,789] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-342] : pa_runner: PARunner(model_path=/models/Qwen2.5-VL-7B-Instruct/, input_text=请用超过500个字详细说明图片的内容,并仔细判断画面中的人物是否有吸烟动作。, max_position_embeddings=None, max_input_length=16384, max_output_length=1024, max_prefill_tokens=-1, load_tokenizer=True, enable_atb_torch=False, max_prefill_batch_size=None, max_batch_size=1, dtype=torch.float16, block_size=128, model_config=ModelConfig(num_heads=14, num_kv_heads=2, num_kv_heads_origin=4, head_size=128, k_head_size=128, v_head_size=128, num_layers=28, device=npu:0, dtype=torch.float16, soc_info=NPUSocInfo(soc_name='', soc_version=200, need_nz=True, matmul_nd_nz=False), kv_quant_type=None, fa_quant_type=None, mapping=Mapping(world_size=2, rank=0, num_nodes=1,pp_rank=0, pp_groups=[[0], [1]], micro_batch_size=1, 
attn_dp_groups=[[0], [1]], attn_tp_groups=[[0, 1]], attn_inner_sp_groups=[[0], [1]], attn_cp_groups=[[0], [1]], attn_o_proj_tp_groups=[[0], [1]], mlp_tp_groups=[[0, 1]], moe_ep_groups=[[0], [1]], moe_tp_groups=[[0, 1]]), cla_share_factor=1, model_type=qwen2_5_vl, enable_nz=False), max_memory=93960798208, [2025-11-15 22:13:24,794] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-122] : ---------------Begin warm_up--------------- [2025-11-15 22:13:24,794] [7933] [139649439929600] [llmmodels] [INFO] [cache.py-154] : kv cache will allocate 0.46484375GB memory [2025-11-15 22:13:24,821] [7934] [139886327420160] [llmmodels] [INFO] [cache.py-154] : kv cache will allocate 0.46484375GB memory [2025-11-15 22:13:24,827] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1139] : ------total req num: 1, infer start-------- [2025-11-15 22:13:26,002] [7934] [139886327420160] [llmmodels] [INFO] [flash_causal_qwen2.py-680] : <<<<<<<after transdata k_caches[0].shape=torch.Size([136, 16, 128, 16]) [2025-11-15 22:13:26,023] [7933] [139649439929600] [llmmodels] [INFO] [flash_causal_qwen2.py-676] : <<<<<<< ori k_caches[0].shape=torch.Size([136, 16, 128, 16]) [2025-11-15 22:13:26,023] [7933] [139649439929600] [llmmodels] [INFO] [flash_causal_qwen2.py-680] : <<<<<<<after transdata k_caches[0].shape=torch.Size([136, 16, 128, 16]) [2025-11-15 22:13:26,024] [7933] [139649439929600] [llmmodels] [INFO] [flash_causal_qwen2.py-705] : >>>>>>id of kcache is 139645634198608 id of vcache is 139645634198320 [2025-11-15 22:13:34,363] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1294] : Prefill time: 9476.590633392334ms, Prefill average time: 9476.590633392334ms, Decode token time: 54.94809150695801ms, E2E time: 9531.538724899292ms [2025-11-15 22:13:34,363] [7934] [139886327420160] [llmmodels] [INFO] [generate.py-1294] : Prefill time: 9452.020645141602ms, Prefill average time: 9452.020645141602ms, Decode token time: 54.654598236083984ms, E2E time: 9506.675243377686ms [2025-11-15 22:13:34,366] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1326] : -------------------performance dumped------------------------ [2025-11-15 22:13:34,371] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1329] : | batch_size | input_seq_len | output_seq_len | e2e_time(ms) | prefill_time(ms) | decoder_token_time(ms) | prefill_count | prefill_average_time(ms) | |-------------:|----------------:|-----------------:|---------------:|-------------------:|-------------------------:|----------------:|---------------------------:| | 1 | 16384 | 2 | 9531.54 | 9476.59 | 54.95 | 1 | 9476.59 | /usr/local/lib64/python3.11/site-packages/torchvision/transforms/functional.py:1603: UserWarning: The default value of the antialias parameter of all the resizing transforms (Resize(), RandomResizedCrop(), etc.) will change from None to True in v0.17, in order to be consistent across the PIL and Tensor backends. To suppress this warning, directly pass antialias=True (recommended, future default), antialias=None (current default, which means False for Tensors and True for PIL), or antialias=False (only works on Tensors - PIL will still use antialiasing). This also applies if you are using the inference transforms from the models weights: update the call to weights.transforms(antialias=True). 
warnings.warn(
[2025-11-15 22:13:35,307] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-148] : warmup_memory(GB): 15.75
[2025-11-15 22:13:35,307] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-153] : ---------------End warm_up---------------
/usr/local/lib64/python3.11/site-packages/torchvision/transforms/functional.py:1603: UserWarning: The default value of the antialias parameter of all the resizing transforms (Resize(), RandomResizedCrop(), etc.) will change from None to True in v0.17, in order to be consistent across the PIL and Tensor backends. To suppress this warning, directly pass antialias=True (recommended, future default), antialias=None (current default, which means False for Tensors and True for PIL), or antialias=False (only works on Tensors - PIL will still use antialiasing). This also applies if you are using the inference transforms from the models weights: update the call to weights.transforms(antialias=True). warnings.warn(
[2025-11-15 22:13:35,363] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1139] : ------total req num: 1, infer start--------
[2025-11-15 22:13:50,021] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1294] : Prefill time: 1004.0028095245361ms, Prefill average time: 1004.0028095245361ms, Decode token time: 13.301290491575836ms, E2E time: 14611.222982406616ms
[2025-11-15 22:13:50,021] [7934] [139886327420160] [llmmodels] [INFO] [generate.py-1294] : Prefill time: 1067.9974555969238ms, Prefill average time: 1067.9974555969238ms, Decode token time: 13.300292536193908ms, E2E time: 14674.196720123291ms
[2025-11-15 22:13:50,025] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1326] : -------------------performance dumped------------------------
[2025-11-15 22:13:50,028] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1329] :
| batch_size | input_seq_len | output_seq_len | e2e_time(ms) | prefill_time(ms) | decoder_token_time(ms) | prefill_count | prefill_average_time(ms) |
|-------------:|----------------:|-----------------:|---------------:|-------------------:|-------------------------:|----------------:|---------------------------:|
| 1 | 1675 | 1024 | 14611.2 | 1004 | 13.3 | 1 | 1004 |
[2025-11-15 22:13:50,035] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-385] : Question[0]: [{'image': '/root/pic/test.jpg'}, {'text': '请用超过500个字详细说明图片的内容,并仔细判断画面中的人物是否有吸烟动作。'}]
[2025-11-15 22:13:50,035] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-386] : Answer[0]: 这张图片展示了一个无人机航拍的场景,画面中可以看到两名工人站在一个雪地或冰面上。他们穿着橙色的安全背心和红色的安全帽,显得非常醒目。背景中可以看到一些雪地和一些金属结构,可能是桥梁或工业设施的一部分。 从图片的细节来看,画面右侧的工人右手放在嘴边,似乎在吸烟。他的姿势和动作与吸烟者的典型姿势相符。然而,由于图片的分辨率和角度限制,无法完全确定这个动作是否真实发生。如果要准确判断,可能需要更多的视频片段或更清晰的图像。 从无人机航拍的角度来看,这个场景可能是在进行某种工业或建筑项目的检查或监控。两名工人可能正在进行现场检查或讨论工作事宜。雪地和金属结构表明这可能是一个寒冷的冬季,或者是一个寒冷的气候区域。 无人机航拍技术在工业和建筑领域中非常常见,因为它可以提供高空视角,帮助工程师和管理人员更好地了解现场情况。这种技术不仅可以节省时间和成本,还可以提高工作效率和安全性。在进行航拍时,确保遵守当地的法律法规和安全规定是非常重要的。 总的来说,这张图片展示了一个无人机航拍的场景,画面中两名工人站在雪地上,其中一人似乎在吸烟。虽然无法完全确定这个动作是否真实发生,但根据他们的姿势和动作,可以合理推测这个动作的存在。
[2025-11-15 22:13:50,035] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-387] : Generate[0] token num: 282
[2025-11-15 22:13:50,035] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-389] : Latency(s): 14.721353530883789
[2025-11-15 22:13:50,035] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-390] : Throughput(tokens/s): 19.15584728050956
    This article has covered the complete process of deploying the MindIE environment in a Docker container on OrangePi AI Studio and running the Qwen2.5-VL-7B-Instruct multimodal model to recognize smoking actions, demonstrating that multimodal understanding models can run reliably on Ascend 310P devices.
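    If you want to run the same prompt over a whole folder of images instead of a single test picture, the run_pa.sh entry point used above can be driven from a small wrapper. The following is only a hedged sketch: the image folder, the output file, and the subprocess wrapper are assumptions, while the script path and its --model_path / --input_image flags come from the log above.
    # Hedged sketch: invoke the run_pa.sh example once per image and collect the output.
    import subprocess
    from pathlib import Path

    MODEL = "/models/Qwen2.5-VL-7B-Instruct/"
    IMAGES = Path("/root/pic")            # assumed folder of test images
    OUT = Path("/root/pic/results.txt")   # assumed output file

    with OUT.open("w", encoding="utf-8") as out:
        for img in sorted(IMAGES.glob("*.jpg")):
            # each call mirrors the command shown in the log above
            proc = subprocess.run(
                ["bash", "examples/models/qwen2_vl/run_pa.sh",
                 "--model_path", MODEL, "--input_image", str(img)],
                cwd="/usr/local/Ascend/atb-models",  # the container's working directory from start-docker.sh
                capture_output=True, text=True,
            )
            out.write(f"===== {img.name} =====\n{proc.stdout}\n")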
  • [Discussion] Downloading MindSpore
     Hi everyone, I'm a beginner. I ran into this problem while downloading MindSpore and would like to ask what I should do next. Many thanks to the more experienced members for taking the time to answer my question.
  • [Technical Article] Training and Validating a YOLOv8 Model with MindYOLO on OrangePi AI Studio Pro
    Training and Validating a YOLOv8 Model with MindYOLO on OrangePi AI Studio Pro
    OrangePi AI Studio Pro is a new-generation high-performance inference card built around two Ascend 310P processors. It provides base general-purpose compute plus strong AI compute and integrates the complete low-level software stack for both training and inference, so one device handles both. Its FP16 half-precision AI throughput is roughly 176 TFLOPS, and its Int8 integer throughput reaches 352 TOPS. This chapter describes how to train and validate a YOLOv8 model with mindyolo on the Ascend 310P.
    1. Environment Preparation
    First check the Ascend 310P NPU driver by running npu-smi info on the command line; it shows the AICore utilization and memory usage of the two Ascend 310P chips.
    +--------------------------------------------------------------------------------------------------------+
    | npu-smi v1.0                                      Version: 24.1.rc4.b999                                |
    +-------------------------------+-----------------+------------------------------------------------------+
    | NPU Name                      | Health          | Power(W) Temp(C) Hugepages-Usage(page)               |
    | Chip Device                   | Bus-Id          | AICore(%) Memory-Usage(MB)                           |
    +===============================+=================+======================================================+
    | 30208 310P1                   | OK              | NA 41 0 / 0                                          |
    | 0 0                           | 0000:77:00.0    | 0 1416 / 89608                                       |
    +-------------------------------+-----------------+------------------------------------------------------+
    | 30208 310P1                   | OK              | NA 40 0 / 0                                          |
    | 1 1                           | 0000:77:00.0    | 0 1622 / 89085                                       |
    +===============================+=================+======================================================+
    +-------------------------------+-----------------+------------------------------------------------------+
    | NPU Chip                      | Process id      | Process name            | Process memory(MB)         |
    +===============================+=================+======================================================+
    | No running processes found in NPU 30208                                                                 |
    +===============================+=================+======================================================+
    Then upgrade CANN and update MindSpore; you can refer to my other article, "How to upgrade CANN, PyTorch, and MindSpore on OrangePi AI Studio Pro". After the upgrade, verify the MindSpore installation; the version I use is 2.7.0.
    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    python3 -c "import mindspore;mindspore.set_context(device_target='Ascend');mindspore.run_check()"
    [WARNING] ME(1621400:139701939115840,MainProcess):2025-09-24-10:46:21.978.000 [mindspore/context.py:1412] For 'context.set_context', the parameter 'device_target' will be deprecated and removed in a future version. Please use the api mindspore.set_device() instead.
    MindSpore version: 2.7.0
    [WARNING] GE_ADPT(1621400,7f0e18710640,python3):2025-09-24-10:46:23.323.570 [mindspore/ops/kernel/ascend/acl_ir/op_api_exec.cc:169] GetAscendDefaultCustomPath] Checking whether the so exists or if permission to access it is available: /usr/local/Ascend/ascend-toolkit/latest/opp/vendors/customize_vision/op_api/lib/libcust_opapi.so
    The result of multiplication calculation is correct, MindSpore has been installed on platform [Ascend] successfully!
    Clone the mindyolo repository. For training and validation we use VisDrone-Dataset, the drone-vision challenge dataset released by Tianjin University.
    git clone https://github.com/mindspore-lab/mindyolo.git
    正克隆到 'mindyolo'...
    remote: Enumerating objects: 3505, done.
    remote: Counting objects: 100% (157/157), done.
    remote: Compressing objects: 100% (69/69), done.
    remote: Total 3505 (delta 114), reused 88 (delta 88), pack-reused 3348 (from 2)
    接收对象中: 100% (3505/3505), 6.74 MiB | 8.91 MiB/s, 完成.
    处理 delta 中: 100% (2048/2048), 完成.
    First convert the downloaded dataset into YOLO format. Detailed conversion tutorials are publicly available online, and a minimal conversion sketch is also included right after this section. The converted visdrone dataset is organized as follows:
    visdrone
    ├── train
    │   ├── images
    │   │   ├── 000001.jpg
    │   │   ├── 000002.jpg
    │   │   ├── ...
    │   │   └── ...
    │   └── labels
    │       ├── 000001.txt
    │       ├── 000002.txt
    │       ├── ...
    │       └── ...
    └── val
        ├── images
        │   ├── 000001.jpg
        │   ├── 000002.jpg
        │   ├── ...
        │   └── ...
        └── labels
            ├── 000001.txt
            ├── 000002.txt
            ├── ...
            └── ...
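    Since the VisDrone-to-YOLO conversion is only referenced above, here is a minimal, hedged sketch of one way it could be done. It assumes the official VisDrone-DET layout (<split>/images/*.jpg plus <split>/annotations/*.txt, where each line is bbox_left,bbox_top,bbox_width,bbox_height,score,object_category,truncation,occlusion), keeps all 12 categories including "ignored regions" to match the class list used later, and uses Pillow to read image sizes; none of these choices come from the original article.
    # Hedged sketch: convert raw VisDrone annotations into normalized YOLO label files.
    from pathlib import Path
    from PIL import Image

    def visdrone_to_yolo(split_dir: str) -> None:
        split = Path(split_dir)
        label_dir = split / "labels"
        label_dir.mkdir(exist_ok=True)
        for ann_file in sorted((split / "annotations").glob("*.txt")):
            img_path = split / "images" / (ann_file.stem + ".jpg")
            if not img_path.exists():
                continue
            w, h = Image.open(img_path).size
            lines = []
            for row in ann_file.read_text().splitlines():
                parts = row.strip().split(",")
                if len(parts) < 6:
                    continue
                x, y, bw, bh = map(float, parts[:4])
                cls = int(parts[5])  # VisDrone category id, kept as the YOLO class id here
                # convert top-left corner + size to normalized center + size
                cx = (x + bw / 2) / w
                cy = (y + bh / 2) / h
                lines.append(f"{cls} {cx:.6f} {cy:.6f} {bw / w:.6f} {bh / h:.6f}")
            (label_dir / ann_file.name).write_text("\n".join(lines) + "\n")

    if __name__ == "__main__":
        for s in ("train", "val"):
            visdrone_to_yolo(f"visdrone/{s}")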
    2. Data Format Conversion
    The train stage in mindyolo consumes YOLO-format data, while the eval stage uses COCO-style JSON files, so we additionally need the COCO-format annotation files instances_train2017.json and instances_val2017.json plus train.txt and val.txt. After conversion the visdrone dataset contains the following:
    visdrone_COCO_format
    ├── train.txt
    ├── val.txt
    ├── train
    │   ├── images
    │   │   ├── 000001.jpg
    │   │   ├── 000002.jpg
    │   │   ├── ...
    │   │   └── ...
    │   └── labels
    │       ├── 000001.txt
    │       ├── 000002.txt
    │       ├── ...
    │       └── ...
    ├── annotations
    │   ├── instances_train2017.json
    │   └── instances_val2017.json
    └── val
        ├── images
        │   ├── 000001.jpg
        │   ├── 000002.jpg
        │   ├── ...
        │   └── ...
        └── labels
            ├── 000001.txt
            ├── 000002.txt
            ├── ...
            └── ...
    We first convert the YOLO-format dataset to COCO format by adding a yolov5_yaml_to_coco.py script inside mindyolo; the code is as follows:
    # -*- encoding: utf-8 -*-
    # @Author: SWHL
    # @Contact: liekkaskono@163.com
    import argparse
    import glob
    import json
    import os
    import shutil
    import time
    from pathlib import Path

    import cv2
    import yaml
    from tqdm import tqdm


    def read_txt(txt_path):
        with open(str(txt_path), "r", encoding="utf-8") as f:
            data = list(map(lambda x: x.rstrip("\n"), f))
        return data


    def mkdir(dir_path):
        Path(dir_path).mkdir(parents=True, exist_ok=True)


    def verify_exists(file_path):
        file_path = Path(file_path).resolve()
        if not file_path.exists():
            raise FileNotFoundError(f"The {file_path} is not exists!!!")


    class YOLOV5CFG2COCO:
        def __init__(self, yaml_path):
            verify_exists(yaml_path)
            with open(yaml_path, "r", encoding="UTF-8") as f:
                self.data_cfg = yaml.safe_load(f)

            self.root_dir = Path(yaml_path).parent.parent
            self.root_data_dir = Path(self.data_cfg.get("path"))

            self.train_path = self._get_data_dir("train")
            self.val_path = self._get_data_dir("val")

            nc = self.data_cfg["nc"]
            if "names" in self.data_cfg:
                self.names = self.data_cfg.get("names")
            else:
                # assign class names if missing
                self.names = [f"class{i}" for i in range(self.data_cfg["nc"])]
            assert (
                len(self.names) == nc
            ), f"{len(self.names)} names found for nc={nc} dataset in {yaml_path}"

            # build the COCO-format directory layout
            self.dst = self.root_dir / f"{Path(self.root_data_dir).stem}_COCO_format"
            self.coco_train = "train/images"
            self.coco_val = "val/images"
            self.coco_annotation = "annotations"
            self.coco_train_json = (
                self.dst / self.coco_annotation / f"instances_train2017.json"
            )
            self.coco_val_json = (
                self.dst / self.coco_annotation / f"instances_val2017.json"
            )

            mkdir(self.dst)
            mkdir(self.dst / self.coco_train)
            mkdir(self.dst / self.coco_val)
            mkdir(self.dst / self.coco_annotation)

            # build the JSON content structure
            self.type = "instances"
            self.categories = []
            self._get_category()
            self.annotation_id = 1

            cur_year = time.strftime("%Y", time.localtime(time.time()))
            self.info = {
                "year": int(cur_year),
                "version": "1.0",
                "description": "For object detection",
                "date_created": cur_year,
            }
            self.licenses = [
                {
                    "id": 1,
                    "name": "Apache License v2.0",
                    "url": "https://choosealicense.com/licenses/apache-2.0/",
                }
            ]

        def _get_data_dir(self, mode):
            data_dir = self.data_cfg.get(mode)
            if data_dir:
                if isinstance(data_dir, str):
                    full_path = [str(self.root_data_dir / data_dir)]
                elif isinstance(data_dir, list):
                    full_path = [str(self.root_data_dir / one_dir) for one_dir in data_dir]
                else:
                    raise TypeError(f"{data_dir} is not str or list.")
            else:
                raise ValueError(f"{mode} dir is not in the yaml.")
            return full_path

        def _get_category(self):
            for i, category in enumerate(self.names, start=1):
                self.categories.append(
                    {
                        "supercategory": category,
                        "id": i,
                        "name": category,
                    }
                )

        def generate(self):
            self.train_files = self.get_files(self.train_path)
            self.valid_files = self.get_files(self.val_path)

            train_dest_dir = Path(self.dst) / self.coco_train
            self.gen_dataset(
                self.train_files, train_dest_dir, self.coco_train_json, mode="train"
            )

            val_dest_dir = Path(self.dst) / self.coco_val
            self.gen_dataset(self.valid_files, val_dest_dir, self.coco_val_json, mode="val")

            print(f"The output directory is: {self.dst}")

        def get_files(self, path):
            IMG_FORMATS = ["bmp", "dng", "jpeg", "jpg", "mpo", "png", "tif", "tiff", "webp"]
            f = []
            for p in path:
                p = Path(p)
                if p.is_dir():
                    f += glob.glob(str(p / "**" / "*.*"), recursive=True)
                elif p.is_file():  # file
                    with open(p, "r", encoding="utf-8") as t:
                        t = t.read().strip().splitlines()
                        parent = str(p.parent) + os.sep
                        f += [
                            x.replace("./", parent) if x.startswith("./") else x for x in t
                        ]
                else:
                    raise FileExistsError(f"{p} does not exist")
            im_files = sorted(
                x.replace("/", os.sep) for x in f if x.split(".")[-1].lower() in IMG_FORMATS
            )
            return im_files

        def gen_dataset(self, img_paths, target_img_path, target_json, mode):
            """
            https://cocodataset.org/#format-data
            """
            images = []
            annotations = []
            sa, sb = (
                os.sep + "images" + os.sep,
                os.sep + "labels" + os.sep,
            )  # /images/, /labels/ substrings
            for img_id, img_path in enumerate(tqdm(img_paths, desc=mode), 1):
                label_path = sb.join(img_path.rsplit(sa, 1)).rsplit(".", 1)[0] + ".txt"

                img_path = Path(img_path)
                verify_exists(img_path)
                imgsrc = cv2.imread(str(img_path))
                height, width = imgsrc.shape[:2]

                dest_file_name = f"{img_id:012d}.jpg"
                save_img_path = target_img_path / dest_file_name

                if img_path.suffix.lower() == ".jpg":
                    shutil.copyfile(img_path, save_img_path)
                else:
                    cv2.imwrite(str(save_img_path), imgsrc)

                images.append(
                    {
                        "date_captured": "2021",
                        "file_name": dest_file_name,
                        "id": img_id,
                        "height": height,
                        "width": width,
                    }
                )

                if Path(label_path).exists():
                    new_anno = self.read_annotation(label_path, img_id, height, width)
                    if len(new_anno) > 0:
                        annotations.extend(new_anno)
                    else:
                        raise ValueError(f"{label_path} is empty")
                else:
                    raise FileNotFoundError(f"{label_path} not exists")

            json_data = {
                "info": self.info,
                "images": images,
                "licenses": self.licenses,
                "type": self.type,
                "annotations": annotations,
                "categories": self.categories,
            }
            with open(target_json, "w", encoding="utf-8") as f:
                json.dump(json_data, f, ensure_ascii=False)

        def read_annotation(self, txt_file, img_id, height, width):
            annotation = []
            all_info = read_txt(txt_file)
            for label_info in all_info:
                # iterate over the annotated objects in one image
                label_info = label_info.split(" ")
                if len(label_info) < 5:
                    continue

                category_id, vertex_info = label_info[0], label_info[1:]
                segmentation, bbox, area = self._get_annotation(vertex_info, height, width)
                annotation.append(
                    {
                        "segmentation": segmentation,
                        "area": area,
                        "iscrowd": 0,
                        "image_id": img_id,
                        "bbox": bbox,
                        "category_id": int(category_id) + 1,
                        "id": self.annotation_id,
                    }
                )
                self.annotation_id += 1
            return annotation

        @staticmethod
        def _get_annotation(vertex_info, height, width):
            cx, cy, w, h = [float(i) for i in vertex_info]

            cx = cx * width
            cy = cy * height
            box_w = w * width
            box_h = h * height

            x0 = max(cx - box_w / 2, 0)
            y0 = max(cy - box_h / 2, 0)
            x1 = min(x0 + box_w, width)
            y1 = min(y0 + box_h, height)

            segmentation = [[x0, y0, x1, y0, x1, y1, x0, y1]]
            bbox = [x0, y0, box_w, box_h]
            area = box_w * box_h
            return segmentation, bbox, area


    def main():
        parser = argparse.ArgumentParser("Datasets converter from YOLOV5 to COCO")
        parser.add_argument(
            "--yaml_path",
            type=str,
            default="dataset/YOLOV5_yaml/sample.yaml",
            help="Dataset cfg file",
        )
        args = parser.parse_args()

        converter = YOLOV5CFG2COCO(args.yaml_path)
        converter.generate()


    if __name__ == "__main__":
        main()
    Next, create the YOLO-format dataset configuration file visdrone.yaml in the mindyolo directory:
    # Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
    path: /root/workspace/dataset/visdrone  # dataset root dir (absolute path)
    train: train/images  # train images (relative to 'path')
    val: val/images  # val images (relative to 'path')
    test:  # test images (optional)

    nc: 12  # number of classes
    names:
      0: ignored regions
      1: pedestrian
      2: people
      3: bicycle
      4: car
      5: van
      6: truck
      7: tricycle
      8: awning-tricycle
      9: bus
      10: motor
      11: others
    Run the following command in a terminal to convert the YOLO-format dataset to COCO format:
    python3 yolov5_yaml_to_coco.py --yaml_path visdrone.yaml
    train: 100%|████████████████████████████████████████████████████████████████████████| 6471/6471 [01:13<00:00, 88.07it/s]
    val: 100%|████████████████████████████████████████████████████████████████████████████| 548/548 [00:03<00:00, 148.22it/s]
    The output directory is: visdrone_COCO_format
    Then create a coco2yolo.py Python script to export the COCO-format .json annotation files back into YOLO-format .txt label files under the labels folders:
    import json
    import os
    import argparse

    parser = argparse.ArgumentParser(description='Test yolo data.')
    parser.add_argument('-j', help='JSON file', dest='json', required=True)
    parser.add_argument('-o', help='path to output folder', dest='out', required=True)
    args = parser.parse_args()

    json_file = args.json
    output = args.out


    class COCO2YOLO:
        def __init__(self):
            self._check_file_and_dir(json_file, output)
            self.labels = json.load(open(json_file, 'r', encoding='utf-8'))
            self.coco_id_name_map = self._categories()
            self.coco_name_list = list(self.coco_id_name_map.values())
            print("total images", len(self.labels['images']))
            print("total categories", len(self.labels['categories']))
            print("total labels", len(self.labels['annotations']))

        def _check_file_and_dir(self, file_path, dir_path):
            if not os.path.exists(file_path):
                raise ValueError("file not found")
            if not os.path.exists(dir_path):
                os.makedirs(dir_path)

        def _categories(self):
            categories = {}
            for cls in self.labels['categories']:
                categories[cls['id']] = cls['name']
            return categories

        def _load_images_info(self):
            images_info = {}
            for image in self.labels['images']:
                id = image['id']
                file_name = image['file_name']
                if file_name.find('\\') > -1:
                    file_name = file_name[file_name.index('\\')+1:]
                w = image['width']
                h = image['height']
                images_info[id] = (file_name, w, h)
            return images_info

        def _bbox_2_yolo(self, bbox, img_w, img_h):
            x, y, w, h = bbox[0], bbox[1], bbox[2], bbox[3]
            centerx = bbox[0] + w / 2
            centery = bbox[1] + h / 2
            dw = 1 / img_w
            dh = 1 / img_h
            centerx *= dw
            w *= dw
            centery *= dh
            h *= dh
            return centerx, centery, w, h

        def _convert_anno(self, images_info):
            anno_dict = dict()
            for anno in self.labels['annotations']:
                bbox = anno['bbox']
                image_id = anno['image_id']
                category_id = anno['category_id']

                image_info = images_info.get(image_id)
                image_name = image_info[0]
                img_w = image_info[1]
                img_h = image_info[2]
                yolo_box = self._bbox_2_yolo(bbox, img_w, img_h)

                anno_info = (image_name, category_id, yolo_box)
                anno_infos = anno_dict.get(image_id)
                if not anno_infos:
                    anno_dict[image_id] = [anno_info]
                else:
                    anno_infos.append(anno_info)
                    anno_dict[image_id] = anno_infos
            return anno_dict

        def save_classes(self):
            sorted_classes = list(map(lambda x: x['name'], sorted(self.labels['categories'], key=lambda x: x['id'])))
            print('coco names', sorted_classes)
            with open('coco.names', 'w', encoding='utf-8') as f:
                for cls in sorted_classes:
                    f.write(cls + '\n')
            f.close()

        def coco2yolo(self):
            print("loading image info...")
            images_info = self._load_images_info()
            print("loading done, total images", len(images_info))

            print("start converting...")
            anno_dict = self._convert_anno(images_info)
            print("converting done, total labels", len(anno_dict))

            print("saving txt file...")
            self._save_txt(anno_dict)
            print("saving done")

        def _save_txt(self, anno_dict):
            for k, v in anno_dict.items():
                file_name = os.path.splitext(v[0][0])[0] + ".txt"
                with open(os.path.join(output, file_name), 'w', encoding='utf-8') as f:
                    print(k, v)
                    for obj in v:
                        cat_name = self.coco_id_name_map.get(obj[1])
                        category_id = self.coco_name_list.index(cat_name)
                        box = ['{:.6f}'.format(x) for x in obj[2]]
                        box = ' '.join(box)
                        line = str(category_id) + ' ' + box
                        f.write(line + '\n')


    if __name__ == '__main__':
        c2y = COCO2YOLO()
        c2y.coco2yolo()
    In a terminal, switch to the mindyolo directory and run the following commands in turn to export the YOLO-format labels corresponding to instances_train2017.json and instances_val2017.json into the labels folders:
    python3 coco2yolo.py -j ./visdrone_COCO_format/annotations/instances_train2017.json -o ./visdrone_COCO_format/train/labels
    python3 coco2yolo.py -j ./visdrone_COCO_format/annotations/instances_val2017.json -o ./visdrone_COCO_format/val/labels
    Finally, create a generate_txt.sh script that generates train.txt and val.txt in the COCO-format dataset directory, listing the relative paths of the training and validation images within the dataset:
    #!/bin/bash
    # check that a dataset path argument was provided
    if [ $# -eq 0 ]; then
        echo "Usage: $0 <dataset_path>"
        echo "Example: $0 /path/to/visdrone"
        exit 1
    fi

    # get the dataset path
    DATASET_PATH="$1"

    # check that the dataset path exists
    if [ ! -d "$DATASET_PATH" ]; then
        echo "Error: Dataset path '$DATASET_PATH' does not exist."
        exit 1
    fi

    # define the train and val image directories
    TRAIN_DIR="$DATASET_PATH/train/images"
    VAL_DIR="$DATASET_PATH/val/images"

    # check that the train and val directories exist
    if [ ! -d "$TRAIN_DIR" ]; then
        echo "Error: Train directory '$TRAIN_DIR' does not exist."
        exit 1
    fi

    if [ ! -d "$VAL_DIR" ]; then
        echo "Error: Validation directory '$VAL_DIR' does not exist."
        exit 1
    fi

    # generate train.txt
    TRAIN_TXT="$DATASET_PATH/train.txt"
    ls "$TRAIN_DIR" | grep '\.jpg$' | sort | sed 's/^/\.\/train\/images\//' > "$TRAIN_TXT"
    echo "Generated $TRAIN_TXT"

    # generate val.txt
    VAL_TXT="$DATASET_PATH/val.txt"
    ls "$VAL_DIR" | grep '\.jpg$' | sort | sed 's/^/\.\/val\/images\//' > "$VAL_TXT"
    echo "Generated $VAL_TXT"

    echo "Successfully generated train.txt and val.txt in $DATASET_PATH"
    Run generate_txt.sh in a terminal and pass in the path of the COCO-format dataset created above:
    chmod +x generate_txt.sh
    ./generate_txt.sh visdrone_COCO_format
    Generated visdrone_COCO_format/train.txt
    Generated visdrone_COCO_format/val.txt
    Successfully generated train.txt and val.txt in visdrone_COCO_format
    The final visdrone_COCO_format dataset has the following layout and can be used directly for MindYOLO YOLOv8 training (a quick consistency check is sketched after the tree):
    visdrone_COCO_format
    ├── train.txt
    ├── val.txt
    ├── train
    │   ├── images
    │   │   ├── 000001.jpg
    │   │   ├── 000002.jpg
    │   │   ├── ...
    │   │   └── ...
    │   └── labels
    │       ├── 000001.txt
    │       ├── 000002.txt
    │       ├── ...
    │       └── ...
    ├── annotations
    │   ├── instances_train2017.json
    │   └── instances_val2017.json
    └── val
        ├── images
        │   ├── 000001.jpg
        │   ├── 000002.jpg
        │   ├── ...
        │   └── ...
        └── labels
            ├── 000001.txt
            ├── 000002.txt
            ├── ...
            └── ...
    3. Model Training
    MindYOLO supports a YAML inheritance mechanism, so a newly written configuration file only needs to inherit from the native YAML configuration files that MindYOLO already provides. In the configs directory, write a MindYOLO dataset YAML file that specifies the paths of the training and validation images and the class labels of the model:
    data:
      dataset_name: visdrone_COCO_format
      train_set: /root/workspace/mindyolo/visdrone_COCO_format/train.txt
      val_set: /root/workspace/mindyolo/visdrone_COCO_format/val.txt
      test_set: /root/workspace/mindyolo/visdrone_COCO_format/val.txt
      nc: 12
      # class names
      names: ['ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others' ]

      train_transforms: []
      test_transforms: []
    Then modify configs/yolov8/yolov8s.yaml: comment out the original coco.yaml configuration, point it at our own dataset, and add custom training parameters such as epochs, img_size, per_batch_size, and the multi-stage data augmentation:
    __BASE__: [
      # '../coco.yaml',
      '../visdrone.yaml',
      './hyp.scratch.low.yaml',
      './yolov8-base.yaml'
    ]

    overflow_still_update: False

    network:
      depth_multiple: 0.33  # scales module repeats
      width_multiple: 0.50  # scales convolution channels
      max_channels: 1024

    epochs: 10
    img_size: 1024
    per_batch_size: 16

    data:
      num_parallel_workers: 8

      # multi-stage data augment
      train_transforms: {
        stage_epochs: [ 5, 5 ],
        trans_list: [
          [
            { func_name: mosaic, prob: 1.0 },
            { func_name: resample_segments },
            { func_name: random_perspective, prob: 1.0, degrees: 0.0, translate: 0.1, scale: 0.5, shear: 0.0 },
            {func_name: albumentations},
            {func_name: hsv_augment, prob: 1.0, hgain: 0.015, sgain: 0.7, vgain: 0.4},
            {func_name: fliplr, prob: 0.5},
            {func_name: label_norm, xyxy2xywh_: True},
            {func_name: label_pad, padding_size: 160, padding_value: -1},
            {func_name: image_norm, scale: 255.},
            {func_name: image_transpose, bgr2rgb: True, hwc2chw: True}
          ],
          [
            {func_name: letterbox, scaleup: True},
            {func_name: resample_segments},
            {func_name: random_perspective, prob: 1.0, degrees: 0.0, translate: 0.1, scale: 0.5, shear: 0.0},
            {func_name: albumentations},
            {func_name: hsv_augment, prob: 1.0, hgain: 0.015, sgain: 0.7, vgain: 0.4},
            {func_name: fliplr, prob: 0.5},
            {func_name: label_norm, xyxy2xywh_: True},
            {func_name: label_pad, padding_size: 160, padding_value: -1},
            {func_name: image_norm, scale: 255.},
            {func_name: image_transpose, bgr2rgb: True, hwc2chw: True}
          ]]
      }

      test_transforms: [
        {func_name: letterbox, scaleup: False, only_image: True},
        {func_name: image_norm, scale: 255.},
        {func_name: image_transpose, bgr2rgb: True, hwc2chw: True}
      ]
    Run train.py in a terminal to train the model, specifying the model configuration file and the Ascend NPU:
    python3 train.py --config ./configs/yolov8/yolov8s.yaml --device_target Ascend
    By default training runs on card 0. You can also set the DEVICE_ID environment variable so that the training code runs on card 1 (a small launch sketch is shown after the utils.py snippet below):
    import os
    os.environ["DEVICE_ID"] = "1"
    If you prefer not to set an environment variable, you can instead change the default read in mindyolo/mindyolo/utils/utils.py:
    import os
    import random
    import yaml
    import cv2
    from datetime import datetime

    import numpy as np

    import mindspore as ms
    from mindspore import ops, Tensor, nn
    from mindspore.communication.management import get_group_size, get_rank, init
    from mindspore import ParallelMode

    from mindyolo.utils import logger


    def set_seed(seed=2):
        np.random.seed(seed)
        random.seed(seed)
        ms.set_seed(seed)


    def set_default(args):
        # Set Context
        ms.set_context(mode=args.ms_mode)
        ms.set_recursion_limit(args.max_call_depth)
        if args.ms_mode == 0:
            ms.set_context(jit_config={"jit_level": "O2"})
        if args.device_target == "Ascend":
            ms.set_device("Ascend", int(os.getenv("DEVICE_ID", 1)))
        ...
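    Since DEVICE_ID is read via os.getenv in set_default above, the simplest way to pin a run to logical card 1 is to set the variable when launching training. A minimal launch sketch follows; the subprocess wrapper itself is illustrative, while the train.py command, its flags, and the DEVICE_ID variable come from this article:
    # Hedged sketch: launch mindyolo training pinned to logical NPU card 1.
    import os
    import subprocess

    env = dict(os.environ, DEVICE_ID="1")  # picked up by os.getenv("DEVICE_ID", 1) in utils.py
    subprocess.run(
        ["python3", "train.py", "--config", "./configs/yolov8/yolov8s.yaml", "--device_target", "Ascend"],
        env=env,
        check=True,
    )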
2025-10-23 14:48:02,364 [INFO] parse_args: 2025-10-23 14:48:02,364 [INFO] task detect 2025-10-23 14:48:02,364 [INFO] device_target Ascend 2025-10-23 14:48:02,364 [INFO] save_dir ./runs/2025.10.23-14.48.02 2025-10-23 14:48:02,364 [INFO] log_level INFO 2025-10-23 14:48:02,364 [INFO] is_parallel False 2025-10-23 14:48:02,364 [INFO] ms_mode 0 2025-10-23 14:48:02,364 [INFO] max_call_depth 2000 2025-10-23 14:48:02,364 [INFO] ms_amp_level O0 2025-10-23 14:48:02,364 [INFO] keep_loss_fp32 True 2025-10-23 14:48:02,364 [INFO] anchor_base False 2025-10-23 14:48:02,364 [INFO] ms_loss_scaler static 2025-10-23 14:48:02,364 [INFO] ms_loss_scaler_value 1024.0 2025-10-23 14:48:02,364 [INFO] ms_jit True 2025-10-23 14:48:02,364 [INFO] ms_enable_graph_kernel False 2025-10-23 14:48:02,364 [INFO] ms_datasink False 2025-10-23 14:48:02,364 [INFO] overflow_still_update False 2025-10-23 14:48:02,364 [INFO] clip_grad False 2025-10-23 14:48:02,364 [INFO] clip_grad_value 10.0 2025-10-23 14:48:02,364 [INFO] ema True 2025-10-23 14:48:02,364 [INFO] weight 2025-10-23 14:48:02,364 [INFO] ema_weight 2025-10-23 14:48:02,364 [INFO] freeze [] 2025-10-23 14:48:02,364 [INFO] epochs 10 2025-10-23 14:48:02,364 [INFO] per_batch_size 16 2025-10-23 14:48:02,364 [INFO] img_size 1024 2025-10-23 14:48:02,364 [INFO] nbs 64 2025-10-23 14:48:02,364 [INFO] accumulate 1 2025-10-23 14:48:02,364 [INFO] auto_accumulate False 2025-10-23 14:48:02,364 [INFO] log_interval 100 2025-10-23 14:48:02,364 [INFO] single_cls False 2025-10-23 14:48:02,364 [INFO] sync_bn False 2025-10-23 14:48:02,364 [INFO] keep_checkpoint_max 100 2025-10-23 14:48:02,364 [INFO] run_eval False 2025-10-23 14:48:02,364 [INFO] run_eval_interval 1 2025-10-23 14:48:02,364 [INFO] conf_thres 0.001 2025-10-23 14:48:02,364 [INFO] iou_thres 0.7 2025-10-23 14:48:02,364 [INFO] conf_free True 2025-10-23 14:48:02,364 [INFO] rect False 2025-10-23 14:48:02,364 [INFO] nms_time_limit 20.0 2025-10-23 14:48:02,364 [INFO] recompute False 2025-10-23 14:48:02,364 [INFO] recompute_layers 0 2025-10-23 14:48:02,364 [INFO] seed 2 2025-10-23 14:48:02,364 [INFO] summary True 2025-10-23 14:48:02,364 [INFO] profiler False 2025-10-23 14:48:02,364 [INFO] profiler_step_num 1 2025-10-23 14:48:02,364 [INFO] opencv_threads_num 0 2025-10-23 14:48:02,364 [INFO] strict_load True 2025-10-23 14:48:02,364 [INFO] enable_modelarts False 2025-10-23 14:48:02,364 [INFO] data_url 2025-10-23 14:48:02,364 [INFO] ckpt_url 2025-10-23 14:48:02,364 [INFO] multi_data_url 2025-10-23 14:48:02,364 [INFO] pretrain_url 2025-10-23 14:48:02,364 [INFO] train_url 2025-10-23 14:48:02,364 [INFO] data_dir /cache/data/ 2025-10-23 14:48:02,364 [INFO] ckpt_dir /cache/pretrain_ckpt/ 2025-10-23 14:48:02,364 [INFO] data.dataset_name result 2025-10-23 14:48:02,364 [INFO] data.train_set /root/workspace/mindyolo/visdrone_COCO_format/train.txt 2025-10-23 14:48:02,364 [INFO] data.val_set /root/workspace/mindyolo/visdrone_COCO_format/val.txt 2025-10-23 14:48:02,364 [INFO] data.test_set /root/workspace/mindyolo/visdrone_COCO_format/val.txt 2025-10-23 14:48:02,364 [INFO] data.nc 12 2025-10-23 14:48:02,364 [INFO] data.names ['ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others'] 2025-10-23 14:48:02,364 [INFO] train_transforms.stage_epochs [5, 5] 2025-10-23 14:48:02,364 [INFO] train_transforms.trans_list [[{'func_name': 'mosaic', 'prob': 1.0}, {'func_name': 'resample_segments'}, {'func_name': 'random_perspective', 'prob': 1.0, 'degrees': 0.0, 'translate': 0.1, 'scale': 0.5, 
'shear': 0.0}, {'func_name': 'albumentations'}, {'func_name': 'hsv_augment', 'prob': 1.0, 'hgain': 0.015, 'sgain': 0.7, 'vgain': 0.4}, {'func_name': 'fliplr', 'prob': 0.5}, {'func_name': 'label_norm', 'xyxy2xywh_': True}, {'func_name': 'label_pad', 'padding_size': 160, 'padding_value': -1}, {'func_name': 'image_norm', 'scale': 255.0}, {'func_name': 'image_transpose', 'bgr2rgb': True, 'hwc2chw': True}], [{'func_name': 'letterbox', 'scaleup': True}, {'func_name': 'resample_segments'}, {'func_name': 'random_perspective', 'prob': 1.0, 'degrees': 0.0, 'translate': 0.1, 'scale': 0.5, 'shear': 0.0}, {'func_name': 'albumentations'}, {'func_name': 'hsv_augment', 'prob': 1.0, 'hgain': 0.015, 'sgain': 0.7, 'vgain': 0.4}, {'func_name': 'fliplr', 'prob': 0.5}, {'func_name': 'label_norm', 'xyxy2xywh_': True}, {'func_name': 'label_pad', 'padding_size': 160, 'padding_value': -1}, {'func_name': 'image_norm', 'scale': 255.0}, {'func_name': 'image_transpose', 'bgr2rgb': True, 'hwc2chw': True}]] 2025-10-23 14:48:02,364 [INFO] data.test_transforms [{'func_name': 'letterbox', 'scaleup': False, 'only_image': True}, {'func_name': 'image_norm', 'scale': 255.0}, {'func_name': 'image_transpose', 'bgr2rgb': True, 'hwc2chw': True}] 2025-10-23 14:48:02,364 [INFO] data.num_parallel_workers 8 2025-10-23 14:48:02,364 [INFO] optimizer.optimizer momentum 2025-10-23 14:48:02,364 [INFO] optimizer.lr_init 0.01 2025-10-23 14:48:02,364 [INFO] optimizer.momentum 0.937 2025-10-23 14:48:02,364 [INFO] optimizer.nesterov True 2025-10-23 14:48:02,364 [INFO] optimizer.loss_scale 1.0 2025-10-23 14:48:02,364 [INFO] optimizer.warmup_epochs 3 2025-10-23 14:48:02,364 [INFO] optimizer.warmup_momentum 0.8 2025-10-23 14:48:02,364 [INFO] optimizer.warmup_bias_lr 0.1 2025-10-23 14:48:02,364 [INFO] optimizer.min_warmup_step 1000 2025-10-23 14:48:02,364 [INFO] optimizer.group_param yolov8 2025-10-23 14:48:02,364 [INFO] optimizer.gp_weight_decay 0.0005 2025-10-23 14:48:02,364 [INFO] optimizer.start_factor 1.0 2025-10-23 14:48:02,364 [INFO] optimizer.end_factor 0.01 2025-10-23 14:48:02,364 [INFO] optimizer.epochs 10 2025-10-23 14:48:02,364 [INFO] optimizer.nbs 64 2025-10-23 14:48:02,364 [INFO] optimizer.accumulate 1 2025-10-23 14:48:02,364 [INFO] optimizer.total_batch_size 16 2025-10-23 14:48:02,364 [INFO] loss.name YOLOv8Loss 2025-10-23 14:48:02,364 [INFO] loss.box 7.5 2025-10-23 14:48:02,364 [INFO] loss.cls 0.5 2025-10-23 14:48:02,364 [INFO] loss.dfl 1.5 2025-10-23 14:48:02,364 [INFO] loss.reg_max 16 2025-10-23 14:48:02,364 [INFO] network.model_name yolov8 2025-10-23 14:48:02,364 [INFO] network.nc 80 2025-10-23 14:48:02,364 [INFO] network.reg_max 16 2025-10-23 14:48:02,364 [INFO] network.stride [8, 16, 32] 2025-10-23 14:48:02,364 [INFO] network.backbone [[-1, 1, 'ConvNormAct', [64, 3, 2]], [-1, 1, 'ConvNormAct', [128, 3, 2]], [-1, 3, 'C2f', [128, True]], [-1, 1, 'ConvNormAct', [256, 3, 2]], [-1, 6, 'C2f', [256, True]], [-1, 1, 'ConvNormAct', [512, 3, 2]], [-1, 6, 'C2f', [512, True]], [-1, 1, 'ConvNormAct', [1024, 3, 2]], [-1, 3, 'C2f', [1024, True]], [-1, 1, 'SPPF', [1024, 5]]] 2025-10-23 14:48:02,364 [INFO] network.head [[-1, 1, 'Upsample', ['None', 2, 'nearest']], [[-1, 6], 1, 'Concat', [1]], [-1, 3, 'C2f', [512]], [-1, 1, 'Upsample', ['None', 2, 'nearest']], [[-1, 4], 1, 'Concat', [1]], [-1, 3, 'C2f', [256]], [-1, 1, 'ConvNormAct', [256, 3, 2]], [[-1, 12], 1, 'Concat', [1]], [-1, 3, 'C2f', [512]], [-1, 1, 'ConvNormAct', [512, 3, 2]], [[-1, 9], 1, 'Concat', [1]], [-1, 3, 'C2f', [1024]], [[15, 18, 21], 1, 'YOLOv8Head', ['nc', 'reg_max', 
'stride']]] 2025-10-23 14:48:02,364 [INFO] network.depth_multiple 0.33 2025-10-23 14:48:02,364 [INFO] network.width_multiple 0.5 2025-10-23 14:48:02,364 [INFO] network.max_channels 1024 2025-10-23 14:48:02,364 [INFO] config ./configs/yolov8/yolov8s.yaml 2025-10-23 14:48:02,364 [INFO] rank 0 2025-10-23 14:48:02,364 [INFO] rank_size 1 2025-10-23 14:48:02,364 [INFO] total_batch_size 16 2025-10-23 14:48:02,364 [INFO] callback [] 2025-10-23 14:48:02,364 [INFO] 2025-10-23 14:48:02,365 [INFO] Please check the above information for the configurations 2025-10-23 14:48:02,441 [WARNING] Parse Model, args: nearest, keep str type 2025-10-23 14:48:02,451 [WARNING] Parse Model, args: nearest, keep str type 2025-10-23 14:48:02,528 [INFO] number of network params, total: 11.160279M, trainable: 11.140228M [WARNING] GE_ADPT(336686,7ff4350e8740,python3):2025-10-23-14:48:13.472.732 [mindspore/ops/kernel/ascend/acl_ir/op_api_exec.cc:169] GetAscendDefaultCustomPath] Checking whether the so exists or if permission to access it is available: /usr/local/Ascend/ascend-toolkit/latest/opp/vendors/customize_vision/op_api/lib/libcust_opapi.so 2025-10-23 14:48:14,547 [WARNING] Parse Model, args: nearest, keep str type 2025-10-23 14:48:14,558 [WARNING] Parse Model, args: nearest, keep str type 2025-10-23 14:48:14,646 [INFO] number of network params, total: 11.160279M, trainable: 11.140228M .2025-10-23 14:48:30,416 [INFO] ema_weight not exist, default pretrain weight is currently used. 2025-10-23 14:48:30,421 [INFO] No dataset cache available, caching now... Scanning images: 0%| | 0/6471 [00:00<?, ?it/s]WARNING ⚠️ /root/workspace/mindyolo/visdrone_COCO_format/train/images/000000000335.jpg: 1 duplicate labels removed Scanning '/root/workspace/mindyolo/visdrone_COCO_format/train.cache' images and labels... 397 found, 0 missing, 0 empty, 0 corrupted: 6%|████ | 397/6471 [00:00<00:01, 3960.91it/s]WARNING ⚠️ /root/workspace/mindyolo/visdrone_COCO_format/train/images/000000000427.jpg: 1 duplicate labels removed Scanning '/root/workspace/mindyolo/visdrone_COCO_format/train.cache' images and labels... 1261 found, 0 missing, 0 empty, 0 corrupted: 19%|████████████▋ | 1261/6471 [00:00<00:01, 4238.38it/s]WARNING ⚠️ /root/workspace/mindyolo/visdrone_COCO_format/train/images/000000001492.jpg: 1 duplicate labels removed Scanning '/root/workspace/mindyolo/visdrone_COCO_format/train.cache' images and labels... 3866 found, 0 missing, 0 empty, 0 corrupted: 60%|██████████████████████████████████████▊ | 3866/6471 [00:00<00:00, 4332.85it/s]WARNING ⚠️ /root/workspace/mindyolo/visdrone_COCO_format/train/images/000000003868.jpg: 1 duplicate labels removed Scanning '/root/workspace/mindyolo/visdrone_COCO_format/train.cache' images and labels... 5607 found, 0 missing, 0 empty, 0 corrupted: 87%|████████████████████████████████████████████████████████▎ | 5607/6471 [00:01<00:00, 4337.04it/s]WARNING ⚠️ /root/workspace/mindyolo/visdrone_COCO_format/train/images/000000005742.jpg: 1 duplicate labels removed Scanning '/root/workspace/mindyolo/visdrone_COCO_format/train.cache' images and labels... 6471 found, 0 missing, 0 empty, 0 corrupted: 100%|█████████████████████████████████████████████████████████████████| 6471/6471 [00:01<00:00, 4307.45it/s] 2025-10-23 14:48:32,028 [INFO] New cache created: /root/workspace/mindyolo/visdrone_COCO_format/train.cache.npy 2025-10-23 14:48:32,029 [INFO] Dataset caching success. 2025-10-23 14:48:32,051 [INFO] Dataloader num parallel workers: [8] 2025-10-23 14:48:32,135 [INFO] Dataset Cache file hash/version check success. 
2025-10-23 14:48:32,135 [INFO] Load dataset cache from [/root/workspace/mindyolo/visdrone_COCO_format/train.cache.npy] success. Scanning '/root/workspace/mindyolo/visdrone_COCO_format/train.cache.npy' images and labels... 6471 found, 0 missing, 0 empty, 0 corrupted: 100%|███████████████████████████████████████████████████████████████████████| 6471/6471 [00:00<?, ?it/s] 2025-10-23 14:48:32,157 [INFO] Dataloader num parallel workers: [8] 2025-10-23 14:48:32,273 [INFO] Registry(name=callback, total=4) 2025-10-23 14:48:32,273 [INFO] (0): YoloxSwitchTrain in mindyolo/utils/callback.py 2025-10-23 14:48:32,273 [INFO] (1): EvalWhileTrain in mindyolo/utils/callback.py 2025-10-23 14:48:32,273 [INFO] (2): SummaryCallback in mindyolo/utils/callback.py 2025-10-23 14:48:32,273 [INFO] (3): ProfilerCallback in mindyolo/utils/callback.py 2025-10-23 14:48:32,273 [INFO] 2025-10-23 14:48:32,276 [INFO] got 1 active callback as follows: 2025-10-23 14:48:32,276 [INFO] SummaryCallback() 2025-10-23 14:48:32,276 [WARNING] The first epoch will be compiled for the graph, which may take a long time; You can come back later :). albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) [INFO] albumentations load success albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) [INFO] albumentations load success albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) [INFO] albumentations load success [INFO] albumentations load success [INFO] albumentations load success [INFO] albumentations load success [INFO] albumentations load success [INFO] albumentations load success albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) [INFO] albumentations load success albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) [INFO] albumentations load success albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) [INFO] albumentations load success albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) [INFO] albumentations load success albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, 
blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) [INFO] albumentations load success albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) [INFO] albumentations load success albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) [INFO] albumentations load success albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8)) [INFO] albumentations load success .........2025-10-23 14:52:54,293 [INFO] Epoch 1/10, Step 100/404, imgsize (1024, 1024), loss: 5.5585, lbox: 3.2052, lcls: 0.4855, dfl: 1.8678, cur_lr: 0.09257426112890244 2025-10-23 14:52:55,203 [INFO] Epoch 1/10, Step 100/404, step time: 2629.27 ms 2025-10-23 14:55:40,115 [INFO] Epoch 1/10, Step 200/404, imgsize (1024, 1024), loss: 4.5693, lbox: 2.5884, lcls: 0.4230, dfl: 1.5578, cur_lr: 0.08514851331710815 2025-10-23 14:55:40,138 [INFO] Epoch 1/10, Step 200/404, step time: 1649.36 ms 2025-10-23 14:58:25,055 [INFO] Epoch 1/10, Step 300/404, imgsize (1024, 1024), loss: 3.9681, lbox: 2.1428, lcls: 0.3853, dfl: 1.4400, cur_lr: 0.07772277295589447 2025-10-23 14:58:25,078 [INFO] Epoch 1/10, Step 300/404, step time: 1649.39 ms 2025-10-23 15:01:10,020 [INFO] Epoch 1/10, Step 400/404, imgsize (1024, 1024), loss: 3.6795, lbox: 2.0528, lcls: 0.3339, dfl: 1.2929, cur_lr: 0.07029703259468079 2025-10-23 15:01:10,044 [INFO] Epoch 1/10, Step 400/404, step time: 1649.65 ms 2025-10-23 15:01:17,111 [INFO] Saving model to ./runs/2025.10.23-14.48.02/weights/yolov8s-1_404.ckpt 2025-10-23 15:01:17,111 [INFO] Epoch 1/10, epoch time: 12.75 min. 2025-10-23 15:04:02,010 [INFO] Epoch 2/10, Step 100/404, imgsize (1024, 1024), loss: 3.5361, lbox: 1.9678, lcls: 0.3183, dfl: 1.2500, cur_lr: 0.062162574380636215 2025-10-23 15:04:02,018 [INFO] Epoch 2/10, Step 100/404, step time: 1649.07 ms 2025-10-23 15:06:46,939 [INFO] Epoch 2/10, Step 200/404, imgsize (1024, 1024), loss: 3.3767, lbox: 1.8395, lcls: 0.3042, dfl: 1.2329, cur_lr: 0.05465514957904816 2025-10-23 15:06:46,947 [INFO] Epoch 2/10, Step 200/404, step time: 1649.28 ms 2025-10-23 15:09:31,885 [INFO] Epoch 2/10, Step 300/404, imgsize (1024, 1024), loss: 3.3604, lbox: 1.8753, lcls: 0.3134, dfl: 1.1718, cur_lr: 0.0471477210521698 2025-10-23 15:09:31,894 [INFO] Epoch 2/10, Step 300/404, step time: 1649.46 ms 2025-10-23 15:12:16,806 [INFO] Epoch 2/10, Step 400/404, imgsize (1024, 1024), loss: 3.2902, lbox: 1.8262, lcls: 0.2795, dfl: 1.1846, cur_lr: 0.03964029625058174 2025-10-23 15:12:16,814 [INFO] Epoch 2/10, Step 400/404, step time: 1649.20 ms 2025-10-23 15:12:23,860 [INFO] Saving model to ./runs/2025.10.23-14.48.02/weights/yolov8s-2_404.ckpt 2025-10-23 15:12:23,860 [INFO] Epoch 2/10, epoch time: 11.11 min. 
2025-10-23 15:15:08,782 [INFO] Epoch 3/10, Step 100/404, imgsize (1024, 1024), loss: 3.3220, lbox: 1.7991, lcls: 0.3124, dfl: 1.2106, cur_lr: 0.031090890988707542 2025-10-23 15:15:08,791 [INFO] Epoch 3/10, Step 100/404, step time: 1649.30 ms 2025-10-23 15:17:53,703 [INFO] Epoch 3/10, Step 200/404, imgsize (1024, 1024), loss: 3.1162, lbox: 1.6879, lcls: 0.2824, dfl: 1.1460, cur_lr: 0.02350178174674511 2025-10-23 15:17:53,711 [INFO] Epoch 3/10, Step 200/404, step time: 1649.20 ms 2025-10-23 15:20:38,631 [INFO] Epoch 3/10, Step 300/404, imgsize (1024, 1024), loss: 3.0332, lbox: 1.6024, lcls: 0.2703, dfl: 1.1605, cur_lr: 0.015912672504782677 2025-10-23 15:20:38,639 [INFO] Epoch 3/10, Step 300/404, step time: 1649.28 ms 2025-10-23 15:23:23,580 [INFO] Epoch 3/10, Step 400/404, imgsize (1024, 1024), loss: 3.1371, lbox: 1.7095, lcls: 0.2808, dfl: 1.1469, cur_lr: 0.008323564194142818 2025-10-23 15:23:23,589 [INFO] Epoch 3/10, Step 400/404, step time: 1649.49 ms 2025-10-23 15:23:30,617 [INFO] Saving model to ./runs/2025.10.23-14.48.02/weights/yolov8s-3_404.ckpt 2025-10-23 15:23:30,617 [INFO] Epoch 3/10, epoch time: 11.11 min. 2025-10-23 15:26:15,527 [INFO] Epoch 4/10, Step 100/404, imgsize (1024, 1024), loss: 3.2965, lbox: 1.8179, lcls: 0.2614, dfl: 1.2172, cur_lr: 0.007029999978840351 2025-10-23 15:26:15,535 [INFO] Epoch 4/10, Step 100/404, step time: 1649.18 ms 2025-10-23 15:29:00,451 [INFO] Epoch 4/10, Step 200/404, imgsize (1024, 1024), loss: 3.1855, lbox: 1.7697, lcls: 0.2504, dfl: 1.1654, cur_lr: 0.007029999978840351 2025-10-23 15:29:00,459 [INFO] Epoch 4/10, Step 200/404, step time: 1649.24 ms 2025-10-23 15:31:45,369 [INFO] Epoch 4/10, Step 300/404, imgsize (1024, 1024), loss: 2.9900, lbox: 1.6270, lcls: 0.2307, dfl: 1.1323, cur_lr: 0.007029999978840351 2025-10-23 15:31:45,378 [INFO] Epoch 4/10, Step 300/404, step time: 1649.18 ms 2025-10-23 15:34:30,277 [INFO] Epoch 4/10, Step 400/404, imgsize (1024, 1024), loss: 3.1742, lbox: 1.7506, lcls: 0.2590, dfl: 1.1646, cur_lr: 0.007029999978840351 2025-10-23 15:34:30,285 [INFO] Epoch 4/10, Step 400/404, step time: 1649.07 ms 2025-10-23 15:34:37,315 [INFO] Saving model to ./runs/2025.10.23-14.48.02/weights/yolov8s-4_404.ckpt 2025-10-23 15:34:37,316 [INFO] Epoch 4/10, epoch time: 11.11 min. 2025-10-23 15:37:22,195 [INFO] Epoch 5/10, Step 100/404, imgsize (1024, 1024), loss: 2.9632, lbox: 1.6123, lcls: 0.2424, dfl: 1.1085, cur_lr: 0.006039999891072512 2025-10-23 15:37:22,204 [INFO] Epoch 5/10, Step 100/404, step time: 1648.88 ms 2025-10-23 15:40:07,094 [INFO] Epoch 5/10, Step 200/404, imgsize (1024, 1024), loss: 2.7776, lbox: 1.4777, lcls: 0.2025, dfl: 1.0975, cur_lr: 0.006039999891072512 2025-10-23 15:40:07,103 [INFO] Epoch 5/10, Step 200/404, step time: 1648.99 ms 2025-10-23 15:42:52,021 [INFO] Epoch 5/10, Step 300/404, imgsize (1024, 1024), loss: 2.7209, lbox: 1.4253, lcls: 0.2130, dfl: 1.0826, cur_lr: 0.006039999891072512 2025-10-23 15:42:52,029 [INFO] Epoch 5/10, Step 300/404, step time: 1649.26 ms 2025-10-23 15:45:36,965 [INFO] Epoch 5/10, Step 400/404, imgsize (1024, 1024), loss: 2.7360, lbox: 1.4817, lcls: 0.2157, dfl: 1.0387, cur_lr: 0.006039999891072512 2025-10-23 15:45:36,973 [INFO] Epoch 5/10, Step 400/404, step time: 1649.44 ms 2025-10-23 15:45:44,037 [INFO] Saving model to ./runs/2025.10.23-14.48.02/weights/yolov8s-5_404.ckpt 2025-10-23 15:45:44,037 [INFO] Epoch 5/10, epoch time: 11.11 min. 
2025-10-23 15:48:28,914 [INFO] Epoch 6/10, Step 100/404, imgsize (1024, 1024), loss: 2.6675, lbox: 1.4472, lcls: 0.2042, dfl: 1.0161, cur_lr: 0.005049999803304672 2025-10-23 15:48:28,923 [INFO] Epoch 6/10, Step 100/404, step time: 1648.85 ms 2025-10-23 15:51:13,798 [INFO] Epoch 6/10, Step 200/404, imgsize (1024, 1024), loss: 2.7114, lbox: 1.4235, lcls: 0.1986, dfl: 1.0893, cur_lr: 0.005049999803304672 2025-10-23 15:51:13,807 [INFO] Epoch 6/10, Step 200/404, step time: 1648.84 ms 2025-10-23 15:53:58,688 [INFO] Epoch 6/10, Step 300/404, imgsize (1024, 1024), loss: 2.6783, lbox: 1.4169, lcls: 0.1985, dfl: 1.0629, cur_lr: 0.005049999803304672 2025-10-23 15:53:58,697 [INFO] Epoch 6/10, Step 300/404, step time: 1648.90 ms 2025-10-23 15:56:43,578 [INFO] Epoch 6/10, Step 400/404, imgsize (1024, 1024), loss: 2.7539, lbox: 1.4734, lcls: 0.2037, dfl: 1.0768, cur_lr: 0.005049999803304672 2025-10-23 15:56:43,586 [INFO] Epoch 6/10, Step 400/404, step time: 1648.89 ms 2025-10-23 15:56:50,613 [INFO] Saving model to ./runs/2025.10.23-14.48.02/weights/yolov8s-6_404.ckpt 2025-10-23 15:56:50,613 [INFO] Epoch 6/10, epoch time: 11.11 min. 2025-10-23 15:59:35,561 [INFO] Epoch 7/10, Step 100/404, imgsize (1024, 1024), loss: 2.9109, lbox: 1.6203, lcls: 0.2210, dfl: 1.0696, cur_lr: 0.00406000018119812 2025-10-23 15:59:35,569 [INFO] Epoch 7/10, Step 100/404, step time: 1649.56 ms 2025-10-23 16:02:20,470 [INFO] Epoch 7/10, Step 200/404, imgsize (1024, 1024), loss: 2.6941, lbox: 1.4727, lcls: 0.2068, dfl: 1.0147, cur_lr: 0.00406000018119812 2025-10-23 16:02:20,479 [INFO] Epoch 7/10, Step 200/404, step time: 1649.10 ms 2025-10-23 16:05:05,384 [INFO] Epoch 7/10, Step 300/404, imgsize (1024, 1024), loss: 2.8098, lbox: 1.4810, lcls: 0.2188, dfl: 1.1101, cur_lr: 0.00406000018119812 2025-10-23 16:05:05,391 [INFO] Epoch 7/10, Step 300/404, step time: 1649.12 ms 2025-10-23 16:07:50,302 [INFO] Epoch 7/10, Step 400/404, imgsize (1024, 1024), loss: 2.8426, lbox: 1.5529, lcls: 0.2108, dfl: 1.0788, cur_lr: 0.00406000018119812 2025-10-23 16:07:50,310 [INFO] Epoch 7/10, Step 400/404, step time: 1649.18 ms 2025-10-23 16:07:57,341 [INFO] Saving model to ./runs/2025.10.23-14.48.02/weights/yolov8s-7_404.ckpt 2025-10-23 16:07:57,342 [INFO] Epoch 7/10, epoch time: 11.11 min. 2025-10-23 16:10:42,225 [INFO] Epoch 8/10, Step 100/404, imgsize (1024, 1024), loss: 2.4095, lbox: 1.2257, lcls: 0.1704, dfl: 1.0134, cur_lr: 0.0030700000934302807 2025-10-23 16:10:42,233 [INFO] Epoch 8/10, Step 100/404, step time: 1648.92 ms 2025-10-23 16:13:27,126 [INFO] Epoch 8/10, Step 200/404, imgsize (1024, 1024), loss: 2.6034, lbox: 1.3788, lcls: 0.1872, dfl: 1.0374, cur_lr: 0.0030700000934302807 2025-10-23 16:13:27,134 [INFO] Epoch 8/10, Step 200/404, step time: 1649.00 ms 2025-10-23 16:16:12,032 [INFO] Epoch 8/10, Step 300/404, imgsize (1024, 1024), loss: 2.6074, lbox: 1.3916, lcls: 0.1787, dfl: 1.0371, cur_lr: 0.0030700000934302807 2025-10-23 16:16:12,041 [INFO] Epoch 8/10, Step 300/404, step time: 1649.07 ms 2025-10-23 16:18:56,946 [INFO] Epoch 8/10, Step 400/404, imgsize (1024, 1024), loss: 2.8867, lbox: 1.4981, lcls: 0.2189, dfl: 1.1697, cur_lr: 0.0030700000934302807 2025-10-23 16:18:56,954 [INFO] Epoch 8/10, Step 400/404, step time: 1649.13 ms 2025-10-23 16:19:03,973 [INFO] Saving model to ./runs/2025.10.23-14.48.02/weights/yolov8s-8_404.ckpt 2025-10-23 16:19:03,973 [INFO] Epoch 8/10, epoch time: 11.11 min. 
2025-10-23 16:21:48,883 [INFO] Epoch 9/10, Step 100/404, imgsize (1024, 1024), loss: 2.8544, lbox: 1.6248, lcls: 0.2181, dfl: 1.0115, cur_lr: 0.0020800000056624413 2025-10-23 16:21:48,891 [INFO] Epoch 9/10, Step 100/404, step time: 1649.18 ms 2025-10-23 16:24:33,791 [INFO] Epoch 9/10, Step 200/404, imgsize (1024, 1024), loss: 2.9393, lbox: 1.6026, lcls: 0.2223, dfl: 1.1145, cur_lr: 0.0020800000056624413 2025-10-23 16:24:33,799 [INFO] Epoch 9/10, Step 200/404, step time: 1649.08 ms 2025-10-23 16:27:18,695 [INFO] Epoch 9/10, Step 300/404, imgsize (1024, 1024), loss: 2.4632, lbox: 1.2884, lcls: 0.1701, dfl: 1.0047, cur_lr: 0.0020800000056624413 2025-10-23 16:27:18,703 [INFO] Epoch 9/10, Step 300/404, step time: 1649.04 ms 2025-10-23 16:30:03,567 [INFO] Epoch 9/10, Step 400/404, imgsize (1024, 1024), loss: 2.7216, lbox: 1.4867, lcls: 0.2002, dfl: 1.0346, cur_lr: 0.0020800000056624413 2025-10-23 16:30:03,575 [INFO] Epoch 9/10, Step 400/404, step time: 1648.72 ms 2025-10-23 16:30:10,627 [INFO] Saving model to ./runs/2025.10.23-14.48.02/weights/yolov8s-9_404.ckpt 2025-10-23 16:30:10,627 [INFO] Epoch 9/10, epoch time: 11.11 min. 2025-10-23 16:32:55,537 [INFO] Epoch 10/10, Step 100/404, imgsize (1024, 1024), loss: 2.5899, lbox: 1.4239, lcls: 0.1668, dfl: 0.9992, cur_lr: 0.0010900000343099236 2025-10-23 16:32:55,545 [INFO] Epoch 10/10, Step 100/404, step time: 1649.18 ms 2025-10-23 16:35:40,433 [INFO] Epoch 10/10, Step 200/404, imgsize (1024, 1024), loss: 2.5535, lbox: 1.3745, lcls: 0.1813, dfl: 0.9976, cur_lr: 0.0010900000343099236 2025-10-23 16:35:40,441 [INFO] Epoch 10/10, Step 200/404, step time: 1648.95 ms 2025-10-23 16:38:25,358 [INFO] Epoch 10/10, Step 300/404, imgsize (1024, 1024), loss: 2.4509, lbox: 1.2441, lcls: 0.1717, dfl: 1.0351, cur_lr: 0.0010900000343099236 2025-10-23 16:38:25,366 [INFO] Epoch 10/10, Step 300/404, step time: 1649.25 ms 2025-10-23 16:41:10,260 [INFO] Epoch 10/10, Step 400/404, imgsize (1024, 1024), loss: 2.6832, lbox: 1.4217, lcls: 0.1896, dfl: 1.0719, cur_lr: 0.0010900000343099236 2025-10-23 16:41:10,268 [INFO] Epoch 10/10, Step 400/404, step time: 1649.02 ms 2025-10-23 16:41:17,324 [INFO] Saving model to ./runs/2025.10.23-14.48.02/weights/yolov8s-10_404.ckpt 2025-10-23 16:41:17,324 [INFO] Epoch 10/10, epoch time: 11.11 min. 2025-10-23 16:41:17,742 [INFO] End Train. 
2025-10-23 16:41:18,446 [INFO] Training completed.平均每个epoch耗时约10min左右,在训练过程中我们也可以查看AI Core的利用率以及内存的占用情况:npu-smi info+--------------------------------------------------------------------------------------------------------+ | npu-smi v1.0 Version: 24.1.rc4.b999 | +-------------------------------+-----------------+------------------------------------------------------+ | NPU Name | Health | Power(W) Temp(C) Hugepages-Usage(page) | | Chip Device | Bus-Id | AICore(%) Memory-Usage(MB) | +===============================+=================+======================================================+ | 30208 310P1 | OK | NA 52 11372 / 11372 | | 0 0 | 0000:77:00.0 | 99 24288/ 89608 | +-------------------------------+-----------------+------------------------------------------------------+ | 30208 310P1 | OK | NA 42 0 / 0 | | 1 1 | 0000:77:00.0 | 0 1576 / 89085 | +===============================+=================+======================================================+ +-------------------------------+-----------------+------------------------------------------------------+ | NPU Chip | Process id | Process name | Process memory(MB) | +===============================+=================+======================================================+ | 30208 0 | 336686 | python3 | 22835 | +===============================+=================+======================================================+四、模型验证这里我们仅训练了10个epoch进行模型的验证,可以看到模型的精度和召回率如下:python3 test.py --config ./configs/yolov8/yolov8s.yaml --device_target Ascend --weight ./runs/2025.10.23-14.48.02/weights/yolov8s-10_404.ckpt2025-10-23 16:46:18,824 [INFO] parse_args: 2025-10-23 16:46:18,824 [INFO] task detect 2025-10-23 16:46:18,824 [INFO] device_target Ascend 2025-10-23 16:46:18,824 [INFO] ms_mode 0 2025-10-23 16:46:18,824 [INFO] ms_amp_level O0 2025-10-23 16:46:18,824 [INFO] ms_enable_graph_kernel False 2025-10-23 16:46:18,824 [INFO] precision_mode None 2025-10-23 16:46:18,824 [INFO] weight ./runs/2025.10.23-14.48.02/weights/yolov8s-10_404.ckpt 2025-10-23 16:46:18,824 [INFO] per_batch_size 16 2025-10-23 16:46:18,824 [INFO] img_size 1024 2025-10-23 16:46:18,824 [INFO] single_cls False 2025-10-23 16:46:18,824 [INFO] rect False 2025-10-23 16:46:18,824 [INFO] exec_nms True 2025-10-23 16:46:18,824 [INFO] nms_time_limit 60.0 2025-10-23 16:46:18,824 [INFO] conf_thres 0.001 2025-10-23 16:46:18,824 [INFO] iou_thres 0.7 2025-10-23 16:46:18,824 [INFO] conf_free True 2025-10-23 16:46:18,824 [INFO] seed 2 2025-10-23 16:46:18,824 [INFO] log_level INFO 2025-10-23 16:46:18,824 [INFO] save_dir ./runs_test/2025.10.23-16.46.18 2025-10-23 16:46:18,824 [INFO] enable_modelarts False 2025-10-23 16:46:18,824 [INFO] data_url 2025-10-23 16:46:18,824 [INFO] ckpt_url 2025-10-23 16:46:18,824 [INFO] train_url 2025-10-23 16:46:18,824 [INFO] data_dir /cache/data/ 2025-10-23 16:46:18,824 [INFO] is_parallel False 2025-10-23 16:46:18,824 [INFO] ckpt_dir /cache/pretrain_ckpt/ 2025-10-23 16:46:18,824 [INFO] data.dataset_name result 2025-10-23 16:46:18,824 [INFO] data.train_set /root/workspace/mindyolo/visdrone_COCO_format/train.txt 2025-10-23 16:46:18,824 [INFO] data.val_set /root/workspace/mindyolo/visdrone_COCO_format/val.txt 2025-10-23 16:46:18,824 [INFO] data.test_set /root/workspace/mindyolo/visdrone_COCO_format/val.txt 2025-10-23 16:46:18,824 [INFO] data.nc 12 2025-10-23 16:46:18,824 [INFO] data.names ['ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others'] 2025-10-23 16:46:18,824 [INFO] train_transforms.stage_epochs 
[5, 5] 2025-10-23 16:46:18,824 [INFO] train_transforms.trans_list [[{'func_name': 'mosaic', 'prob': 1.0}, {'func_name': 'resample_segments'}, {'func_name': 'random_perspective', 'prob': 1.0, 'degrees': 0.0, 'translate': 0.1, 'scale': 0.5, 'shear': 0.0}, {'func_name': 'albumentations'}, {'func_name': 'hsv_augment', 'prob': 1.0, 'hgain': 0.015, 'sgain': 0.7, 'vgain': 0.4}, {'func_name': 'fliplr', 'prob': 0.5}, {'func_name': 'label_norm', 'xyxy2xywh_': True}, {'func_name': 'label_pad', 'padding_size': 160, 'padding_value': -1}, {'func_name': 'image_norm', 'scale': 255.0}, {'func_name': 'image_transpose', 'bgr2rgb': True, 'hwc2chw': True}], [{'func_name': 'letterbox', 'scaleup': True}, {'func_name': 'resample_segments'}, {'func_name': 'random_perspective', 'prob': 1.0, 'degrees': 0.0, 'translate': 0.1, 'scale': 0.5, 'shear': 0.0}, {'func_name': 'albumentations'}, {'func_name': 'hsv_augment', 'prob': 1.0, 'hgain': 0.015, 'sgain': 0.7, 'vgain': 0.4}, {'func_name': 'fliplr', 'prob': 0.5}, {'func_name': 'label_norm', 'xyxy2xywh_': True}, {'func_name': 'label_pad', 'padding_size': 160, 'padding_value': -1}, {'func_name': 'image_norm', 'scale': 255.0}, {'func_name': 'image_transpose', 'bgr2rgb': True, 'hwc2chw': True}]] 2025-10-23 16:46:18,824 [INFO] data.test_transforms [{'func_name': 'letterbox', 'scaleup': False, 'only_image': True}, {'func_name': 'image_norm', 'scale': 255.0}, {'func_name': 'image_transpose', 'bgr2rgb': True, 'hwc2chw': True}] 2025-10-23 16:46:18,824 [INFO] data.num_parallel_workers 8 2025-10-23 16:46:18,824 [INFO] optimizer.optimizer momentum 2025-10-23 16:46:18,824 [INFO] optimizer.lr_init 0.01 2025-10-23 16:46:18,824 [INFO] optimizer.momentum 0.937 2025-10-23 16:46:18,824 [INFO] optimizer.nesterov True 2025-10-23 16:46:18,824 [INFO] optimizer.loss_scale 1.0 2025-10-23 16:46:18,824 [INFO] optimizer.warmup_epochs 3 2025-10-23 16:46:18,824 [INFO] optimizer.warmup_momentum 0.8 2025-10-23 16:46:18,824 [INFO] optimizer.warmup_bias_lr 0.1 2025-10-23 16:46:18,824 [INFO] optimizer.min_warmup_step 1000 2025-10-23 16:46:18,824 [INFO] optimizer.group_param yolov8 2025-10-23 16:46:18,824 [INFO] optimizer.gp_weight_decay 0.0005 2025-10-23 16:46:18,824 [INFO] optimizer.start_factor 1.0 2025-10-23 16:46:18,824 [INFO] optimizer.end_factor 0.01 2025-10-23 16:46:18,824 [INFO] loss.name YOLOv8Loss 2025-10-23 16:46:18,824 [INFO] loss.box 7.5 2025-10-23 16:46:18,824 [INFO] loss.cls 0.5 2025-10-23 16:46:18,824 [INFO] loss.dfl 1.5 2025-10-23 16:46:18,824 [INFO] loss.reg_max 16 2025-10-23 16:46:18,824 [INFO] epochs 10 2025-10-23 16:46:18,824 [INFO] sync_bn True 2025-10-23 16:46:18,824 [INFO] anchor_base False 2025-10-23 16:46:18,824 [INFO] opencv_threads_num 0 2025-10-23 16:46:18,824 [INFO] network.model_name yolov8 2025-10-23 16:46:18,824 [INFO] network.nc 80 2025-10-23 16:46:18,824 [INFO] network.reg_max 16 2025-10-23 16:46:18,824 [INFO] network.stride [8, 16, 32] 2025-10-23 16:46:18,824 [INFO] network.backbone [[-1, 1, 'ConvNormAct', [64, 3, 2]], [-1, 1, 'ConvNormAct', [128, 3, 2]], [-1, 3, 'C2f', [128, True]], [-1, 1, 'ConvNormAct', [256, 3, 2]], [-1, 6, 'C2f', [256, True]], [-1, 1, 'ConvNormAct', [512, 3, 2]], [-1, 6, 'C2f', [512, True]], [-1, 1, 'ConvNormAct', [1024, 3, 2]], [-1, 3, 'C2f', [1024, True]], [-1, 1, 'SPPF', [1024, 5]]] 2025-10-23 16:46:18,824 [INFO] network.head [[-1, 1, 'Upsample', ['None', 2, 'nearest']], [[-1, 6], 1, 'Concat', [1]], [-1, 3, 'C2f', [512]], [-1, 1, 'Upsample', ['None', 2, 'nearest']], [[-1, 4], 1, 'Concat', [1]], [-1, 3, 'C2f', [256]], [-1, 1, 'ConvNormAct', [256, 
3, 2]], [[-1, 12], 1, 'Concat', [1]], [-1, 3, 'C2f', [512]], [-1, 1, 'ConvNormAct', [512, 3, 2]], [[-1, 9], 1, 'Concat', [1]], [-1, 3, 'C2f', [1024]], [[15, 18, 21], 1, 'YOLOv8Head', ['nc', 'reg_max', 'stride']]] 2025-10-23 16:46:18,824 [INFO] network.depth_multiple 0.33 2025-10-23 16:46:18,824 [INFO] network.width_multiple 0.5 2025-10-23 16:46:18,824 [INFO] network.max_channels 1024 2025-10-23 16:46:18,824 [INFO] overflow_still_update False 2025-10-23 16:46:18,824 [INFO] config ./configs/yolov8/yolov8s.yaml 2025-10-23 16:46:18,824 [INFO] rank 0 2025-10-23 16:46:18,824 [INFO] rank_size 1 2025-10-23 16:46:18,824 [INFO] 2025-10-23 16:46:18,898 [WARNING] Parse Model, args: nearest, keep str type 2025-10-23 16:46:18,909 [WARNING] Parse Model, args: nearest, keep str type 2025-10-23 16:46:18,984 [INFO] number of network params, total: 11.160279M, trainable: 11.140228M [WARNING] GE_ADPT(540183,7efcd8e26740,python3):2025-10-23-16:46:22.493.658 [mindspore/ops/kernel/ascend/acl_ir/op_api_exec.cc:169] GetAscendDefaultCustomPath] Checking whether the so exists or if permission to access it is available: /usr/local/Ascend/ascend-toolkit/latest/opp/vendors/customize_vision/op_api/lib/libcust_opapi.so 2025-10-23 16:46:23,434 [INFO] Load checkpoint from [./runs/2025.10.23-14.48.02/weights/yolov8s-10_404.ckpt] success. 2025-10-23 16:46:23,437 [INFO] No dataset cache available, caching now... Scanning '/root/workspace/mindyolo/visdrone_COCO_format/val.cache' images and labels... 548 found, 0 missing, 0 empty, 0 corrupted: 100%|█████████████████████████████████████████████████████████████████████████████| 548/548 [00:00<00:00, 3754.44it/s] 2025-10-23 16:46:23,595 [INFO] New cache created: /root/workspace/mindyolo/visdrone_COCO_format/val.cache.npy 2025-10-23 16:46:23,595 [INFO] Dataset caching success. 2025-10-23 16:46:23,597 [INFO] Dataloader num parallel workers: [8] 2025-10-23 16:46:23,607 [WARNING] unable to load fast_coco_eval api, use normal one instead Warning: tiling offset out of range, index: 32 ..2025-10-23 16:46:55,297 [INFO] Sample 35/1, time cost: 30512.14 ms. 2025-10-23 16:46:57,108 [INFO] Sample 35/2, time cost: 1722.38 ms. 2025-10-23 16:46:58,628 [INFO] Sample 35/3, time cost: 1420.95 ms. 2025-10-23 16:47:00,538 [INFO] Sample 35/4, time cost: 1809.91 ms. 2025-10-23 16:47:02,502 [INFO] Sample 35/5, time cost: 1865.30 ms. 2025-10-23 16:47:04,321 [INFO] Sample 35/6, time cost: 1718.46 ms. 2025-10-23 16:47:06,724 [INFO] Sample 35/7, time cost: 2303.35 ms. 2025-10-23 16:47:08,940 [INFO] Sample 35/8, time cost: 2117.25 ms. 2025-10-23 16:47:11,018 [INFO] Sample 35/9, time cost: 1978.46 ms. 2025-10-23 16:47:13,101 [INFO] Sample 35/10, time cost: 1982.41 ms. 2025-10-23 16:47:14,871 [INFO] Sample 35/11, time cost: 1671.05 ms. 2025-10-23 16:47:17,112 [INFO] Sample 35/12, time cost: 2140.79 ms. 2025-10-23 16:47:19,142 [INFO] Sample 35/13, time cost: 1930.53 ms. 2025-10-23 16:47:20,984 [INFO] Sample 35/14, time cost: 1741.35 ms. 2025-10-23 16:47:23,393 [INFO] Sample 35/15, time cost: 2307.50 ms. 2025-10-23 16:47:25,557 [INFO] Sample 35/16, time cost: 2060.89 ms. 2025-10-23 16:47:27,324 [INFO] Sample 35/17, time cost: 1664.00 ms. 2025-10-23 16:47:29,254 [INFO] Sample 35/18, time cost: 1824.31 ms. 2025-10-23 16:47:31,281 [INFO] Sample 35/19, time cost: 1921.78 ms. 2025-10-23 16:47:33,331 [INFO] Sample 35/20, time cost: 1942.85 ms. 2025-10-23 16:47:35,806 [INFO] Sample 35/21, time cost: 2368.87 ms. 2025-10-23 16:47:38,165 [INFO] Sample 35/22, time cost: 2255.00 ms. 
2025-10-23 16:47:40,453 [INFO] Sample 35/23, time cost: 2182.96 ms. 2025-10-23 16:47:42,588 [INFO] Sample 35/24, time cost: 2029.14 ms. 2025-10-23 16:47:44,490 [INFO] Sample 35/25, time cost: 1796.02 ms. 2025-10-23 16:47:46,804 [INFO] Sample 35/26, time cost: 2207.91 ms. 2025-10-23 16:47:49,181 [INFO] Sample 35/27, time cost: 2270.69 ms. 2025-10-23 16:47:50,926 [INFO] Sample 35/28, time cost: 1638.70 ms. 2025-10-23 16:47:53,079 [INFO] Sample 35/29, time cost: 2046.37 ms. 2025-10-23 16:47:55,061 [INFO] Sample 35/30, time cost: 1875.28 ms. 2025-10-23 16:47:57,140 [INFO] Sample 35/31, time cost: 1972.00 ms. 2025-10-23 16:47:59,895 [INFO] Sample 35/32, time cost: 2647.24 ms. 2025-10-23 16:48:02,196 [INFO] Sample 35/33, time cost: 2191.50 ms. 2025-10-23 16:48:04,739 [INFO] Sample 35/34, time cost: 2434.77 ms. ..2025-10-23 16:48:20,509 [INFO] Sample 35/35, time cost: 15723.18 ms. 2025-10-23 16:48:20,509 [INFO] loading annotations into memory... 2025-10-23 16:48:20,639 [INFO] Done (t=0.13s) 2025-10-23 16:48:20,639 [INFO] creating index... 2025-10-23 16:48:20,650 [INFO] index created! 2025-10-23 16:48:20,650 [INFO] Loading and preparing results... 2025-10-23 16:48:21,106 [INFO] DONE (t=0.46s) 2025-10-23 16:48:21,106 [INFO] creating index... 2025-10-23 16:48:21,134 [INFO] index created! 2025-10-23 16:48:21,135 [INFO] Running per image evaluation... 2025-10-23 16:48:21,135 [INFO] Evaluate annotation type *bbox* 2025-10-23 16:48:31,087 [INFO] DONE (t=9.95s). 2025-10-23 16:48:31,087 [INFO] Accumulating evaluation results... 2025-10-23 16:48:31,996 [INFO] DONE (t=0.91s). 2025-10-23 16:48:31,996 [INFO] Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.019 2025-10-23 16:48:31,996 [INFO] Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.036 2025-10-23 16:48:31,996 [INFO] Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.019 2025-10-23 16:48:31,996 [INFO] Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.016 2025-10-23 16:48:31,997 [INFO] Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.024 2025-10-23 16:48:31,997 [INFO] Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.056 2025-10-23 16:48:31,997 [INFO] Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.009 2025-10-23 16:48:31,997 [INFO] Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.049 2025-10-23 16:48:31,997 [INFO] Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.076 2025-10-23 16:48:31,997 [INFO] Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.057 2025-10-23 16:48:31,997 [INFO] Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.106 2025-10-23 16:48:31,997 [INFO] Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.161 2025-10-23 16:48:31,997 [INFO] Speed: 99.0/100.3/199.3 ms inference/NMS/total per 1024x1024 image at batch-size 16; 2025-10-23 16:48:31,997 [INFO] Testing completed, cost 133.18s.使用predict.py测试训练模型参数的结果并进行可视化推理,运行方式如下:python3 examples/finetune_visdrone/predict.py --config ./configs/yolov8/yolov8s.yaml --weight=./runs/2025.10.23-14.48.02/weights/yolov8s-120_404.ckpt --image_path ./visdrone_COCO_format/val/images/000000000001.jpg训练120个epoch后,模型的推理效果如下:五、小结本文详细阐述了在OrangePi AI Studio Pro上基于昇腾310P使用MindYolo框架实现YOLOv8模型训练与验证的完整流程,涵盖环境准备、数据集格式转换、模型训练参数配置及性能评估。
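附:正文提到训练过程中可以用npu-smi查看AI Core利用率和内存占用。下面是一个简单的轮询脚本示意,按固定间隔调用一次npu-smi info,把原始输出连同时间戳追加到日志文件,方便训练结束后回看利用率变化;其中采样间隔INTERVAL_S和日志文件名LOG_FILE都是示例参数,不同驱动版本的npu-smi输出格式可能不同,这里不做字段解析。

import subprocess
import time
from datetime import datetime

INTERVAL_S = 60             # 采样间隔(秒),示例值
LOG_FILE = "npu_usage.log"  # 日志文件名,示例值

def sample_once():
    """调用一次 npu-smi info,返回其标准输出文本。"""
    result = subprocess.run(["npu-smi", "info"], capture_output=True, text=True, check=False)
    return result.stdout

with open(LOG_FILE, "a", encoding="utf-8") as f:
    while True:  # 按 Ctrl+C 结束采样
        f.write("===== %s =====\n" % datetime.now().isoformat())
        f.write(sample_once())
        f.write("\n")
        f.flush()
        time.sleep(INTERVAL_S)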
  • [技术干货] 【朝推夜训】Ascend310p YOLOv8 NPU 训练和推理
    【朝推夜训】Ascend310p YOLOv8 NPU 训练和推理在华为昇思MindSpore框架的加持下,我们在OrangePi AI Studio Pro开发板上实现YOLOv8m模型的完整训练流程。在单块NPU上训练YOLOv8m模型,每轮训练7000张图像仅需6.92分钟,10轮训练总耗时约69分钟。从训练日志可以看出,模型损失值loss从第一轮的6.45逐步下降到最后一轮的2.58左右,表明模型训练效果良好。训练过程中,NPU的AICore利用率和内存占用情况都保持在合理水平,证明了Ascend 310P芯片在目标检测任务中的优异表现,其性能可与NVIDIA GPU相媲美,为开发者提供了另一种高效的AI计算平台选择。通过mindyolo开源仓库,其他开发者也可以复现这一成果并进行进一步的开发和优化。我们在昇腾310AI加速卡上使用昇思MindSpore把YOLOv8模型的NPU训练和推理给跑通了,性能不输于NVIDIA的GPU。OrangePi AI Stuido Pro与Atlas 300V Pro视频解析卡搭载是同款Ascend 310p芯片,总共是两块,每块有96G的内存,可以提供176TFlops的训练算力和352Tops的推理算力。上图是在单块NPU上训练yolov8m模型的AICore的利用率以及内存的占用情况,总共7000张图像每轮训练时长仅需6.92分钟:2025-09-24 16:47:11,931 [INFO] 2025-09-24 16:47:11,931 [INFO] Please check the above information for the configurations 2025-09-24 16:47:12,050 [WARNING] Parse Model, args: nearest, keep str type 2025-09-24 16:47:12,069 [WARNING] Parse Model, args: nearest, keep str type 2025-09-24 16:47:12,184 [INFO] number of network params, total: 25.896391M, trainable: 25.863252M 2025-09-24 16:47:16,786 [WARNING] Parse Model, args: nearest, keep str type 2025-09-24 16:47:16,807 [WARNING] Parse Model, args: nearest, keep str type 2025-09-24 16:47:16,920 [INFO] number of network params, total: 25.896391M, trainable: 25.863252M 2025-09-24 16:47:31,011 [INFO] ema_weight not exist, default pretrain weight is currently used. 2025-09-24 16:47:31,118 [INFO] Dataset Cache file hash/version check success. 2025-09-24 16:47:31,118 [INFO] Load dataset cache from [/home/orangepi/workspace/mindyolo/examples/finetune_visdrone/train.cache.npy] success. 2025-09-24 16:47:31,142 [INFO] Dataloader num parallel workers: [8] 2025-09-24 16:47:31,240 [INFO] Dataset Cache file hash/version check success. 2025-09-24 16:47:31,240 [INFO] Load dataset cache from [/home/orangepi/workspace/mindyolo/examples/finetune_visdrone/train.cache.npy] success. 2025-09-24 16:47:31,264 [INFO] Dataloader num parallel workers: [8] 2025-09-24 16:47:31,438 [INFO] 2025-09-24 16:47:31,445 [INFO] got 1 active callback as follows: 2025-09-24 16:47:31,445 [INFO] SummaryCallback() 2025-09-24 16:47:31,445 [WARNING] The first epoch will be compiled for the graph, which may take a long time; You can come back later :). 2025-09-24 16:50:38,076 [INFO] Epoch 1/10, Step 100/404, imgsize (640, 640), loss: 6.4507, lbox: 3.8446, lcls: 0.5687, dfl: 2.0375, cur_lr: 0.09257426112890244 2025-09-24 16:50:38,970 [INFO] Epoch 1/10, Step 100/404, step time: 1875.26 ms 2025-09-24 16:52:21,629 [INFO] Epoch 1/10, Step 200/404, imgsize (640, 640), loss: 4.8078, lbox: 3.0080, lcls: 0.4118, dfl: 1.3880, cur_lr: 0.08514851331710815 2025-09-24 16:52:21,653 [INFO] Epoch 1/10, Step 200/404, step time: 1026.83 ms 2025-09-24 16:54:04,347 [INFO] Epoch 1/10, Step 300/404, imgsize (640, 640), loss: 4.0795, lbox: 2.4281, lcls: 0.3466, dfl: 1.3048, cur_lr: 0.07772277295589447 2025-09-24 16:54:04,371 [INFO] Epoch 1/10, Step 300/404, step time: 1027.18 ms 2025-09-24 16:55:47,067 [INFO] Epoch 1/10, Step 400/404, imgsize (640, 640), loss: 3.8245, lbox: 2.1755, lcls: 0.3567, dfl: 1.2923, cur_lr: 0.07029703259468079 2025-09-24 16:55:47,091 [INFO] Epoch 1/10, Step 400/404, step time: 1027.19 ms 2025-09-24 16:55:52,087 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-1_404.ckpt 2025-09-24 16:55:52,087 [INFO] Epoch 1/10, epoch time: 8.34 min. 
2025-09-24 16:57:34,759 [INFO] Epoch 2/10, Step 100/404, imgsize (640, 640), loss: 3.8083, lbox: 2.2584, lcls: 0.3404, dfl: 1.2095, cur_lr: 0.062162574380636215 2025-09-24 16:57:34,768 [INFO] Epoch 2/10, Step 100/404, step time: 1026.80 ms 2025-09-24 16:59:17,441 [INFO] Epoch 2/10, Step 200/404, imgsize (640, 640), loss: 3.7835, lbox: 2.2670, lcls: 0.3574, dfl: 1.1592, cur_lr: 0.05465514957904816 2025-09-24 16:59:17,450 [INFO] Epoch 2/10, Step 200/404, step time: 1026.82 ms 2025-09-24 17:01:00,127 [INFO] Epoch 2/10, Step 300/404, imgsize (640, 640), loss: 3.5251, lbox: 2.0144, lcls: 0.3210, dfl: 1.1898, cur_lr: 0.0471477210521698 2025-09-24 17:01:00,136 [INFO] Epoch 2/10, Step 300/404, step time: 1026.85 ms 2025-09-24 17:02:42,826 [INFO] Epoch 2/10, Step 400/404, imgsize (640, 640), loss: 3.5596, lbox: 2.0947, lcls: 0.3086, dfl: 1.1563, cur_lr: 0.03964029625058174 2025-09-24 17:02:42,835 [INFO] Epoch 2/10, Step 400/404, step time: 1026.99 ms 2025-09-24 17:02:47,745 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-2_404.ckpt 2025-09-24 17:02:47,745 [INFO] Epoch 2/10, epoch time: 6.93 min. 2025-09-24 17:04:30,489 [INFO] Epoch 3/10, Step 100/404, imgsize (640, 640), loss: 3.5524, lbox: 2.1004, lcls: 0.2938, dfl: 1.1582, cur_lr: 0.031090890988707542 2025-09-24 17:04:30,497 [INFO] Epoch 3/10, Step 100/404, step time: 1027.52 ms 2025-09-24 17:06:13,196 [INFO] Epoch 3/10, Step 200/404, imgsize (640, 640), loss: 3.8549, lbox: 2.2845, lcls: 0.3526, dfl: 1.2178, cur_lr: 0.02350178174674511 2025-09-24 17:06:13,205 [INFO] Epoch 3/10, Step 200/404, step time: 1027.07 ms 2025-09-24 17:07:55,875 [INFO] Epoch 3/10, Step 300/404, imgsize (640, 640), loss: 3.6236, lbox: 2.1016, lcls: 0.3113, dfl: 1.2106, cur_lr: 0.015912672504782677 2025-09-24 17:07:55,883 [INFO] Epoch 3/10, Step 300/404, step time: 1026.78 ms 2025-09-24 17:09:38,572 [INFO] Epoch 3/10, Step 400/404, imgsize (640, 640), loss: 3.5586, lbox: 2.0730, lcls: 0.3314, dfl: 1.1542, cur_lr: 0.008323564194142818 2025-09-24 17:09:38,581 [INFO] Epoch 3/10, Step 400/404, step time: 1026.97 ms 2025-09-24 17:09:43,528 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-3_404.ckpt 2025-09-24 17:09:43,529 [INFO] Epoch 3/10, epoch time: 6.93 min. 2025-09-24 17:11:26,211 [INFO] Epoch 4/10, Step 100/404, imgsize (640, 640), loss: 3.3767, lbox: 1.9760, lcls: 0.2928, dfl: 1.1079, cur_lr: 0.007029999978840351 2025-09-24 17:11:26,218 [INFO] Epoch 4/10, Step 100/404, step time: 1026.90 ms 2025-09-24 17:13:08,899 [INFO] Epoch 4/10, Step 200/404, imgsize (640, 640), loss: 3.4213, lbox: 1.9382, lcls: 0.3052, dfl: 1.1779, cur_lr: 0.007029999978840351 2025-09-24 17:13:08,908 [INFO] Epoch 4/10, Step 200/404, step time: 1026.89 ms 2025-09-24 17:14:51,583 [INFO] Epoch 4/10, Step 300/404, imgsize (640, 640), loss: 2.8313, lbox: 1.5666, lcls: 0.2380, dfl: 1.0267, cur_lr: 0.007029999978840351 2025-09-24 17:14:51,591 [INFO] Epoch 4/10, Step 300/404, step time: 1026.83 ms 2025-09-24 17:16:34,277 [INFO] Epoch 4/10, Step 400/404, imgsize (640, 640), loss: 3.2905, lbox: 1.9274, lcls: 0.2889, dfl: 1.0741, cur_lr: 0.007029999978840351 2025-09-24 17:16:34,285 [INFO] Epoch 4/10, Step 400/404, step time: 1026.94 ms 2025-09-24 17:16:39,232 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-4_404.ckpt 2025-09-24 17:16:39,232 [INFO] Epoch 4/10, epoch time: 6.93 min. 
2025-09-24 17:18:21,892 [INFO] Epoch 5/10, Step 100/404, imgsize (640, 640), loss: 3.1534, lbox: 1.7844, lcls: 0.2581, dfl: 1.1109, cur_lr: 0.006039999891072512 2025-09-24 17:18:21,900 [INFO] Epoch 5/10, Step 100/404, step time: 1026.67 ms 2025-09-24 17:20:04,596 [INFO] Epoch 5/10, Step 200/404, imgsize (640, 640), loss: 3.1152, lbox: 1.7685, lcls: 0.2518, dfl: 1.0949, cur_lr: 0.006039999891072512 2025-09-24 17:20:04,604 [INFO] Epoch 5/10, Step 200/404, step time: 1027.04 ms 2025-09-24 17:21:47,284 [INFO] Epoch 5/10, Step 300/404, imgsize (640, 640), loss: 3.3179, lbox: 1.8412, lcls: 0.2888, dfl: 1.1880, cur_lr: 0.006039999891072512 2025-09-24 17:21:47,292 [INFO] Epoch 5/10, Step 300/404, step time: 1026.88 ms 2025-09-24 17:23:29,968 [INFO] Epoch 5/10, Step 400/404, imgsize (640, 640), loss: 3.2193, lbox: 1.8366, lcls: 0.2620, dfl: 1.1207, cur_lr: 0.006039999891072512 2025-09-24 17:23:29,976 [INFO] Epoch 5/10, Step 400/404, step time: 1026.84 ms 2025-09-24 17:23:34,954 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-5_404.ckpt 2025-09-24 17:23:34,954 [INFO] Epoch 5/10, epoch time: 6.93 min. 2025-09-24 17:25:17,530 [INFO] Epoch 6/10, Step 100/404, imgsize (640, 640), loss: 2.7642, lbox: 1.5834, lcls: 0.2164, dfl: 0.9643, cur_lr: 0.005049999803304672 2025-09-24 17:25:17,538 [INFO] Epoch 6/10, Step 100/404, step time: 1025.84 ms 2025-09-24 17:27:00,125 [INFO] Epoch 6/10, Step 200/404, imgsize (640, 640), loss: 2.6854, lbox: 1.4272, lcls: 0.2080, dfl: 1.0502, cur_lr: 0.005049999803304672 2025-09-24 17:27:00,134 [INFO] Epoch 6/10, Step 200/404, step time: 1025.96 ms 2025-09-24 17:28:42,720 [INFO] Epoch 6/10, Step 300/404, imgsize (640, 640), loss: 2.7541, lbox: 1.5028, lcls: 0.2171, dfl: 1.0342, cur_lr: 0.005049999803304672 2025-09-24 17:28:42,728 [INFO] Epoch 6/10, Step 300/404, step time: 1025.94 ms 2025-09-24 17:30:25,315 [INFO] Epoch 6/10, Step 400/404, imgsize (640, 640), loss: 2.8092, lbox: 1.5545, lcls: 0.2121, dfl: 1.0427, cur_lr: 0.005049999803304672 2025-09-24 17:30:25,323 [INFO] Epoch 6/10, Step 400/404, step time: 1025.95 ms 2025-09-24 17:30:30,293 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-6_404.ckpt 2025-09-24 17:30:30,294 [INFO] Epoch 6/10, epoch time: 6.92 min. 2025-09-24 17:32:12,881 [INFO] Epoch 7/10, Step 100/404, imgsize (640, 640), loss: 3.0997, lbox: 1.8226, lcls: 0.2402, dfl: 1.0369, cur_lr: 0.00406000018119812 2025-09-24 17:32:12,890 [INFO] Epoch 7/10, Step 100/404, step time: 1025.96 ms 2025-09-24 17:33:55,477 [INFO] Epoch 7/10, Step 200/404, imgsize (640, 640), loss: 2.8140, lbox: 1.5979, lcls: 0.2143, dfl: 1.0018, cur_lr: 0.00406000018119812 2025-09-24 17:33:55,485 [INFO] Epoch 7/10, Step 200/404, step time: 1025.96 ms 2025-09-24 17:35:38,072 [INFO] Epoch 7/10, Step 300/404, imgsize (640, 640), loss: 3.0294, lbox: 1.6439, lcls: 0.2544, dfl: 1.1310, cur_lr: 0.00406000018119812 2025-09-24 17:35:38,081 [INFO] Epoch 7/10, Step 300/404, step time: 1025.95 ms 2025-09-24 17:37:20,660 [INFO] Epoch 7/10, Step 400/404, imgsize (640, 640), loss: 2.8015, lbox: 1.5686, lcls: 0.2252, dfl: 1.0077, cur_lr: 0.00406000018119812 2025-09-24 17:37:20,669 [INFO] Epoch 7/10, Step 400/404, step time: 1025.88 ms 2025-09-24 17:37:25,643 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-7_404.ckpt 2025-09-24 17:37:25,644 [INFO] Epoch 7/10, epoch time: 6.92 min. 
2025-09-24 17:39:08,227 [INFO] Epoch 8/10, Step 100/404, imgsize (640, 640), loss: 2.5091, lbox: 1.3373, lcls: 0.1711, dfl: 1.0007, cur_lr: 0.0030700000934302807 2025-09-24 17:39:08,236 [INFO] Epoch 8/10, Step 100/404, step time: 1025.92 ms 2025-09-24 17:40:50,818 [INFO] Epoch 8/10, Step 200/404, imgsize (640, 640), loss: 2.5926, lbox: 1.4141, lcls: 0.1923, dfl: 0.9863, cur_lr: 0.0030700000934302807 2025-09-24 17:40:50,826 [INFO] Epoch 8/10, Step 200/404, step time: 1025.91 ms 2025-09-24 17:42:33,392 [INFO] Epoch 8/10, Step 300/404, imgsize (640, 640), loss: 2.5341, lbox: 1.3811, lcls: 0.1869, dfl: 0.9660, cur_lr: 0.0030700000934302807 2025-09-24 17:42:33,400 [INFO] Epoch 8/10, Step 300/404, step time: 1025.74 ms 2025-09-24 17:44:15,994 [INFO] Epoch 8/10, Step 400/404, imgsize (640, 640), loss: 3.0024, lbox: 1.6379, lcls: 0.2284, dfl: 1.1361, cur_lr: 0.0030700000934302807 2025-09-24 17:44:16,002 [INFO] Epoch 8/10, Step 400/404, step time: 1026.02 ms 2025-09-24 17:44:20,974 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-8_404.ckpt 2025-09-24 17:44:20,975 [INFO] Epoch 8/10, epoch time: 6.92 min. 2025-09-24 17:46:03,561 [INFO] Epoch 9/10, Step 100/404, imgsize (640, 640), loss: 3.0890, lbox: 1.8395, lcls: 0.2321, dfl: 1.0174, cur_lr: 0.0020800000056624413 2025-09-24 17:46:03,569 [INFO] Epoch 9/10, Step 100/404, step time: 1025.94 ms 2025-09-24 17:47:46,157 [INFO] Epoch 9/10, Step 200/404, imgsize (640, 640), loss: 2.9621, lbox: 1.6608, lcls: 0.2360, dfl: 1.0652, cur_lr: 0.0020800000056624413 2025-09-24 17:47:46,166 [INFO] Epoch 9/10, Step 200/404, step time: 1025.96 ms 2025-09-24 17:49:28,755 [INFO] Epoch 9/10, Step 300/404, imgsize (640, 640), loss: 2.4801, lbox: 1.3320, lcls: 0.1753, dfl: 0.9728, cur_lr: 0.0020800000056624413 2025-09-24 17:49:28,763 [INFO] Epoch 9/10, Step 300/404, step time: 1025.97 ms 2025-09-24 17:51:11,359 [INFO] Epoch 9/10, Step 400/404, imgsize (640, 640), loss: 2.8075, lbox: 1.5971, lcls: 0.1995, dfl: 1.0109, cur_lr: 0.0020800000056624413 2025-09-24 17:51:11,367 [INFO] Epoch 9/10, Step 400/404, step time: 1026.03 ms 2025-09-24 17:51:16,330 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-9_404.ckpt 2025-09-24 17:51:16,331 [INFO] Epoch 9/10, epoch time: 6.92 min. 2025-09-24 17:52:58,913 [INFO] Epoch 10/10, Step 100/404, imgsize (640, 640), loss: 2.6278, lbox: 1.4529, lcls: 0.1860, dfl: 0.9889, cur_lr: 0.0010900000343099236 2025-09-24 17:52:58,921 [INFO] Epoch 10/10, Step 100/404, step time: 1025.90 ms 2025-09-24 17:54:41,521 [INFO] Epoch 10/10, Step 200/404, imgsize (640, 640), loss: 2.7550, lbox: 1.5724, lcls: 0.2083, dfl: 0.9742, cur_lr: 0.0010900000343099236 2025-09-24 17:54:41,529 [INFO] Epoch 10/10, Step 200/404, step time: 1026.08 ms 2025-09-24 17:56:24,125 [INFO] Epoch 10/10, Step 300/404, imgsize (640, 640), loss: 2.4470, lbox: 1.2448, lcls: 0.1758, dfl: 1.0263, cur_lr: 0.0010900000343099236 2025-09-24 17:56:24,133 [INFO] Epoch 10/10, Step 300/404, step time: 1026.03 ms 2025-09-24 17:58:06,727 [INFO] Epoch 10/10, Step 400/404, imgsize (640, 640), loss: 2.5783, lbox: 1.3733, lcls: 0.1848, dfl: 1.0202, cur_lr: 0.0010900000343099236 2025-09-24 17:58:06,736 [INFO] Epoch 10/10, Step 400/404, step time: 1026.02 ms 2025-09-24 17:58:11,744 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-10_404.ckpt 2025-09-24 17:58:11,745 [INFO] Epoch 10/10, epoch time: 6.92 min. 2025-09-24 17:58:12,149 [INFO] End Train. 
2025-09-24 17:58:12,561 [INFO] Training completed.以下是模型训练了10个epoch的使用NPU在测试集图片上的推理结果:2025-09-24 18:13:24,511 [WARNING] Parse Model, args: nearest, keep str type 2025-09-24 18:13:24,532 [WARNING] Parse Model, args: nearest, keep str type 2025-09-24 18:13:24,639 [INFO] number of network params, total: 25.896391M, trainable: 25.863252M 2025-09-24 18:13:29,405 [INFO] Load checkpoint from [/home/orangepi/workspace/mindyolo/runs/2025.09.24-16.47.11/weights/yolov8m-10_404.ckpt] success. 2025-09-24 18:13:53,915 [INFO] Predict result is: {'category_id': [4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1, 4, 4, 5, 10, 4, 1, 4, 2, 4, 1, 5, 10, 4, 2, 4, 1], 'bbox': [[866.402, 359.922, 125.209, 179.961], [619.836, 379.246, 140.848, 229.434], [704.238, 192.678, 102.631, 112.359], [572.588, 189.689, 108.707, 103.76], [80.484, 471.75, 334.953, 243.844], [739.99, 15.987, 60.305, 60.944], [1179.242, 68.017, 143.637, 56.163], [1220.215, 154.843, 138.523, 76.782], [1217.559, 108.026, 140.516, 63.733], [822.475, 15.34, 56.744, 75.039], [621.438, 70.781, 19.938, 55.292], [1106.859, 128.463, 79.986, 95.99], [773.168, 90.047, 71.42, 95.293], [773.467, 88.951, 70.988, 95.924], [1122.158, 371.145, 48.12, 90.512], [1168.982, 2.274, 83.141, 77.081], [723.45, 65.277, 21.877, 51.017], [1145.906, 0.556, 76.467, 46.708], [672.513, 71.818, 25.857, 46.933], [488.816, 350.559, 107.844, 117.605], [672.778, 71.918, 26.172, 48.194], [1106.826, 128.612, 79.621, 96.239], [1058.831, 319.314, 35.087, 75.056], [1146.62, 0.365, 54.586, 48.643], [1124.963, 370.945, 42.359, 66.473], [1148.197, 1.046, 92.537, 51.581], [526.153, 87.349, 29.123, 37.91]], 'score': [0.93223, 0.92336, 0.90671, 0.90539, 0.84414, 0.83682, 0.83292, 0.75641, 0.74857, 0.74295, 0.72221, 0.63341, 0.62439, 0.5829, 0.50411, 0.48259, 0.42391, 0.42188, 0.42185, 0.36533, 0.29963, 0.29451, 0.29264, 0.28265, 0.26525, 0.2585, 0.25038]} 2025-09-24 18:13:53,915 [INFO] Speed: 24481.6/5.7/24487.3 ms inference/NMS/total per 640x640 image at batch-size 1; 2025-09-24 18:13:53,915 [INFO] Detect a image success. 2025-09-24 18:13:53,924 [INFO] Infer completed.模型训练和推理代码可以从mindyolo仓库上下载:https://github.com/mindspore-lab/mindyolo
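附:上面predict的输出是一个包含category_id、bbox、score的字典,想人工检查检测效果时,可以用OpenCV把框画回原图。下面是一个最小示意,其中假设bbox为COCO风格的[x, y, w, h]像素坐标、category_id为0起始的类别下标,并且只截取了几条结果作演示;图片路径IMG_PATH、置信度阈值SCORE_THR以及类别名列表均为示例,需按实际数据集和训练配置修改。

import cv2

IMG_PATH = "val/images/000000000001.jpg"   # 示例路径,替换为实际测试图片
SCORE_THR = 0.5                            # 示例置信度阈值

# VisDrone 的 12 个类别名,仅作演示;请与实际训练配置中的 names 保持一致
names = ['ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van',
         'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others']

# 仅截取上面 predict 输出中的几条结果作演示
result = {
    "category_id": [4, 4, 1],
    "bbox": [[866.4, 359.9, 125.2, 180.0], [619.8, 379.2, 140.8, 229.4], [621.4, 70.8, 19.9, 55.3]],
    "score": [0.93, 0.92, 0.72],
}

img = cv2.imread(IMG_PATH)
for cid, (x, y, w, h), score in zip(result["category_id"], result["bbox"], result["score"]):
    if score < SCORE_THR:
        continue
    pt1, pt2 = (int(x), int(y)), (int(x + w), int(y + h))
    cv2.rectangle(img, pt1, pt2, (0, 255, 0), 2)                   # 画出检测框
    cv2.putText(img, "%s %.2f" % (names[cid], score), (pt1[0], max(pt1[1] - 5, 0)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)     # 标注类别与置信度
cv2.imwrite("predict_vis.jpg", img)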
  • [技术干货] 如何在OrangePi Studio Pro上升级CANN以及PyTorch和MindSpore
    如何在OrangePi Studio Pro上升级CANN以及的Pytorch和MindSpore1. 安装 CANN 和 Pytorch首先我们在昇腾资源下载中心硬件信息中产品系列选择:加速卡,产品型号选择:Atlas 300V Pro 视频解析卡,CANN版本选择:8.2.RC1,下载CANN相关软件包,获取Pytorch源码。下载完成后,就安装CANN以及Pytorch了,我使用的OrangePi制作的预装好AI环境的Ubuntu22.04测试镜像,因此只需要升级Ascend-cann-toolkit_8.2.RC1_linux-x86_64.run和Ascend-cann-kernels-310p_8.2.RC1_linux-x86_64.run以及torch_npu-2.1.0.post13-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl。首先我们切换到root用户安装更新依赖包列表安装g++-12:sudo apt update sudo apt install -y g++-12之后进入CANN软件包下载目录,依次执行下面的命令进行安装:chmod +x ./Ascend-cann-toolkit_8.2.RC1_linux-x86_64.run ./Ascend-cann-toolkit_8.2.RC1_linux-x86_64.run --full --quiet chmod +x ./Ascend-cann-kernels-310p_8.2.RC1_linux-x86_64.run ./Ascend-cann-kernels-310p_8.2.RC1_linux-x86_64.run --install --quiet pip3 install torch_npu-2.1.0.post13-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl执行如下命令,验证是cann和torch_npu是否安装成功:source /usr/local/Ascend/ascend-toolkit/set_env.sh python3 -c "import torch;import torch_npu; a = torch.randn(3, 4).npu(); print(a + a);" 2. 升级 MindSpore 版本我们访问MindSpore官网,CANN版本选择我们刚刚安装的CANN 8.2.RC1,其他配置根据自己的设备选择:切换到root用户执行如下安装命令:sudo su pip3 install mindspore==2.7.0 -i https://repo.mindspore.cn/pypi/simple --trusted-host repo.mindspore.cn --extra-index-url https://repo.huaweicloud.com/repository/pypi/simple安装完成后我们可以执行如下验证命令测试是否安装成功:source /usr/local/Ascend/ascend-toolkit/set_env.sh python3 -c "import mindspore;mindspore.set_context(device_target='Ascend');mindspore.run_check()" 如果输出下面的结果就证明 MindSpore 安装成功了![WARNING] ME(1621400:139701939115840,MainProcess):2025-09-24-10:46:21.978.000 [mindspore/context.py:1412] For 'context.set_context', the parameter 'device_target' will be deprecated and removed in a future version. Please use the api mindspore.set_device() instead. MindSpore version: 2.7.0 [WARNING] GE_ADPT(1621400,7f0e18710640,python3):2025-09-24-10:46:23.323.570 [mindspore/ops/kernel/ascend/acl_ir/op_api_exec.cc:169] GetAscendDefaultCustomPath] Checking whether the so exists or if permission to access it is available: /usr/local/Ascend/ascend-toolkit/latest/opp/vendors/customize_vision/op_api/lib/libcust_opapi.so The result of multiplication calculation is correct, MindSpore has been installed on platform [Ascend] successfully! 3. 小结本文详细介绍了在OrangePi Studio Pro开发板上升级CANN、PyTorch和MindSpore AI框架的完整流程。通过本文的指导,开发者可以轻松地将这些关键的AI组件升级到最新版本,从而充分发挥OrangePi Studio Pro硬件平台的AI计算能力。
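附:上面分别用单行命令验证了torch_npu和MindSpore,也可以把版本信息和简单算子检查放进一个小脚本里一次性查看。下面是一个示意,运行前同样需要先source /usr/local/Ascend/ascend-toolkit/set_env.sh;如果担心torch_npu与MindSpore在同一进程内相互影响,可以把两段检查分开运行。

import torch
import torch_npu  # noqa: F401  # 导入后 torch 才能使用 NPU 设备

print("torch:", torch.__version__)
a = torch.randn(3, 4).npu()      # 与正文中的单行验证命令一致
print("torch_npu add:", a + a)

import mindspore
print("mindspore:", mindspore.__version__)
mindspore.set_context(device_target="Ascend")
mindspore.run_check()            # 输出安装成功的提示即说明 MindSpore 可用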
  • [问题求助] torch 和 torch_npu 无法正常 import
    镜像:mindspore_2.6.0rc1-cann_8.1.rc1-py_3.10-euler_2.10.11-aarch64-snt9b修改:手动安装 cann 8.2.rc1,以及 MindSpeed-LLM 要求的 torch 2.6.0 和 torch_npu。报错如下:>>> import torchTraceback (most recent call last):  File "/home/ma-user/miniforge3/envs/dnallm/lib/python3.10/site-packages/torch/__init__.py", line 2756, in _import_device_backends    entrypoint = backend_extension.load()  File "/home/ma-user/miniforge3/envs/dnallm/lib/python3.10/importlib/metadata/__init__.py", line 171, in load    module = import_module(match.group('module'))  File "/home/ma-user/miniforge3/envs/dnallm/lib/python3.10/importlib/__init__.py", line 126, in import_module    return _bootstrap._gcd_import(name[level:], package, level)  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked  File "<frozen importlib._bootstrap_external>", line 883, in exec_module  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed  File "/home/ma-user/miniforge3/envs/dnallm/lib/python3.10/site-packages/torch_npu/__init__.py", line 41, in <module>    import torch_npu.npu  File "/home/ma-user/miniforge3/envs/dnallm/lib/python3.10/site-packages/torch_npu/npu/__init__.py", line 476, in <module>    from .memory import *  # noqa: F403  File "/home/ma-user/miniforge3/envs/dnallm/lib/python3.10/site-packages/torch_npu/npu/memory.py", line 16, in <module>    from ._memory_viz import memory as _memory, segments as _segments  File "/home/ma-user/miniforge3/envs/dnallm/lib/python3.10/site-packages/torch_npu/npu/_memory_viz.py", line 11, in <module>    import yamlModuleNotFoundError: No module named 'yaml'The above exception was the direct cause of the following exception:Traceback (most recent call last):  File "<stdin>", line 1, in <module>  File "/home/ma-user/miniforge3/envs/dnallm/lib/python3.10/site-packages/torch/__init__.py", line 2784, in <module>    _import_device_backends()  File "/home/ma-user/miniforge3/envs/dnallm/lib/python3.10/site-packages/torch/__init__.py", line 2760, in _import_device_backends    raise RuntimeError(RuntimeError: Failed to load the backend extension: torch_npu. You can disable extension auto-loading with TORCH_DEVICE_BACKEND_AUTOLOAD=0. 如果手动安装 conda install yaml pyyaml,则报错变为:>>> import torchSegmentation fault
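附:一个分步导入的排查思路示意(并非确定的修复方案):先单独确认yaml能否导入,再在关闭后端自动加载的情况下导入torch,最后手动导入torch_npu,用来定位报错到底发生在哪一步。其中TORCH_DEVICE_BACKEND_AUTOLOAD=0来自上面报错信息里给出的提示;ModuleNotFoundError提示缺少的yaml对应的是PyYAML包(import名为yaml),需要安装进当前使用的conda环境。

import os
os.environ["TORCH_DEVICE_BACKEND_AUTOLOAD"] = "0"  # 必须在 import torch 之前设置,关闭后端自动加载

import yaml   # 若此处失败,说明当前环境缺少 PyYAML
print("yaml:", yaml.__version__)

import torch  # 关闭自动加载后,先确认 torch 本身能否正常导入
print("torch:", torch.__version__)

import torch_npu  # 再手动导入 NPU 后端;若在这一步出现段错误,可重点核对 torch / torch_npu / CANN 版本是否匹配
print("torch_npu imported")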
  • [分享交流] X+AI驱动下的教育革新与产教融合实践
     本期直播聚焦X+AI驱动下的教育革新与产教融合实践,吸引超过7200人次在线观看,累计社媒播放量超过6000次,其中企业开发者占比56%。
  • [活动公告] 直播 | DeepSeek+香橙派 AI pro:模型部署、调优及未来发展的全景视图
    5月30日19:00,MindSpore直播间不见不散!议题:DeepSeek+香橙派AI Pro:模型部署、调优及未来发展的全景视图嘉宾:陈新杰 华为开发者布道师、昇思MindSpore开发者布道师议题介绍:本议题将深入探讨如何在香橙派AI Pro上高效部署DeepSeek-R1-Distill-Qwen-1.5B模型,包括环境准备、模型获取、代码配置和运行测试等关键步骤,分享从零开始到成功部署的全过程。同时,将介绍如何利用魔乐社区(Modelers)获取模型以及优化,提升模型的性能和生成质量,助力开发者快速上手并应用这一强大组合。另外将基于梅科尔工作室项目和生态实践对DeepSeek模型和香橙派AI Pro的未来发展进行展望,探讨可能的技术方向、应用场景和市场潜力。Call for Demo欢迎大家参加,参与即有机会赢取MateBook X Pro、Mate 70等激励!了解详情:https://xihe.mindspore.cn/competition/call-for-demo/0/introduction
  • [问题求助] Atlas 200I DK A2 支持的机器学习框架
    目前正在尝试使用pytorch或者mindspore框架,但是查看文档发现只支持atlas训练系列,后面我又在昇腾开发资源下载中心内找到了该型号的板子支持的资源,包括了pytorch及一些nnrt,nnae资源,并提供了torch_npu的下载链接,所以感觉又是能支持pytorch。请问ascend 310b这个型号的板子能跑pytorch这种机器学习框架吗?能用pytorch做训练和在线推理吗?或者原生的acl支持训练功能吗?
  • [问题求助] mindspore按照教程安装后无法正常调用GPU
    我在按照教程安装完成mindspore后调用命令python -c "import mindspore;mindspore.set_device(device_target='GPU');mindspore.run_check()"后报错[ERROR] ME(8369:140203682788288,MainProcess):2025-01-25-11:34:55.376.905 [mindspore/run_check/_check_version.py:219] libcuda.so (need by mindspore-gpu) is not found. Please confirm that libmindspore_gpu.so is in directory:/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindspore/run_check/../lib/plugin and the correct cuda version has been installed, you can refer to the installation guidelines: https://www.mindspore.cn/install [ERROR] ME(8369:140203682788288,MainProcess):2025-01-25-11:34:55.377.152 [mindspore/run_check/_check_version.py:219] libcudnn.so (need by mindspore-gpu) is not found. Please confirm that libmindspore_gpu.so is in directory:/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindspore/run_check/../lib/plugin and the correct cuda version has been installed, you can refer to the installation guidelines: https://www.mindspore.cn/install Traceback (most recent call last): File "<string>", line 1, in <module> File "/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindspore/_checkparam.py", line 1367, in wrapper return func(*args, **kwargs) File "/root/miniconda3/envs/mindspore/lib/python3.9/site-packages/mindspore/device_manager.py", line 64, in set_device MSContext.get_instance().set_device_target_inner(device_target) RuntimeError: Unsupported device target GPU. This process only supports one of the ['CPU']. Please check whether the GPU environment is installed and configured correctly, and check whether current mindspore wheel package was built with "-e GPU". For details, please refer to "Device load error message". ---------------------------------------------------- - Device load error message: ---------------------------------------------------- Load dynamic library: libmindspore_ascend.so.2 failed. libge_runner.so: cannot open shared object file: No such file or directory ---------------------------------------------------- - C++ Call Stack: (For framework developers) ---------------------------------------------------- mindspore/core/utils/ms_context.cc:281 SetDeviceTargetFromInner上面提示GPU环境不正确, 问题是我的环境已经配置好了,bashrc中内容如下:cuda也安装完成python版本和ubuntu版本也是按照教程的要求python3.9和ubuntu18.04求解怎么回事
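附:报错里提示找不到libcuda.so和libcudnn.so,可以先用一个小脚本确认动态链接器能否找到这两个库,以及当前进程里LD_LIBRARY_PATH的内容,以此区分是CUDA库路径没有暴露给动态链接器,还是安装的whl包本身不对(报错末尾尝试加载的是libmindspore_ascend.so,也值得核对一下装的是哪个平台的包)。下面只是一个排查示意:

import os
import ctypes.util

# find_library 返回 None 通常说明动态链接器(ldconfig / LD_LIBRARY_PATH)找不到对应的库
for name in ("cuda", "cudnn"):
    print("lib%s:" % name, ctypes.util.find_library(name))

print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "(未设置)"))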
  • [分享交流] J市JJ银行合规模型金融一体化应用平台实战“术”分享-Python迁移心得
        上回我们谈到银行去年更换了GCH平台,换成鲲鹏平台,随之上层SaaS应用全部都要开始适配鲲鹏平台。除了上回谈的模型参数配置以外,还有一个代码迁移的问题需要解决。合规模型大部分代码是Python和Java,这两部分代码原先都运行在X86架构上。Java调用了很多SO库和JDK,而Python也安装了GCC和SO库。Python代码迁移分为两部分:GCC重新安装和SO库重新编译,同时还要重新安装Maven软件。
        Python代码大量调用SO库,代码经过几个人维护,已经找不到调用SO库的文档记录,如果少迁移一个SO库,代码就会跑不起来。这里我借助了一个代码检查工具——porting advisor,这个工具可以扫描出迁移目录下代码调用了哪些SO库。
        这个项目的Python代码量不大,主要是合同的智能识别和智能比对模块,在合规模型的代码量中占比20%,但功能点却是整个合规模型的核心功能和亮点。
        通过porting advisor扫描,模型代码总共调用了55个SO库。找出这些SO库后,就会出现迁移过程的第二道难关——SO库存放在哪个仓库中?
        鲲鹏平台对仓库调用有一个顺序:本地、远程、中心仓,按由近到远的顺序进行仓库搜索。所以,找出55个SO库后还要对SO库排序,按照调用频率的高低,调用频率高的SO库存放在本地仓,调用频率低的SO库存放在中心仓。经过排序后,每个SO库代码在鲲鹏平台上重新编译。
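附:下面给出一个很小的辅助脚本示意,思路是遍历指定目录找出所有SO文件,并读取ELF文件头中的e_machine字段,粗略判断该库是x86-64还是aarch64版本,从而帮助确认哪些SO库还需要在鲲鹏平台上重新编译。它只是porting advisor之外的一个粗略补充,不能替代工具扫描;脚本中的目录路径为示例,需按实际工程路径修改。

import os
import struct

# e_machine 取值:0x3E 表示 x86-64,0xB7 表示 aarch64(假设目标文件为常见的小端 ELF)
ARCH = {0x3E: "x86-64", 0xB7: "aarch64"}

def elf_arch(path):
    """读取 ELF 文件头,返回指令集名称;非 ELF 文件返回 None。"""
    with open(path, "rb") as f:
        header = f.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    (machine,) = struct.unpack_from("<H", header, 18)  # e_machine 位于偏移 18 处,2 字节
    return ARCH.get(machine, "unknown(0x%X)" % machine)

root = "/opt/app/python_libs"  # 示例目录,请替换为实际的代码或依赖所在目录
for dirpath, _, files in os.walk(root):
    for fn in files:
        if ".so" in fn:  # 兼容 libxxx.so.1.2 这类带版本号的文件名
            p = os.path.join(dirpath, fn)
            print(elf_arch(p), p)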
  • [技术干货] 华为布道师中国人民大学站ICT及华为云技术布道
    2024年11月22日,华为开发者布道师于人民大学站开展布道师华为ICT创新赛和编程赛技术案例分享技术布道案例分享,并在分享中对相关疑问分析解答。首先介绍了基于鸿蒙生态的低光照增强安全辅助驾驶系统:低光照增强安全辅助驾驶系统以华为云作为训练的平台,以MindSpore作为框架,通过整合红外相机与可见光相机的特征,利用特征互补模型在软件平台控制下进行数据处理。系统采用Hispark IPC通讯协议实现各设备间的通信,包括红外相机、可见光相机、深度相机以及车辆显示屏等,这些设备采集的数据会上传到华为云进行进一步的处理和存储。在华为云的平台上,智能决策单元和传感器单元协同工作,收集并分析环境信息,同时应用云端算力对算法模型进行部署以优化系统性能。系统还具备低光照工作能力,能够实时显示感知数据,并在必要时进行决策预警,确保驾驶安全。此外,系统还展示了与Harmony OS生态的兼容,如手表等设备的联动提醒,以及红外点阵投影等高级功能,为驾驶者提供全面的安全辅助。并介绍了技术方案所涉及的关键技术:(1)传感器标定与图像配准技术:应用传感器标定与图像配准技术,对传感器采集到的图像数据进行进一步的配准以及标定。其中,本技术包括了双目视觉系统、交通目标识别方法、深度迭代配准算法来实现图像标定以及配准。其中,双目视觉系统基于深度迭代的配准方法主要使用CNN进行特征的提取或者代替传统配准算法中相似性度量的计算函数。使用深度学习网络对输入的图像对进行特征提取与相似性测度,能够实现配准图像的生成与图像配准精度的判别。在实际检测中,大大降低了系统的计算时间,提高了图像配准以及标定的精度。(2)图像特征融合感知技术在图像处理过程中还应用到了图像特征融合感知技术。在对标定以及配准好的图像进行特征融合的过程中,项目团队通过Mindspore训练的AI神经运算专用芯片,运用多源数据融合感知技术与交通安全隐患的智能检测从而识别不同模型,实现多源数据的精确融合。最终,在这些技术的应用下,通过不同类型相机采集到的图像特征进行融合感知,在低光照等不同恶劣环境(3)辅助决策预警与风险评估技术在辅助决策以及预警层面应用HiSpark-WiFi-IoT套件为主的智能决策模块来进行实现。通过紧急避障系统中搭建的感知层、规划层、决策层实现安全隐患智能评定与决策预警的相关功能,为显示交互单元提供安全数据信息。在辅助决策预警与风险评估的功能中,终端显示给用户也是十分关键的。所以在显示与交互的单元中开发了搭载Harmony OS的显示终端、多频报警音响等显示报警装置,运行配套开发的鸿蒙APP,实时显示感知数据与决策预警判定结果,可以在低光照、雨雪雾霾等全天候全场景工况下辅助驾驶人员进行判断,提升驾驶安全性与可靠性。接着介绍了基于华为云的制造业咨询服务微调问答助手案例:构建了私有的工业中文知识库问答系统,旨在为用户提供自定义友好、离线可运行的全面解决方案。​我们的系统的亮点包括:多模态交互。支持文本输入、语音输入和文件上传等多种输入方式,实现了多模态交互。用户可以根据实际需求选择最方便的交互方式,提高了系统的灵活性和适用性。同时,多模态交互也考虑到了用户的个体差异,满足了不同用户的使用习惯和需求。多功能对话界面。对话界面不仅支持基本的文本对话,还提供了多会话管理、对话模式切换和语音输入等功能。用户可以方便地切换不同对话会话,选择不同对话模式,以及通过语音输入进行自然交互。这种多功能对话界面提升了用户体验,使得系统更易于使用和操作。权限管理与安全性。系统引入了权限管理功能,通过登录注册和权限认证,确保只有经过授权的用户才能访问和操作知识库。这一功能提高了系统的安全性,防止未经授权的访问和操作,保护敏感信息免受泄露和篡改。知识库管理。系统提供了完整的知识库管理功能,包括知识库创建、文件上传、向量数据库构建、文件检索等。用户可以方便地管理知识库中的信息,通过文件对话和知识库问答等方式进行信息检索,满足不同场景和需求。检索增强的大模型。系统整合了检索增强的大模型,包括文件对话、知识库问答、搜索引擎问答和自定义Agent问答等功能。通过这些功能,系统可以更全面、准确地回答用户的问题,提供更丰富的信息服务。同时,用户可以选择不同的检索方式,根据实际需求获取最合适的答案。基于华为云ModelArts、企业级华为云主机的详细解决方案如下所示:首先介绍使用智能问答助手连接企业知识库实时获取信息。在权限管理方面,智能问答助手上运行着基于用户认证机制,利用安全框架,对用户进行身份验证,确保只有授权用户才能访问知识库;使用加密协议上传用户信息到云端服务中;使用专用SDK与API能够方便地对用户权限进行控制与管理。在系统实现方面,我们使用了微服务架构通过streamlit-authenticator模块对用户进行身份验证;并将认证结果通过后端服务传输到前端界面进行展示;前端界面使用HTML、CSS和JavaScript技术,通过发送请求获得认证结果,使用WebSocket技术实时更新用户界面。其次,介绍整合多模态对话输入技术,实现了与用户的高效交互。它支持传统的文本输入方式,同时引入了先进的语音输入识别技术,能够准确捕捉并实时转换用户的语音指令为文字,以便进行后续处理。这一语音输入功能的实现,依托于Streamlit和Bokeh库,通过在用户界面上设置一个专门的按钮,用户点击后即可激活语音识别。系统内部,js_on_event事件监听器与webkitSpeechRecognition对象协同工作,确保语音识别的准确性和实时性。此外,Omniind还提供了多种对话模板供用户选择,以适应不同咨询场景的需求,从而提升企业咨询服务的质量和效率。最后说明问答助手在知识库构建方面,预先构建了一个事实对应的知识库,用于存储和组织信息。当用户提出问题时,系统会进行问题解析,然后通过知识检索在知识库中寻找相关信息,最终生成答案。在实现过程中,助手采用了知识库构建、问题解析、知识检索和答案生成等一系列步骤,避免推理跳跃问题,确保了回答的准确性和相关性。为了验证系统的有效性,进行了功能测试。测试中,基于自定义数据集创建了Digital Twin领域的知识库,并提出了一系列问题,观察系统是否能够正确检索知识库并给出相关答案。此外,系统还提供了一个用户界面,用户可以通过这个界面与知识库进行交互,提出问题并获取答案。这个界面设计直观,易于使用,使得用户能够方便地获取所需的信息。通过这种方式,Omniind智能问答助手不仅提高了信息检索的效率,也提升了用户获取知识的体验。本次项目展示是华为开发者布道师​首次对ICT实战技术案例及华为云企业级边云部署行业解决方案进行高校技术布道,希望后续能够带给大家更多具有行业价值和实践意义的布道案例 。欢迎大家加入华为开发者布道师的大家庭,成为优秀的华为云开发者!
  • [技术干货] ModelArts_AI开发平台之交通信号标志检测与识别
    【任务背景】近年来,随着人工智能和传感器技术的快速进步,自动驾驶汽车技术也取得了长足发展。从最初的辅助驾驶系统到如今可实现完全自主驾驶的汽车,自动驾驶技术正在改变我们的出行方式。其中,交通信号识别技术作为自动驾驶系统的核心功能之一,能够准确感知和识别道路上的各类交通标志、信号灯等,为车辆提供精准的行驶决策依据,确保行车安全性和效率。基于视觉的交通信号标志检测与识别任务旨在开发出精准高效的识别算法,助力自动驾驶汽车技术的进步。任务包含步行、非机动车行驶、环岛行驶、机动车行驶等多种交通信号标志类别,开发者会利用到计算机视觉领域的目标检测、关键点检测、图像分类等基础技术,为提升识别效率,也可以拓展使用剪枝、量化、蒸馏等模型方面的优化技术。ModelArts云平台提供任务相关的数据处理、模型训练、应用部署等技术文档及学习课程材料,助力开发者学习相关技术,了解实践操作。【数据说明】本次任务需要识别以下16种类别的交通标识,类别编号与定义如下表:i1i2i3i4Walk步行Non_motorized vehicles非机动车行驶Round the island 环岛行驶Motor vehicle 机动车行驶i5i6i7i8Keep on the right side of the road靠右侧道路行驶Keep on the left side of the road 靠左侧道路行驶Drive straight and turn right at the grade separation 立体交叉直行和右转弯行驶Drive straight and turn left 立体交叉直行和左转弯行驶i9i10i11i12Honk 鸣喇叭Turn right 向右转弯Turn left and right 向左向右转弯Turn left 向左转弯i13i14i15il50One way, straight 直行Go straight and turn right 直行和向右转弯Go straight and turn left 直行和向左转弯Minimum Speed Limit 最低限速50【任务说明】本任务的测试集为200张图片,这些图片为中国多个城市街景的高分辨率图像,每张图片包含1种交通信号标识。数据集图片示例检测结果以图片为单位输出其中包括交通信号标识的类别、坐标与置信度,每张图片输出一个结果,格式为json字符串,字段说明如下: { "detection_classes": ["i1"], "detection_boxes": [[576, 423, 669, 967]], "detection_scores": [0.4796633720397949] }• detection_classes:指图片中目标的类别,参见上面的数据说明。• detection_boxes:指图片中目标位置的水平矩形框坐标,坐标表示为[ymin,xmin,ymax,xmax]。• detection_scores:指检测结果的置信度。【案例教学】【准备数据集】# 步骤1:从OBS迁移数据集 import moxing as mox mox.file.copy_parallel("obs://trafficbuckets/dataset/update_traffic_sign.zip", "/home/ma-user/work/dataset/update_traffic_sign.zip") # 形参:源文件->指定位置# 步骤2:解压数据集并更改名称为update_traffic_sign import os os.system('unzip /home/ma-user/work/download/update_traffic_sign.zip -d /home/ma-user/work/dataset') os.system('mv /home/ma-user/work/dataset/"update traffic sign" /home/ma-user/work/dataset/update_traffic_sign')# 生成train和val图片名文本文件 from glob import glob import random # hyper parameter train_pic_rate = 0.7 # you can change! # 该目录存储图片数据 patch_fn_list = glob('/home/ma-user/work/dataset/update_traffic_sign/images/*.jpg') # you can change! # 返回存储图片名的列表,不包含图片的后缀 patch_fn_list = [fn for fn in patch_fn_list] # 将图片打乱顺序 random.shuffle(patch_fn_list) # 按照7:3比例划分train和val train_num = int(train_pic_rate * len(patch_fn_list)) train_patch_list = patch_fn_list[:train_num] valid_patch_list = patch_fn_list[train_num:] # produce train/valid/trainval txt file split = ['train', 'val', 'trainval'] # produce train/valid/trainval txt file split = ['train2017', 'val2017', 'trainval2017'] for s in split: # 存储文本文件的地址 save_path = '/home/ma-user/work/dataset/update_traffic_sign/' + s + '.txt' # you can change! 
if s == 'train2017': with open(save_path, 'w') as f: for fn in train_patch_list: # 将训练图像的地址写入train.txt文件 f.write('%s\n' % fn) elif s == 'val2017': with open(save_path, 'w') as f: for fn in valid_patch_list: # 将验证图像的地址写入val.txt文件 f.write('%s\n' % fn) elif s == 'trainval2017': with open(save_path, 'w') as f: for fn in patch_fn_list: # 将所有图像名的编号写入trainval.txt文件 f.write('%s\n' % fn) print('Finish Producing %s txt file to %s' % (s, save_path))# 按照train.txt和val.txt将images分类 import shutil import os def my_move(trainlistdir,vallistdir,traindir,valdir): # 打开train.txt文件 fopen = open(trainlistdir, 'r') # 读取图片名称 file_names = fopen.readlines() for file_name in file_names: file_name=file_name.strip('\n') # 图片的路径 traindata = file_name # 把图片复制至traindir路径下 # 如果目标文件夹不存在,则创建它 if not os.path.exists(traindir): os.makedirs(traindir) shutil.move(traindata, traindir) # 同上 fopen = open(vallistdir, 'r') file_names = fopen.readlines() for file_name in file_names: file_name=file_name.strip('\n') valdata = file_name # 如果目标文件夹不存在,则创建它 if not os.path.exists(valdir): os.makedirs(valdir) shutil.move(valdata, valdir) # 存储训练图片名的txt文件地址 trainlistdir=r'/home/ma-user/work/dataset/update_traffic_sign/train2017.txt' # 存储验证图片名的txt文件地址 vallistdir=r'/home/ma-user/work/dataset/update_traffic_sign/val2017.txt' # coco格式数据集的train2017目录 traindir=r'/home/ma-user/work/dataset/update_traffic_sign/images/train2017' # coco格式数据集的val2017目录 valdir=r'/home/ma-user/work/dataset/update_traffic_sign/images/val2017' my_move(trainlistdir,vallistdir,traindir,valdir)# 按照train.txt和val.txt将labels分类 import shutil import os def my_move(datadir, trainlistdir,vallistdir,traindir,valdir): # 打开train.txt文件 fopen = open(trainlistdir, 'r') # 读取图片名称 file_names = fopen.readlines() for file_name in file_names: file_name=file_name.strip('\n') # 图片的路径 tmp_list = file_name.split('/') tmp_list[-2] = 'labels' train_sp = os.path.join('/', *tmp_list) traindata = train_sp.split('.jpg')[-2] + '.txt' # 把图片复制至traindir路径下 # 如果目标文件夹不存在,则创建它 if not os.path.exists(traindir): os.makedirs(traindir) shutil.move(traindata, traindir) # 同上 fopen = open(vallistdir, 'r') file_names = fopen.readlines() for file_name in file_names: file_name=file_name.strip('\n') tmp_list_v = file_name.split('/') tmp_list_v[-2] = 'labels' val_sp = os.path.join('/', *tmp_list_v) valdata = val_sp.split('.jpg')[-2] + '.txt' # 如果目标文件夹不存在,则创建它 if not os.path.exists(valdir): os.makedirs(valdir) shutil.move(valdata, valdir) # labels存储地址 datadir=r'/home/ma-user/work/dataset/update_traffic_sign/labels/' # 存储训练图片名的txt文件地址 trainlistdir=r'/home/ma-user/work/dataset/update_traffic_sign/train2017.txt' # 存储验证图片名的txt文件地址 vallistdir=r'/home/ma-user/work/dataset/update_traffic_sign/val2017.txt' # coco格式数据集的train2017目录 traindir=r'/home/ma-user/work/dataset/update_traffic_sign/labels/train2017' # coco格式数据集的val2017目录 valdir=r'/home/ma-user/work/dataset/update_traffic_sign/labels/val2017' my_move(datadir, trainlistdir,vallistdir,traindir,valdir)# 对images和labels重命名 import os def rename_img_label(images_folder, labels_folder): # 获取文件夹中的文件列表 images = sorted(os.listdir(images_folder)) labels = sorted(os.listdir(labels_folder)) # 重新命名文件 for i, (image_file, label_file) in enumerate(zip(images, labels)): # 生成新的文件名 new_name = f"{i + 1:03d}" # 格式化为三位数 image_ext = os.path.splitext(image_file)[1] # 获取图片扩展名 label_ext = os.path.splitext(label_file)[1] # 获取标签扩展名 # 构建新的完整路径 new_image_path = os.path.join(images_folder, f"{new_name}{image_ext}") new_label_path = os.path.join(labels_folder, f"{new_name}{label_ext}") # 重命名文件 
# Move the images into train/val folders according to train2017.txt and val2017.txt
import shutil
import os

def my_move(trainlistdir, vallistdir, traindir, valdir):
    # open the train list file and read the image paths
    fopen = open(trainlistdir, 'r')
    file_names = fopen.readlines()
    for file_name in file_names:
        file_name = file_name.strip('\n')
        # path of the image
        traindata = file_name
        # move the image into traindir; create the folder if it does not exist
        if not os.path.exists(traindir):
            os.makedirs(traindir)
        shutil.move(traindata, traindir)
    # same for the validation list
    fopen = open(vallistdir, 'r')
    file_names = fopen.readlines()
    for file_name in file_names:
        file_name = file_name.strip('\n')
        valdata = file_name
        # create the folder if it does not exist
        if not os.path.exists(valdir):
            os.makedirs(valdir)
        shutil.move(valdata, valdir)

# txt file listing the training images
trainlistdir = r'/home/ma-user/work/dataset/update_traffic_sign/train2017.txt'
# txt file listing the validation images
vallistdir = r'/home/ma-user/work/dataset/update_traffic_sign/val2017.txt'
# train2017 directory of the COCO-style dataset
traindir = r'/home/ma-user/work/dataset/update_traffic_sign/images/train2017'
# val2017 directory of the COCO-style dataset
valdir = r'/home/ma-user/work/dataset/update_traffic_sign/images/val2017'
my_move(trainlistdir, vallistdir, traindir, valdir)

# Move the labels into train/val folders according to train2017.txt and val2017.txt
import shutil
import os

def my_move(datadir, trainlistdir, vallistdir, traindir, valdir):
    # open the train list file and read the image paths
    fopen = open(trainlistdir, 'r')
    file_names = fopen.readlines()
    for file_name in file_names:
        file_name = file_name.strip('\n')
        # derive the label path from the image path
        tmp_list = file_name.split('/')
        tmp_list[-2] = 'labels'
        train_sp = os.path.join('/', *tmp_list)
        traindata = train_sp.split('.jpg')[-2] + '.txt'
        # move the label into traindir; create the folder if it does not exist
        if not os.path.exists(traindir):
            os.makedirs(traindir)
        shutil.move(traindata, traindir)
    # same for the validation list
    fopen = open(vallistdir, 'r')
    file_names = fopen.readlines()
    for file_name in file_names:
        file_name = file_name.strip('\n')
        tmp_list_v = file_name.split('/')
        tmp_list_v[-2] = 'labels'
        val_sp = os.path.join('/', *tmp_list_v)
        valdata = val_sp.split('.jpg')[-2] + '.txt'
        # create the folder if it does not exist
        if not os.path.exists(valdir):
            os.makedirs(valdir)
        shutil.move(valdata, valdir)

# directory that stores the labels
datadir = r'/home/ma-user/work/dataset/update_traffic_sign/labels/'
# txt file listing the training images
trainlistdir = r'/home/ma-user/work/dataset/update_traffic_sign/train2017.txt'
# txt file listing the validation images
vallistdir = r'/home/ma-user/work/dataset/update_traffic_sign/val2017.txt'
# train2017 directory of the COCO-style dataset
traindir = r'/home/ma-user/work/dataset/update_traffic_sign/labels/train2017'
# val2017 directory of the COCO-style dataset
valdir = r'/home/ma-user/work/dataset/update_traffic_sign/labels/val2017'
my_move(datadir, trainlistdir, vallistdir, traindir, valdir)

# Rename the images and labels with sequential numbers
import os

def rename_img_label(images_folder, labels_folder):
    # list the files in both folders
    images = sorted(os.listdir(images_folder))
    labels = sorted(os.listdir(labels_folder))
    # rename the files
    for i, (image_file, label_file) in enumerate(zip(images, labels)):
        # build the new file name, zero-padded to three digits
        new_name = f"{i + 1:03d}"
        image_ext = os.path.splitext(image_file)[1]  # image extension
        label_ext = os.path.splitext(label_file)[1]  # label extension
        # build the new full paths
        new_image_path = os.path.join(images_folder, f"{new_name}{image_ext}")
        new_label_path = os.path.join(labels_folder, f"{new_name}{label_ext}")
        # rename the files
        os.rename(os.path.join(images_folder, image_file), new_image_path)
        os.rename(os.path.join(labels_folder, label_file), new_label_path)
    print("File renaming finished.")

# train folders
train_images_folder = '/home/ma-user/work/dataset/update_traffic_sign/images/train2017'
train_labels_folder = '/home/ma-user/work/dataset/update_traffic_sign/labels/train2017'
# val folders
val_images_folder = '/home/ma-user/work/dataset/update_traffic_sign/images/val2017'
val_labels_folder = '/home/ma-user/work/dataset/update_traffic_sign/labels/val2017'
rename_img_label(train_images_folder, train_labels_folder)
rename_img_label(val_images_folder, val_labels_folder)

# Regenerate train2017.txt and val2017.txt from the renamed files
import os

folder_train = '/home/ma-user/work/dataset/update_traffic_sign/images/train2017'
folder_val = '/home/ma-user/work/dataset/update_traffic_sign/images/val2017'
output_file_train = '/home/ma-user/work/dataset/update_traffic_sign/train2017.txt'  # output txt file
output_file_val = '/home/ma-user/work/dataset/update_traffic_sign/val2017.txt'  # output txt file

def writetxt(folder_path, outputfilename):
    # list the files in the folder
    file_names = os.listdir(folder_path)
    file_names.sort()
    # write the file paths into the txt file
    with open(outputfilename, 'w') as f:
        for file_name in file_names:
            f.write(os.path.join(folder_path, file_name) + '\n')
    print(f"File names written to {outputfilename}.")

writetxt(folder_train, output_file_train)
writetxt(folder_val, output_file_val)

# Generate COCO-format json annotations (optional)
import glob
import json
import os
from PIL import Image

def yolo_to_coco_for_subset(yolo_images_folder, yolo_labels_folder, categories):
    # initialize the COCO dataset structure for this subset
    coco_format = {
        "images": [],
        "annotations": [],
        "categories": []
    }
    # add the category information
    for i, category in enumerate(categories):
        coco_format["categories"].append({
            "id": i + 1,
            "name": category,
            "supercategory": "none"
        })

    annotation_id = 0
    for image_file in glob.glob(f"{yolo_images_folder}/*.jpg"):
        print(image_file)
        # read the image to get its width and height
        with Image.open(image_file) as img:
            width, height = img.size
        # use the numeric file name (e.g. 001.jpg -> 1) as the image id so that
        # the image entries and the annotations reference the same id
        img_id = int(os.path.basename(image_file).split('.')[0])
        # add the image information with its size
        coco_format["images"].append({
            "id": img_id,
            "file_name": os.path.basename(image_file),
            "width": width,
            "height": height
        })
        # corresponding annotation file
        yolo_annotation_file = os.path.join(
            yolo_labels_folder, os.path.basename(image_file).replace(".jpg", ".txt"))
        if os.path.exists(yolo_annotation_file):
            with open(yolo_annotation_file, "r") as file:
                for line in file:
                    category_id, x_center, y_center, bbox_width, bbox_height = map(float, line.split())
                    # convert YOLO format (normalized center/size) to COCO format (absolute top-left/size)
                    x_min = (x_center - bbox_width / 2) * width
                    y_min = (y_center - bbox_height / 2) * height
                    coco_bbox_width = bbox_width * width
                    coco_bbox_height = bbox_height * height
                    # add the annotation information
                    coco_format["annotations"].append({
                        "id": annotation_id + 1,
                        "image_id": img_id,
                        "category_id": int(category_id) + 1,
                        "bbox": [x_min, y_min, coco_bbox_width, coco_bbox_height],
                        "area": coco_bbox_width * coco_bbox_height,
                        "segmentation": [],  # optional
                        "iscrowd": 0
                    })
                    annotation_id += 1
    return coco_format

def save_coco_format(coco_format, output_file):
    with open(output_file, "w") as file:
        json.dump(coco_format, file, indent=4)

# Example usage
yolo_base_folder = "/home/ma-user/work/dataset/update_traffic_sign/"  # dataset root folder
file_path = '/home/ma-user/work/dataset/update_traffic_sign/classes.txt'  # path of the class-name txt file
train_json_save_path = '/home/ma-user/work/dataset/update_traffic_sign/annotations/instances_train2017.json'  # train json path
val_json_save_path = '/home/ma-user/work/dataset/update_traffic_sign/annotations/instances_val2017.json'  # val json path

# create the annotations folder if it does not exist
os.makedirs(os.path.dirname(train_json_save_path), exist_ok=True)

# read the class names, one per line
categories = []
with open(file_path, 'r') as file:
    lines = file.readlines()
    for line in lines:
        categories.append(line.strip())
print(categories)

# convert the train set
train_coco_format = yolo_to_coco_for_subset(
    os.path.join(yolo_base_folder, "images/train2017"),
    os.path.join(yolo_base_folder, "labels/train2017"),
    categories
)
save_coco_format(train_coco_format, train_json_save_path)

# convert the val set
val_coco_format = yolo_to_coco_for_subset(
    os.path.join(yolo_base_folder, "images/val2017"),
    os.path.join(yolo_base_folder, "labels/val2017"),
    categories
)
save_coco_format(val_coco_format, val_json_save_path)
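To make the coordinate conversion above concrete, here is a small worked example with made-up values, showing how one normalized YOLO label maps to a COCO bbox for a hypothetical 2048 x 1024 image:

# hypothetical example of the YOLO -> COCO bbox conversion used above
width, height = 2048, 1024          # image size (made-up values)
x_center, y_center = 0.5, 0.25      # normalized YOLO center
bbox_width, bbox_height = 0.1, 0.2  # normalized YOLO width/height

x_min = (x_center - bbox_width / 2) * width    # 0.45 * 2048 = 921.6
y_min = (y_center - bbox_height / 2) * height  # 0.15 * 1024 = 153.6
coco_w = bbox_width * width                    # 204.8
coco_h = bbox_height * height                  # 204.8
print([x_min, y_min, coco_w, coco_h])          # COCO bbox: [x_min, y_min, w, h]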
# View the dataset
import os
import random
import cv2
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline

# class names
classes = ['i1', 'i10', 'i11', 'i12', 'i13', 'i14', 'i15', 'i2', 'i3', 'i4', 'i5', 'i6', 'i7', 'i8', 'i9', 'i50']
file_path = "/home/ma-user/work/dataset/update_traffic_sign/images/train2017"
file_list = os.listdir(file_path)
# pick four random images
img_paths = random.sample(file_list, 4)
img_lists = []
for img_path in img_paths:
    img_path = os.path.join(file_path, img_path)
    img = cv2.imread(img_path)
    h, w, _ = img.shape
    tl = round(0.002 * (h + w) / 2) + 1
    color = [random.randint(0, 255) for _ in range(3)]
    # read the YOLO label file that matches the image
    if img_path.endswith('.png'):
        with open(img_path.replace("images", "labels").replace(".png", ".txt")) as f:
            labels = f.readlines()
    if img_path.endswith('.jpg'):
        with open(img_path.replace("images", "labels").replace(".jpg", ".txt")) as f:
            labels = f.readlines()
    for label in labels:
        l, x, y, wc, hc = [float(x) for x in label.strip().split()]
        # convert normalized center/size to pixel corner coordinates
        x1 = int((x - wc / 2) * w)
        y1 = int((y - hc / 2) * h)
        x2 = int((x + wc / 2) * w)
        y2 = int((y + hc / 2) * h)
        cv2.rectangle(img, (x1, y1), (x2, y2), color, thickness=tl, lineType=cv2.LINE_AA)
        cv2.putText(img, classes[int(l)], (x1, y1 - 2), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 111, 222), 3, cv2.LINE_AA)
    img_lists.append(cv2.resize(img, (1280, 720)))
image = np.concatenate([np.concatenate(img_lists[:2], axis=1),
                        np.concatenate(img_lists[2:], axis=1)], axis=0)
plt.rcParams["figure.figsize"] = (20, 10)
plt.imshow(image[:, :, ::-1])
plt.axis('off')
plt.show()

【Prepare the mindyolo Model】Link: mindyolo

# Step 1: copy the mindyolo source code from OBS
import moxing as mox
# args: source file -> destination
mox.file.copy_parallel("obs://trafficbuckets/source_code/mindyolo.zip",
                       "/home/ma-user/work/mindyolo.zip")
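The step above only copies mindyolo.zip to the workspace. Assuming the archive unpacks to a top-level mindyolo/ folder (an assumption about the archive layout, not stated in the original post), a minimal sketch of unpacking it, mirroring the dataset step earlier, could look like this:

import os
# hypothetical step: unpack the mindyolo source archive into the workspace
os.system('unzip /home/ma-user/work/mindyolo.zip -d /home/ma-user/work')
# the configs and train.py referenced below are then expected under /home/ma-user/work/mindyolo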
【Modify the Configuration Files】

1. Configuration in yolov8n.yaml and the files it inherits: coco.yaml, hyp.scratch.low.yaml, and yolov8-base.yaml.

coco.yaml

data:
  dataset_name: update_traffic_sign  # you can change!
  train_set: /home/ma-user/work/dataset/update_traffic_sign/train2017.txt  # ./coco/train2017.txt  # 118287 images  # you can change!
  val_set: /home/ma-user/work/dataset/update_traffic_sign/val2017.txt  # ./coco/val2017.txt  # 5000 images  # you can change!
  test_set: /home/ma-user/work/dataset/update_traffic_sign/test2017.txt  # 20288 of 40670 images, submit to https://competitions.codalab.org/competitions/20794  # you can change!
  nc: 16
  # class names
  names: ['i1', 'i10', 'i11', 'i12', 'i13', 'i14', 'i15', 'i2', 'i3', 'i4', 'i5', 'i6', 'i7', 'i8', 'i9', 'i50']  # you can change!
  train_transforms: []
  test_transforms: []

yolov8-base.yaml

epochs: 500  # total train epochs  # you can change!
per_batch_size: 2  # 16 * 8 = 128
img_size: 2048
iou_thres: 0.7
conf_free: True
sync_bn: True
opencv_threads_num: 0  # opencv: disable threading optimizations

network:
  model_name: yolov8
  nc: 16  # number of classes  # you can change!
  reg_max: 16
  stride: [8, 16, 32]
  ...

【Run Training】

python train.py --config ./configs/yolov8/yolov8n.yaml --run_eval True
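After training, the resulting checkpoint can be evaluated and used for inference. The commands below are a sketch based on the standard mindyolo scripts (test.py for evaluation and demo/predict.py for single-image inference); the checkpoint and image paths are placeholders, and the exact flags should be checked against the mindyolo version shipped in mindyolo.zip:

python test.py --config ./configs/yolov8/yolov8n.yaml --weight /path/to/yolov8n.ckpt
python demo/predict.py --config ./configs/yolov8/yolov8n.yaml --weight /path/to/yolov8n.ckpt --image_path /path/to/test.jpg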
  • [Activity Sharing] Guilin University of Electronic Technology evangelist event: Orange Pi hands-on practice and large-model sharing
    【Activity Summary】From November 1 to November 15, 2024, Li Minghui, a Huawei Developer Evangelist and student at Guilin University of Electronic Technology, organized two offline evangelist sessions centered on the MindSpore framework and Ascend edge devices. More than two hundred students from the School of Mathematics and Computing Science and the School of Artificial Intelligence took part. At the start of the event, Li Minghui introduced the advantages of the MindSpore framework in plain language, emphasizing that MindSpore, as a full-scenario deep learning framework for AI applications, is easy to use, efficient, and secure, and offers great convenience to developers. The first-year students in the audience expressed strong interest in the framework. For the freshmen, Li Minghui laid out a learning path for MindSpore and walked them step by step through converting their own models with ATC (Ascend Tensor Compiler), giving them a deeper understanding of how AI technology is applied in practice. In the NPU case-study segment, he demonstrated the experience spaces of the Modelers (魔乐) community and explained in detail how to deploy one's own model offline, which showed the students the appeal of AI technology and laid a foundation for their future research and project work. He then introduced the undergraduates to Huawei Cloud's latest cloud hosts, letting them experience the convenience and efficiency of cloud development first-hand; through live demonstrations, the students gained a more direct sense of Huawei Cloud's technical capabilities. [Figures: event photos 1-3] The successful event not only raised the students' awareness of AI technology but also provided them with a platform for exchange and learning. In their future studies and practice, the students are expected to keep exploring and to contribute to the development of China's AI field.