-
直播回放链接:cid:link_0
-
Deploying Qwen2.5-VL-7B on Ascend 310P for Smoking-Action Recognition

The OrangePi AI Studio Pro is a new-generation high-performance inference card built around two Ascend 310P processors. It combines general-purpose compute with strong AI compute and integrates the full underlying software stack for both training and inference, enabling unified training and inference on one device. Its half-precision (FP16) AI compute is about 176 TFLOPS, and its Int8 compute reaches 352 TOPS. This article walks through deploying the Qwen2.5-VL-7B multimodal understanding model on the Ascend 310P to recognize smoking actions.

1. Environment setup

On the OrangePi AI Studio we deploy MindIE in a Docker container:

docker pull swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.1.RC1-300I-Duo-py311-openeuler24.03-lts

root@orangepi:~# docker images
REPOSITORY                                          TAG                                         IMAGE ID       CREATED         SIZE
swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie   2.1.RC1-300I-Duo-py311-openeuler24.03-lts   0574b8d4403f   3 months ago    20.4GB
langgenius/dify-web                                 1.0.1                                       b2b7363571c2   8 months ago    475MB
langgenius/dify-api                                 1.0.1                                       3dd892f50a2d   8 months ago    2.14GB
langgenius/dify-plugin-daemon                       0.0.4-local                                 3f180f39bfbe   8 months ago    1.35GB
ubuntu/squid                                        latest                                      dae40da440fe   8 months ago    243MB
postgres                                            15-alpine                                   afbf3abf6aeb   8 months ago    273MB
nginx                                               latest                                      b52e0b094bc0   9 months ago    192MB
swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie   1.0.0-300I-Duo-py311-openeuler24.03-lts     74a5b9615370   10 months ago   17.5GB
redis                                               6-alpine                                    6dd588768b9b   10 months ago   30.2MB
langgenius/dify-sandbox                             0.2.10                                      4328059557e8   13 months ago   567MB
semitechnologies/weaviate                           1.19.0                                      8ec9f084ab23   2 years ago     52.5MB

Next, create a startup script named start-docker.sh with the following content:

NAME=$1
if [ $# -ne 1 ]; then
    echo "warning: need input container name. Use default: mindie"
    NAME=mindie
fi
docker run --name ${NAME} -it -d --net=host --shm-size=500g \
    --privileged=true \
    -w /usr/local/Ascend/atb-models \
    --device=/dev/davinci_manager \
    --device=/dev/hisi_hdc \
    --device=/dev/devmm_svm \
    --entrypoint=bash \
    -v /models:/models \
    -v /data:/data \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/sbin:/usr/local/sbin \
    -v /home:/home \
    -v /tmp:/tmp \
    -v /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime \
    -e http_proxy=$http_proxy \
    -e https_proxy=$https_proxy \
    -e "PATH=/usr/local/python3.11.6/bin:$PATH" \
    swr.cn-south-1.myhuaweicloud.com/ascendhub/mindie:2.1.RC1-300I-Duo-py311-openeuler24.03-lts

After starting the container with bash start-docker.sh, we need to replace a few files and install the Ascend-cann-nnal package:

root@orangepi:~# docker exec -it mindie bash
Welcome to 5.15.0-126-generic
System information as of time: Sat Nov 15 22:06:48 CST 2025
System load:    1.87
Memory used:    6.3%
Swap used:      0.0%
Usage On:       33%
Users online:   0
[root@orangepi atb-models]# cd /usr/local/Ascend/ascend-toolkit/8.2.RC1/lib64/
[root@orangepi lib64]# ls /data/fix_openeuler_docker/fixhccl/8.2hccl/
libhccl.so  libhccl_alg.so  libhccl_heterog.so  libhccl_plf.so
[root@orangepi lib64]# cp /data/fix_openeuler_docker/fixhccl/8.2hccl/* ./
cp: overwrite './libhccl.so'?
cp: overwrite './libhccl_alg.so'?
cp: overwrite './libhccl_heterog.so'?
cp: overwrite './libhccl_plf.so'?
[root@orangepi lib64]# source /usr/local/Ascend/ascend-toolkit/set_env.sh
[root@orangepi lib64]# chmod +x /data/fix_openeuler_docker/Ascend-cann-nnal/Ascend-cann-nnal_8.3.RC1_linux-x86_64.run
[root@orangepi lib64]# /data/fix_openeuler_docker/Ascend-cann-nnal/Ascend-cann-nnal_8.3.RC1_linux-x86_64.run --install --quiet
[NNAL] [20251115-22:41:45] [INFO] LogFile:/var/log/ascend_seclog/ascend_nnal_install.log
[NNAL] [20251115-22:41:45] [INFO] Ascend-cann-atb_8.3.RC1_linux-x86_64.run --install --install-path=/usr/local/Ascend/nnal --install-for-all --quiet --nox11 start
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager.
It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[NNAL] [20251115-22:41:58] [INFO] Ascend-cann-atb_8.3.RC1_linux-x86_64.run --install --install-path=/usr/local/Ascend/nnal --install-for-all --quiet --nox11 install success
[NNAL] [20251115-22:41:58] [INFO] Ascend-cann-SIP_8.3.RC1_linux-x86_64.run --install --install-path=/usr/local/Ascend/nnal --install-for-all --quiet --nox11 start
[NNAL] [20251115-22:41:59] [INFO] Ascend-cann-SIP_8.3.RC1_linux-x86_64.run --install --install-path=/usr/local/Ascend/nnal --install-for-all --quiet --nox11 install success
[NNAL] [20251115-22:41:59] [INFO] Ascend-cann-nnal_8.3.RC1_linux-x86_64.run install success
Warning!!! If the environment variables of atb and asdsip are set at the same time, unexpected consequences will occur.
Import the corresponding environment variables based on the usage scenarios: atb for large model scenarios, asdsip for embedded scenarios.
Please make sure that the environment variables have been configured.
If you want to use atb module:
    - To take effect for current user, you can exec command below:
      source /usr/local/Ascend/nnal/atb/set_env.sh
      or add "source /usr/local/Ascend/nnal/atb/set_env.sh" to ~/.bashrc.
If you want to use asdsip module:
    - To take effect for current user, you can exec command below:
      source /usr/local/Ascend/nnal/asdsip/set_env.sh
      or add "source /usr/local/Ascend/nnal/asdsip/set_env.sh" to ~/.bashrc.
[root@orangepi lib64]# cat /usr/local/Ascend/nnal/atb/latest/version.info
Ascend-cann-atb : 8.3.RC1
Ascend-cann-atb Version : 8.3.RC1.B106
Platform : x86_64
branch : 8.3.rc1-0702
commit id : 16004f23040e0dcdd3cf0c64ecf36622487038ba

Set the logical NPU cores used for inference to 0 and 1, then test the multimodal understanding model Qwen2.5-VL-7B-Instruct. The run below shows that Qwen2.5-VL-7B-Instruct averages about 20 output tokens per second on 2 x Ascend 310P while accurately understanding the people and actions in the image.

[root@orangepi atb-models]# bash examples/models/qwen2_vl/run_pa.sh --model_path /models/Qwen2.5-VL-7B-Instruct/ --input_image /root/pic/test.jpg
[2025-11-15 22:12:49,663] torch.distributed.run: [WARNING]
[2025-11-15 22:12:49,663] torch.distributed.run: [WARNING] *****************************************
[2025-11-15 22:12:49,663] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
[2025-11-15 22:12:49,663] torch.distributed.run: [WARNING] *****************************************
/usr/local/lib64/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory' If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
/usr/local/lib64/python3.11/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory' If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
warn( 2025-11-15 22:12:53.250 7934 LLM log default format: [yyyy-mm-dd hh:mm:ss.uuuuuu] [processid] [threadid] [llmmodels] [loglevel] [file:line] [status code] msg 2025-11-15 22:12:53.250 7933 LLM log default format: [yyyy-mm-dd hh:mm:ss.uuuuuu] [processid] [threadid] [llmmodels] [loglevel] [file:line] [status code] msg [2025-11-15 22:12:53.250] [7934] [139886327420160] [llmmodels] [WARN] [model_factory.cpp:28] deepseekV2_DecoderModel model already exists, but the duplication doesn't matter. [2025-11-15 22:12:53.250] [7933] [139649439929600] [llmmodels] [WARN] [model_factory.cpp:28] deepseekV2_DecoderModel model already exists, but the duplication doesn't matter. [2025-11-15 22:12:53.250] [7934] [139886327420160] [llmmodels] [WARN] [model_factory.cpp:28] deepseekV2_DecoderModel model already exists, but the duplication doesn't matter. [2025-11-15 22:12:53.250] [7933] [139649439929600] [llmmodels] [WARN] [model_factory.cpp:28] deepseekV2_DecoderModel model already exists, but the duplication doesn't matter. [2025-11-15 22:12:53.250] [7934] [139886327420160] [llmmodels] [WARN] [model_factory.cpp:28] llama_LlamaDecoderModel model already exists, but the duplication doesn't matter. [2025-11-15 22:12:53.250] [7933] [139649439929600] [llmmodels] [WARN] [model_factory.cpp:28] llama_LlamaDecoderModel model already exists, but the duplication doesn't matter. 
[2025-11-15 22:12:55,335] [7934] [139886327420160] [llmmodels] [INFO] [cpu_binding.py-254] : rank_id: 1, device_id: 1, numa_id: 0, shard_devices: [0, 1], cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] [2025-11-15 22:12:55,336] [7934] [139886327420160] [llmmodels] [INFO] [cpu_binding.py-280] : process 7934, new_affinity is [8, 9, 10, 11, 12, 13, 14, 15], cpu count 8 [2025-11-15 22:12:55,356] [7933] [139649439929600] [llmmodels] [INFO] [cpu_binding.py-254] : rank_id: 0, device_id: 0, numa_id: 0, shard_devices: [0, 1], cpus: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] [2025-11-15 22:12:55,357] [7933] [139649439929600] [llmmodels] [INFO] [cpu_binding.py-280] : process 7933, new_affinity is [0, 1, 2, 3, 4, 5, 6, 7], cpu count 8 [2025-11-15 22:12:56,032] [7933] [139649439929600] [llmmodels] [INFO] [model_runner.py-156] : model_runner.quantize: None, model_runner.kv_quant_type: None, model_runner.fa_quant_type: None, model_runner.dtype: torch.float16 [2025-11-15 22:13:01,826] [7933] [139649439929600] [llmmodels] [INFO] [dist.py-81] : initialize_distributed has been Set [2025-11-15 22:13:01,827] [7933] [139649439929600] [llmmodels] [INFO] [model_runner.py-187] : init tokenizer done Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`. [2025-11-15 22:13:02,070] [7934] [139886327420160] [llmmodels] [INFO] [dist.py-81] : initialize_distributed has been Set Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.48, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`. 
[W InferFormat.cpp:62] Warning: Cannot create tensor with NZ format while dim < 2, tensor will be created with ND format. (function operator()) [W InferFormat.cpp:62] Warning: Cannot create tensor with NZ format while dim < 2, tensor will be created with ND format. (function operator()) [2025-11-15 22:13:08,435] [7933] [139649439929600] [llmmodels] [INFO] [flash_causal_qwen2.py-153] : >>>> qwen_QwenDecoderModel is called. [2025-11-15 22:13:08,526] [7934] [139886327420160] [llmmodels] [INFO] [flash_causal_qwen2.py-153] : >>>> qwen_QwenDecoderModel is called. [2025-11-15 22:13:16.666] [7933] [139649439929600] [llmmodels] [WARN] [operation_factory.cpp:42] OperationName: TransdataOperation not find in operation factory map [2025-11-15 22:13:16.698] [7934] [139886327420160] [llmmodels] [WARN] [operation_factory.cpp:42] OperationName: TransdataOperation not find in operation factory map [2025-11-15 22:13:22,379] [7933] [139649439929600] [llmmodels] [INFO] [model_runner.py-282] : model: FlashQwen2vlForCausalLM( (rotary_embedding): PositionRotaryEmbedding() (attn_mask): AttentionMask() (vision_tower): Qwen25VisionTransformerPretrainedModelATB( (encoder): Qwen25VLVisionEncoderATB( (layers): ModuleList( (0-31): 32 x Qwen25VLVisionLayerATB( (attn): VisionAttention( (qkv): TensorParallelColumnLinear( (linear): FastLinear() ) (proj): TensorParallelRowLinear( (linear): FastLinear() ) ) (mlp): VisionMlp( (gate_up_proj): TensorParallelColumnLinear( (linear): FastLinear() ) (down_proj): TensorParallelRowLinear( (linear): FastLinear() ) ) (norm1): BaseRMSNorm() (norm2): BaseRMSNorm() ) ) (patch_embed): FastPatchEmbed( (proj): TensorReplicatedLinear( (linear): FastLinear() ) ) (patch_merger): PatchMerger( (patch_merger_mlp_0): TensorParallelColumnLinear( (linear): FastLinear() ) (patch_merger_mlp_2): TensorParallelRowLinear( (linear): FastLinear() ) (patch_merger_ln_q): BaseRMSNorm() ) ) (rotary_pos_emb): VisionRotaryEmbedding() ) (language_model): FlashQwen2UsingMROPEForCausalLM( 
(rotary_embedding): PositionRotaryEmbedding() (attn_mask): AttentionMask() (transformer): FlashQwenModel( (wte): TensorEmbeddingWithoutChecking() (h): ModuleList( (0-27): 28 x FlashQwenLayer( (attn): FlashQwenAttention( (rotary_emb): PositionRotaryEmbedding() (c_attn): TensorParallelColumnLinear( (linear): FastLinear() ) (c_proj): TensorParallelRowLinear( (linear): FastLinear() ) ) (mlp): QwenMLP( (act): SiLU() (w2_w1): TensorParallelColumnLinear( (linear): FastLinear() ) (c_proj): TensorParallelRowLinear( (linear): FastLinear() ) ) (ln_1): QwenRMSNorm() (ln_2): QwenRMSNorm() ) ) (ln_f): QwenRMSNorm() ) (lm_head): TensorParallelHead( (linear): FastLinear() ) ) ) [2025-11-15 22:13:24,268] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-134] : hbm_capacity(GB): 87.5078125, init_memory(GB): 11.376015624962747 [2025-11-15 22:13:24,789] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-342] : pa_runner: PARunner(model_path=/models/Qwen2.5-VL-7B-Instruct/, input_text=请用超过500个字详细说明图片的内容,并仔细判断画面中的人物是否有吸烟动作。, max_position_embeddings=None, max_input_length=16384, max_output_length=1024, max_prefill_tokens=-1, load_tokenizer=True, enable_atb_torch=False, max_prefill_batch_size=None, max_batch_size=1, dtype=torch.float16, block_size=128, model_config=ModelConfig(num_heads=14, num_kv_heads=2, num_kv_heads_origin=4, head_size=128, k_head_size=128, v_head_size=128, num_layers=28, device=npu:0, dtype=torch.float16, soc_info=NPUSocInfo(soc_name='', soc_version=200, need_nz=True, matmul_nd_nz=False), kv_quant_type=None, fa_quant_type=None, mapping=Mapping(world_size=2, rank=0, num_nodes=1,pp_rank=0, pp_groups=[[0], [1]], micro_batch_size=1, attn_dp_groups=[[0], [1]], attn_tp_groups=[[0, 1]], attn_inner_sp_groups=[[0], [1]], attn_cp_groups=[[0], [1]], attn_o_proj_tp_groups=[[0], [1]], mlp_tp_groups=[[0, 1]], moe_ep_groups=[[0], [1]], moe_tp_groups=[[0, 1]]), cla_share_factor=1, model_type=qwen2_5_vl, enable_nz=False), max_memory=93960798208, [2025-11-15 22:13:24,794] 
[7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-122] : ---------------Begin warm_up--------------- [2025-11-15 22:13:24,794] [7933] [139649439929600] [llmmodels] [INFO] [cache.py-154] : kv cache will allocate 0.46484375GB memory [2025-11-15 22:13:24,821] [7934] [139886327420160] [llmmodels] [INFO] [cache.py-154] : kv cache will allocate 0.46484375GB memory [2025-11-15 22:13:24,827] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1139] : ------total req num: 1, infer start-------- [2025-11-15 22:13:26,002] [7934] [139886327420160] [llmmodels] [INFO] [flash_causal_qwen2.py-680] : <<<<<<<after transdata k_caches[0].shape=torch.Size([136, 16, 128, 16]) [2025-11-15 22:13:26,023] [7933] [139649439929600] [llmmodels] [INFO] [flash_causal_qwen2.py-676] : <<<<<<< ori k_caches[0].shape=torch.Size([136, 16, 128, 16]) [2025-11-15 22:13:26,023] [7933] [139649439929600] [llmmodels] [INFO] [flash_causal_qwen2.py-680] : <<<<<<<after transdata k_caches[0].shape=torch.Size([136, 16, 128, 16]) [2025-11-15 22:13:26,024] [7933] [139649439929600] [llmmodels] [INFO] [flash_causal_qwen2.py-705] : >>>>>>id of kcache is 139645634198608 id of vcache is 139645634198320 [2025-11-15 22:13:34,363] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1294] : Prefill time: 9476.590633392334ms, Prefill average time: 9476.590633392334ms, Decode token time: 54.94809150695801ms, E2E time: 9531.538724899292ms [2025-11-15 22:13:34,363] [7934] [139886327420160] [llmmodels] [INFO] [generate.py-1294] : Prefill time: 9452.020645141602ms, Prefill average time: 9452.020645141602ms, Decode token time: 54.654598236083984ms, E2E time: 9506.675243377686ms [2025-11-15 22:13:34,366] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1326] : -------------------performance dumped------------------------ [2025-11-15 22:13:34,371] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1329] : | batch_size | input_seq_len | output_seq_len | e2e_time(ms) | prefill_time(ms) | 
decoder_token_time(ms) | prefill_count | prefill_average_time(ms) | |-------------:|----------------:|-----------------:|---------------:|-------------------:|-------------------------:|----------------:|---------------------------:| | 1 | 16384 | 2 | 9531.54 | 9476.59 | 54.95 | 1 | 9476.59 | /usr/local/lib64/python3.11/site-packages/torchvision/transforms/functional.py:1603: UserWarning: The default value of the antialias parameter of all the resizing transforms (Resize(), RandomResizedCrop(), etc.) will change from None to True in v0.17, in order to be consistent across the PIL and Tensor backends. To suppress this warning, directly pass antialias=True (recommended, future default), antialias=None (current default, which means False for Tensors and True for PIL), or antialias=False (only works on Tensors - PIL will still use antialiasing). This also applies if you are using the inference transforms from the models weights: update the call to weights.transforms(antialias=True). warnings.warn( [2025-11-15 22:13:35,307] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-148] : warmup_memory(GB): 15.75 [2025-11-15 22:13:35,307] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-153] : ---------------End warm_up--------------- /usr/local/lib64/python3.11/site-packages/torchvision/transforms/functional.py:1603: UserWarning: The default value of the antialias parameter of all the resizing transforms (Resize(), RandomResizedCrop(), etc.) will change from None to True in v0.17, in order to be consistent across the PIL and Tensor backends. To suppress this warning, directly pass antialias=True (recommended, future default), antialias=None (current default, which means False for Tensors and True for PIL), or antialias=False (only works on Tensors - PIL will still use antialiasing). This also applies if you are using the inference transforms from the models weights: update the call to weights.transforms(antialias=True). 
  warnings.warn(
[2025-11-15 22:13:35,363] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1139] : ------total req num: 1, infer start--------
[2025-11-15 22:13:50,021] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1294] : Prefill time: 1004.0028095245361ms, Prefill average time: 1004.0028095245361ms, Decode token time: 13.301290491575836ms, E2E time: 14611.222982406616ms
[2025-11-15 22:13:50,021] [7934] [139886327420160] [llmmodels] [INFO] [generate.py-1294] : Prefill time: 1067.9974555969238ms, Prefill average time: 1067.9974555969238ms, Decode token time: 13.300292536193908ms, E2E time: 14674.196720123291ms
[2025-11-15 22:13:50,025] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1326] : -------------------performance dumped------------------------
[2025-11-15 22:13:50,028] [7933] [139649439929600] [llmmodels] [INFO] [generate.py-1329] :
| batch_size | input_seq_len | output_seq_len | e2e_time(ms) | prefill_time(ms) | decoder_token_time(ms) | prefill_count | prefill_average_time(ms) |
|-----------:|--------------:|---------------:|-------------:|-----------------:|-----------------------:|--------------:|-------------------------:|
|          1 |          1675 |           1024 |      14611.2 |             1004 |                   13.3 |             1 |                     1004 |
[2025-11-15 22:13:50,035] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-385] : Question[0]: [{'image': '/root/pic/test.jpg'}, {'text': '请用超过500个字详细说明图片的内容,并仔细判断画面中的人物是否有吸烟动作。'}]
[2025-11-15 22:13:50,035] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-386] : Answer[0] (translated): This image shows an aerial drone shot in which two workers are standing on snow or ice. They wear orange safety vests and red hard hats, making them highly visible. In the background there is snow and some metal structures, possibly part of a bridge or an industrial facility.

Looking at the details, the worker on the right side of the frame has his right hand raised to his mouth and appears to be smoking; his posture and movements match those typical of a smoker. However, given the resolution and angle of the image, it cannot be fully confirmed that the action is actually taking place. An accurate judgment would require more video footage or a clearer image.

From the aerial perspective, the scene is likely an inspection or monitoring task for an industrial or construction project. The two workers may be carrying out an on-site check or discussing work. The snow and metal structures suggest a cold winter, or a cold climate region.

Drone aerial photography is very common in industry and construction because the high vantage point helps engineers and managers better understand site conditions. The technique saves time and cost and improves efficiency and safety; when flying, it is important to comply with local laws, regulations, and safety rules.

In summary, the image shows a drone shot of two workers standing on snow, one of whom appears to be smoking. Although this cannot be confirmed with complete certainty, their posture and movements make the action a reasonable inference.
[2025-11-15 22:13:50,035] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-387] : Generate[0] token num: 282
[2025-11-15 22:13:50,035] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-389] : Latency(s): 14.721353530883789
[2025-11-15 22:13:50,035] [7933] [139649439929600] [llmmodels] [INFO] [run_pa.py-390] : Throughput(tokens/s): 19.15584728050956

This article walked through the complete process of deploying the MindIE environment in a Docker container on the OrangePi AI Studio and running the Qwen2.5-VL-7B-Instruct multimodal model to recognize smoking actions, verifying that multimodal understanding models run reliably on Ascend 310P devices.
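As a sanity check, the reported throughput follows directly from the logged token count and end-to-end latency:

```python
# Values reported by run_pa.py in the log above.
generated_tokens = 282
latency_s = 14.721353530883789

# Throughput = generated tokens / end-to-end latency.
throughput = generated_tokens / latency_s
print(f"{throughput:.2f} tokens/s")  # matches the logged 19.15584728050956
```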
-
npu-smi cannot retrieve the Serial Number. Is a component missing, or is something else wrong?

root@davinci-mini:/home/HwHiAiUser# npu-smi info -t board -i 0
NPU ID                         : 0
Product Name                   :
Model                          :
Manufacturer                   :
Serial Number                  :
Software Version               : 21.0.3.1
Firmware Version               : 1.79.22.5.220
Board ID                       : 0xbbc
PCB ID                         : NA
BOM ID                         : 0
Chip Count                     : 1
Faulty Chip Count              : 0

root@davinci-mini:/home/HwHiAiUser# npu-smi info
+------------------------------------------------------------------------------+
| npu-smi 21.0.3.1                 Version: 21.0.3.1                           |
+-------------------+-----------------+----------------------------------------+
| NPU   Name        | Health          | Power(W)   Temp(C)                     |
| Chip  Device      | Bus-Id          | AICore(%)  Memory-Usage(MB)            |
+===================+=================+========================================+
| 0     310         | OK              | 8.0        51                          |
| 0     0           | NA              | 0          3440 / 8192                 |
+===================+=================+========================================+
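As a side note for anyone scripting around this: the board query output is plain "key : value" text, so a small parser can flag which fields came back empty (Serial Number and Product Name here). A minimal sketch in Python; the sample text is abbreviated from the output above:

```python
# Parse "npu-smi info -t board" style output into a dict;
# fields with no value after the colon map to "".
def parse_board_info(text: str) -> dict:
    info = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info

sample = """\
NPU ID                         : 0
Product Name                   :
Serial Number                  :
Software Version               : 21.0.3.1
Firmware Version               : 1.79.22.5.220
"""

board = parse_board_info(sample)
missing = [k for k, v in board.items() if not v]
print(missing)  # fields npu-smi left blank
```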
-
My own code used to run fine on the original Huawei Cloud notebook, but on a newly purchased notebook it now fails with the error shown in the screenshot. The new notebook's instance ID is c768c7a7-178f-41b8-86cb-6aaeda31b331. Could you tell me what is wrong with the new notebook?
-
1. Download the model weights

Set up the Python environment:

conda create -n qwq_model python==3.13.6
conda activate qwq_model
pip install modelscope

Download the model (https://www.modelscope.cn/models/Qwen/QwQ-32B) to the specified directory via the modelscope SDK:

mkdir -p /usr/local/data/model_list/model/QwQ-32B
modelscope download --model Qwen/QwQ-32B --local_dir /usr/local/data/model_list/model/QwQ-32B

2. Deploy the model

Edit /etc/sysctl.conf with vim, set net.ipv4.ip_forward to 1, and apply the change with sysctl -p.

docker pull swr.cn-southwest-2.myhuaweicloud.com/atelier/pytorch_ascend:pytorch_2.5.1-cann_8.2.rc1-py_3.11-hce_2.0.2503-aarch64-snt9b-20250729103313-3a25129

Start the container:

docker run -itd \
    --device=/dev/davinci0 \
    --device=/dev/davinci1 \
    --device=/dev/davinci2 \
    --device=/dev/davinci3 \
    -v /etc/localtime:/etc/localtime \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    --device=/dev/davinci_manager \
    --device=/dev/devmm_svm \
    --device=/dev/hisi_hdc \
    -v /var/log/npu/:/usr/slog \
    -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi \
    -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
    -v /usr/local/data/model_list/model:/usr/local/data/model_list/model \
    --net=host \
    --name vllm-qwen \
    91c374f329e4 \
    /bin/bash

Enter the container:

docker exec -it -u ma-user ${container_name} /bin/bash
docker exec -it -u ma-user vllm-qwen /bin/bash

Set the parameters inside the container:

export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export VLLM_PLUGINS=ascend

# VPC subnet
# Must be edited by the user; see the notes below for how
VPC_CIDR="192.168.0.0/16"
VPC_PREFIX=$(echo "$VPC_CIDR" | cut -d'/' -f1 | cut -d'.' -f1-2)
POD_INET_IP=$(ifconfig | grep -oP "(?<=inet\s)$VPC_PREFIX\.\d+\.\d+" | head -n 1)
POD_NETWORK_IFNAME=$(ifconfig | grep -B 1 "$POD_INET_IP" | head -n 1 | awk '{print $1}' | sed 's/://')
echo "POD_INET_IP: $POD_INET_IP"
echo "POD_NETWORK_IFNAME: $POD_NETWORK_IFNAME"

# Specify the communication NIC
export GLOO_SOCKET_IFNAME=$POD_NETWORK_IFNAME
export TP_SOCKET_IFNAME=$POD_NETWORK_IFNAME
export HCCL_SOCKET_IFNAME=$POD_NETWORK_IFNAME
# Set for multi-node scenarios
export RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES=1
# Enable memory optimization
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True
# Expand communication-algorithm orchestration on the device-side AI Vector Core units
export HCCL_OP_EXPANSION_MODE=AIV
# Specify the usable cards, as needed
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
# Specify CPU core binding, as needed
export CPU_AFFINITY_CONF=1
export LD_PRELOAD=/usr/local/lib/libjemalloc.so.2:${LD_PRELOAD}
# ascend-turbo-graph mode is enabled by default; specify the startup plugin
export VLLM_PLUGINS=ascend_vllm
# For acl-graph or eager mode, specify this startup plugin instead
# export VLLM_PLUGINS=ascend
# Use the vLLM v1 backend
export VLLM_USE_V1=1
# Specify the vLLM version
export VLLM_VERSION=0.9.0
export USE_MM_ALL_REDUCE_OP=1
export MM_ALL_REDUCE_OP_THRESHOLD=256
# The following environment variables do not need to be set
unset ENABLE_QWEN_HYPERDRIVE_OPT
unset ENABLE_QWEN_MICROBATCH
unset ENABLE_PHASE_AWARE_QKVO_QUANT
unset DISABLE_QWEN_DP_PROJ

source /home/ma-user/AscendCloud/AscendTurbo/set_env.bash

Launch the API service:

nohup python -m vllm.entrypoints.openai.api_server \
    --model /usr/local/data/model_list/model/QwQ-32B \
    --max-num-seqs=256 \
    --max-model-len=512 \
    --max-num-batched-tokens=512 \
    --tensor-parallel-size=4 \
    --block-size=128 \
    --host=192.168.0.127 \
    --port=18186 \
    --gpu-memory-utilization=0.95 \
    --trust-remote-code \
    --no-enable-prefix-caching \
    --additional-config='{"ascend_turbo_graph_config": {"enabled": true}, "ascend_scheduler_config": {"enabled": true}}' > QwQ-32B.log 2>&1 &

The port number can be customized; just avoid conflicts with ports already in use.

3. Verify the API service

curl http://192.168.0.127:18186/v1/completions \
    -H "Content-Type: application/json" \
    -d '{ "model": "/usr/local/data/model_list/model/QwQ-32B", "prompt": "What is moon", "max_tokens": 64, "temperature": 0.5 }'
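The same verification request can also be issued from Python instead of curl. A minimal sketch: the host, port, and model path are the deployment values used above and must match your setup, and build_completion_request is a hypothetical helper, not part of vLLM; the payload matches what an OpenAI-compatible /v1/completions endpoint expects:

```python
import json

# Build the URL and JSON body for an OpenAI-compatible /v1/completions request.
# Host, port, and model path are the deployment values from the steps above.
def build_completion_request(prompt: str, max_tokens: int = 64, temperature: float = 0.5):
    url = "http://192.168.0.127:18186/v1/completions"
    payload = {
        "model": "/usr/local/data/model_list/model/QwQ-32B",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return url, json.dumps(payload)

url, body = build_completion_request("What is moon")
print(url)
print(body)
# Send with e.g.: curl $url -H "Content-Type: application/json" -d "$body"
```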
-
Hello, I am setting up an Ascend environment in a notebook and need some extra storage for data and other files, but I ran into problems configuring the external OBS mount. The documentation says: "Select a running Notebook instance, click the instance name to enter the Notebook instance details page, go to the 'Storage Configuration' tab, click 'Add Data Storage', and set the mount parameters." I followed this and opened the notebook details page, but could not find anywhere to mount PFS. Could you tell me where this "Storage Configuration" tab is? Also, I see that OBS can be mounted in the Beijing 4 and Shanghai 1 regions, but those regions have no Ascend compute. Please help.
-
[Inference by Day, Training by Night] YOLOv8 NPU Training and Inference on Ascend 310P

Backed by Huawei's MindSpore framework, we ran the complete YOLOv8m training pipeline on the OrangePi AI Studio Pro development board. Training YOLOv8m on a single NPU takes only 6.92 minutes per epoch over 7,000 images, about 69 minutes in total for 10 epochs. The training logs show the loss falling from 6.45 in the first epoch to about 2.58 in the last, indicating healthy training. Throughout training, the NPU's AICore utilization and memory usage stayed at reasonable levels, demonstrating the Ascend 310P's strong performance on object-detection workloads, on par with NVIDIA GPUs, and offering developers an efficient alternative AI computing platform. Through the mindyolo open-source repository, other developers can reproduce this result and build on it.

The OrangePi AI Studio Pro carries the same Ascend 310P chip as the Atlas 300V Pro video-analysis card: two chips in total, each with 96 GB of memory, providing 176 TFLOPS of training compute and 352 TOPS of inference compute. The figure above shows the AICore utilization and memory usage while training yolov8m on a single NPU; each epoch over 7,000 images takes only 6.92 minutes:

2025-09-24 16:47:11,931 [INFO]
2025-09-24 16:47:11,931 [INFO] Please check the above information for the configurations
2025-09-24 16:47:12,050 [WARNING] Parse Model, args: nearest, keep str type
2025-09-24 16:47:12,069 [WARNING] Parse Model, args: nearest, keep str type
2025-09-24 16:47:12,184 [INFO] number of network params, total: 25.896391M, trainable: 25.863252M
2025-09-24 16:47:16,786 [WARNING] Parse Model, args: nearest, keep str type
2025-09-24 16:47:16,807 [WARNING] Parse Model, args: nearest, keep str type
2025-09-24 16:47:16,920 [INFO] number of network params, total: 25.896391M, trainable: 25.863252M
2025-09-24 16:47:31,011 [INFO] ema_weight not exist, default pretrain weight is currently used.
2025-09-24 16:47:31,118 [INFO] Dataset Cache file hash/version check success.
2025-09-24 16:47:31,118 [INFO] Load dataset cache from [/home/orangepi/workspace/mindyolo/examples/finetune_visdrone/train.cache.npy] success.
2025-09-24 16:47:31,142 [INFO] Dataloader num parallel workers: [8]
2025-09-24 16:47:31,240 [INFO] Dataset Cache file hash/version check success.
2025-09-24 16:47:31,240 [INFO] Load dataset cache from [/home/orangepi/workspace/mindyolo/examples/finetune_visdrone/train.cache.npy] success.
2025-09-24 16:47:31,264 [INFO] Dataloader num parallel workers: [8] 2025-09-24 16:47:31,438 [INFO] 2025-09-24 16:47:31,445 [INFO] got 1 active callback as follows: 2025-09-24 16:47:31,445 [INFO] SummaryCallback() 2025-09-24 16:47:31,445 [WARNING] The first epoch will be compiled for the graph, which may take a long time; You can come back later :). 2025-09-24 16:50:38,076 [INFO] Epoch 1/10, Step 100/404, imgsize (640, 640), loss: 6.4507, lbox: 3.8446, lcls: 0.5687, dfl: 2.0375, cur_lr: 0.09257426112890244 2025-09-24 16:50:38,970 [INFO] Epoch 1/10, Step 100/404, step time: 1875.26 ms 2025-09-24 16:52:21,629 [INFO] Epoch 1/10, Step 200/404, imgsize (640, 640), loss: 4.8078, lbox: 3.0080, lcls: 0.4118, dfl: 1.3880, cur_lr: 0.08514851331710815 2025-09-24 16:52:21,653 [INFO] Epoch 1/10, Step 200/404, step time: 1026.83 ms 2025-09-24 16:54:04,347 [INFO] Epoch 1/10, Step 300/404, imgsize (640, 640), loss: 4.0795, lbox: 2.4281, lcls: 0.3466, dfl: 1.3048, cur_lr: 0.07772277295589447 2025-09-24 16:54:04,371 [INFO] Epoch 1/10, Step 300/404, step time: 1027.18 ms 2025-09-24 16:55:47,067 [INFO] Epoch 1/10, Step 400/404, imgsize (640, 640), loss: 3.8245, lbox: 2.1755, lcls: 0.3567, dfl: 1.2923, cur_lr: 0.07029703259468079 2025-09-24 16:55:47,091 [INFO] Epoch 1/10, Step 400/404, step time: 1027.19 ms 2025-09-24 16:55:52,087 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-1_404.ckpt 2025-09-24 16:55:52,087 [INFO] Epoch 1/10, epoch time: 8.34 min. 
2025-09-24 16:57:34,759 [INFO] Epoch 2/10, Step 100/404, imgsize (640, 640), loss: 3.8083, lbox: 2.2584, lcls: 0.3404, dfl: 1.2095, cur_lr: 0.062162574380636215 2025-09-24 16:57:34,768 [INFO] Epoch 2/10, Step 100/404, step time: 1026.80 ms 2025-09-24 16:59:17,441 [INFO] Epoch 2/10, Step 200/404, imgsize (640, 640), loss: 3.7835, lbox: 2.2670, lcls: 0.3574, dfl: 1.1592, cur_lr: 0.05465514957904816 2025-09-24 16:59:17,450 [INFO] Epoch 2/10, Step 200/404, step time: 1026.82 ms 2025-09-24 17:01:00,127 [INFO] Epoch 2/10, Step 300/404, imgsize (640, 640), loss: 3.5251, lbox: 2.0144, lcls: 0.3210, dfl: 1.1898, cur_lr: 0.0471477210521698 2025-09-24 17:01:00,136 [INFO] Epoch 2/10, Step 300/404, step time: 1026.85 ms 2025-09-24 17:02:42,826 [INFO] Epoch 2/10, Step 400/404, imgsize (640, 640), loss: 3.5596, lbox: 2.0947, lcls: 0.3086, dfl: 1.1563, cur_lr: 0.03964029625058174 2025-09-24 17:02:42,835 [INFO] Epoch 2/10, Step 400/404, step time: 1026.99 ms 2025-09-24 17:02:47,745 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-2_404.ckpt 2025-09-24 17:02:47,745 [INFO] Epoch 2/10, epoch time: 6.93 min. 
2025-09-24 17:04:30,489 [INFO] Epoch 3/10, Step 100/404, imgsize (640, 640), loss: 3.5524, lbox: 2.1004, lcls: 0.2938, dfl: 1.1582, cur_lr: 0.031090890988707542
2025-09-24 17:04:30,497 [INFO] Epoch 3/10, Step 100/404, step time: 1027.52 ms
2025-09-24 17:06:13,196 [INFO] Epoch 3/10, Step 200/404, imgsize (640, 640), loss: 3.8549, lbox: 2.2845, lcls: 0.3526, dfl: 1.2178, cur_lr: 0.02350178174674511
2025-09-24 17:06:13,205 [INFO] Epoch 3/10, Step 200/404, step time: 1027.07 ms
2025-09-24 17:07:55,875 [INFO] Epoch 3/10, Step 300/404, imgsize (640, 640), loss: 3.6236, lbox: 2.1016, lcls: 0.3113, dfl: 1.2106, cur_lr: 0.015912672504782677
2025-09-24 17:07:55,883 [INFO] Epoch 3/10, Step 300/404, step time: 1026.78 ms
2025-09-24 17:09:38,572 [INFO] Epoch 3/10, Step 400/404, imgsize (640, 640), loss: 3.5586, lbox: 2.0730, lcls: 0.3314, dfl: 1.1542, cur_lr: 0.008323564194142818
2025-09-24 17:09:38,581 [INFO] Epoch 3/10, Step 400/404, step time: 1026.97 ms
2025-09-24 17:09:43,528 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-3_404.ckpt
2025-09-24 17:09:43,529 [INFO] Epoch 3/10, epoch time: 6.93 min.
2025-09-24 17:11:26,211 [INFO] Epoch 4/10, Step 100/404, imgsize (640, 640), loss: 3.3767, lbox: 1.9760, lcls: 0.2928, dfl: 1.1079, cur_lr: 0.007029999978840351
2025-09-24 17:11:26,218 [INFO] Epoch 4/10, Step 100/404, step time: 1026.90 ms
2025-09-24 17:13:08,899 [INFO] Epoch 4/10, Step 200/404, imgsize (640, 640), loss: 3.4213, lbox: 1.9382, lcls: 0.3052, dfl: 1.1779, cur_lr: 0.007029999978840351
2025-09-24 17:13:08,908 [INFO] Epoch 4/10, Step 200/404, step time: 1026.89 ms
2025-09-24 17:14:51,583 [INFO] Epoch 4/10, Step 300/404, imgsize (640, 640), loss: 2.8313, lbox: 1.5666, lcls: 0.2380, dfl: 1.0267, cur_lr: 0.007029999978840351
2025-09-24 17:14:51,591 [INFO] Epoch 4/10, Step 300/404, step time: 1026.83 ms
2025-09-24 17:16:34,277 [INFO] Epoch 4/10, Step 400/404, imgsize (640, 640), loss: 3.2905, lbox: 1.9274, lcls: 0.2889, dfl: 1.0741, cur_lr: 0.007029999978840351
2025-09-24 17:16:34,285 [INFO] Epoch 4/10, Step 400/404, step time: 1026.94 ms
2025-09-24 17:16:39,232 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-4_404.ckpt
2025-09-24 17:16:39,232 [INFO] Epoch 4/10, epoch time: 6.93 min.
2025-09-24 17:18:21,892 [INFO] Epoch 5/10, Step 100/404, imgsize (640, 640), loss: 3.1534, lbox: 1.7844, lcls: 0.2581, dfl: 1.1109, cur_lr: 0.006039999891072512
2025-09-24 17:18:21,900 [INFO] Epoch 5/10, Step 100/404, step time: 1026.67 ms
2025-09-24 17:20:04,596 [INFO] Epoch 5/10, Step 200/404, imgsize (640, 640), loss: 3.1152, lbox: 1.7685, lcls: 0.2518, dfl: 1.0949, cur_lr: 0.006039999891072512
2025-09-24 17:20:04,604 [INFO] Epoch 5/10, Step 200/404, step time: 1027.04 ms
2025-09-24 17:21:47,284 [INFO] Epoch 5/10, Step 300/404, imgsize (640, 640), loss: 3.3179, lbox: 1.8412, lcls: 0.2888, dfl: 1.1880, cur_lr: 0.006039999891072512
2025-09-24 17:21:47,292 [INFO] Epoch 5/10, Step 300/404, step time: 1026.88 ms
2025-09-24 17:23:29,968 [INFO] Epoch 5/10, Step 400/404, imgsize (640, 640), loss: 3.2193, lbox: 1.8366, lcls: 0.2620, dfl: 1.1207, cur_lr: 0.006039999891072512
2025-09-24 17:23:29,976 [INFO] Epoch 5/10, Step 400/404, step time: 1026.84 ms
2025-09-24 17:23:34,954 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-5_404.ckpt
2025-09-24 17:23:34,954 [INFO] Epoch 5/10, epoch time: 6.93 min.
2025-09-24 17:25:17,530 [INFO] Epoch 6/10, Step 100/404, imgsize (640, 640), loss: 2.7642, lbox: 1.5834, lcls: 0.2164, dfl: 0.9643, cur_lr: 0.005049999803304672
2025-09-24 17:25:17,538 [INFO] Epoch 6/10, Step 100/404, step time: 1025.84 ms
2025-09-24 17:27:00,125 [INFO] Epoch 6/10, Step 200/404, imgsize (640, 640), loss: 2.6854, lbox: 1.4272, lcls: 0.2080, dfl: 1.0502, cur_lr: 0.005049999803304672
2025-09-24 17:27:00,134 [INFO] Epoch 6/10, Step 200/404, step time: 1025.96 ms
2025-09-24 17:28:42,720 [INFO] Epoch 6/10, Step 300/404, imgsize (640, 640), loss: 2.7541, lbox: 1.5028, lcls: 0.2171, dfl: 1.0342, cur_lr: 0.005049999803304672
2025-09-24 17:28:42,728 [INFO] Epoch 6/10, Step 300/404, step time: 1025.94 ms
2025-09-24 17:30:25,315 [INFO] Epoch 6/10, Step 400/404, imgsize (640, 640), loss: 2.8092, lbox: 1.5545, lcls: 0.2121, dfl: 1.0427, cur_lr: 0.005049999803304672
2025-09-24 17:30:25,323 [INFO] Epoch 6/10, Step 400/404, step time: 1025.95 ms
2025-09-24 17:30:30,293 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-6_404.ckpt
2025-09-24 17:30:30,294 [INFO] Epoch 6/10, epoch time: 6.92 min.
2025-09-24 17:32:12,881 [INFO] Epoch 7/10, Step 100/404, imgsize (640, 640), loss: 3.0997, lbox: 1.8226, lcls: 0.2402, dfl: 1.0369, cur_lr: 0.00406000018119812
2025-09-24 17:32:12,890 [INFO] Epoch 7/10, Step 100/404, step time: 1025.96 ms
2025-09-24 17:33:55,477 [INFO] Epoch 7/10, Step 200/404, imgsize (640, 640), loss: 2.8140, lbox: 1.5979, lcls: 0.2143, dfl: 1.0018, cur_lr: 0.00406000018119812
2025-09-24 17:33:55,485 [INFO] Epoch 7/10, Step 200/404, step time: 1025.96 ms
2025-09-24 17:35:38,072 [INFO] Epoch 7/10, Step 300/404, imgsize (640, 640), loss: 3.0294, lbox: 1.6439, lcls: 0.2544, dfl: 1.1310, cur_lr: 0.00406000018119812
2025-09-24 17:35:38,081 [INFO] Epoch 7/10, Step 300/404, step time: 1025.95 ms
2025-09-24 17:37:20,660 [INFO] Epoch 7/10, Step 400/404, imgsize (640, 640), loss: 2.8015, lbox: 1.5686, lcls: 0.2252, dfl: 1.0077, cur_lr: 0.00406000018119812
2025-09-24 17:37:20,669 [INFO] Epoch 7/10, Step 400/404, step time: 1025.88 ms
2025-09-24 17:37:25,643 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-7_404.ckpt
2025-09-24 17:37:25,644 [INFO] Epoch 7/10, epoch time: 6.92 min.
2025-09-24 17:39:08,227 [INFO] Epoch 8/10, Step 100/404, imgsize (640, 640), loss: 2.5091, lbox: 1.3373, lcls: 0.1711, dfl: 1.0007, cur_lr: 0.0030700000934302807
2025-09-24 17:39:08,236 [INFO] Epoch 8/10, Step 100/404, step time: 1025.92 ms
2025-09-24 17:40:50,818 [INFO] Epoch 8/10, Step 200/404, imgsize (640, 640), loss: 2.5926, lbox: 1.4141, lcls: 0.1923, dfl: 0.9863, cur_lr: 0.0030700000934302807
2025-09-24 17:40:50,826 [INFO] Epoch 8/10, Step 200/404, step time: 1025.91 ms
2025-09-24 17:42:33,392 [INFO] Epoch 8/10, Step 300/404, imgsize (640, 640), loss: 2.5341, lbox: 1.3811, lcls: 0.1869, dfl: 0.9660, cur_lr: 0.0030700000934302807
2025-09-24 17:42:33,400 [INFO] Epoch 8/10, Step 300/404, step time: 1025.74 ms
2025-09-24 17:44:15,994 [INFO] Epoch 8/10, Step 400/404, imgsize (640, 640), loss: 3.0024, lbox: 1.6379, lcls: 0.2284, dfl: 1.1361, cur_lr: 0.0030700000934302807
2025-09-24 17:44:16,002 [INFO] Epoch 8/10, Step 400/404, step time: 1026.02 ms
2025-09-24 17:44:20,974 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-8_404.ckpt
2025-09-24 17:44:20,975 [INFO] Epoch 8/10, epoch time: 6.92 min.
2025-09-24 17:46:03,561 [INFO] Epoch 9/10, Step 100/404, imgsize (640, 640), loss: 3.0890, lbox: 1.8395, lcls: 0.2321, dfl: 1.0174, cur_lr: 0.0020800000056624413
2025-09-24 17:46:03,569 [INFO] Epoch 9/10, Step 100/404, step time: 1025.94 ms
2025-09-24 17:47:46,157 [INFO] Epoch 9/10, Step 200/404, imgsize (640, 640), loss: 2.9621, lbox: 1.6608, lcls: 0.2360, dfl: 1.0652, cur_lr: 0.0020800000056624413
2025-09-24 17:47:46,166 [INFO] Epoch 9/10, Step 200/404, step time: 1025.96 ms
2025-09-24 17:49:28,755 [INFO] Epoch 9/10, Step 300/404, imgsize (640, 640), loss: 2.4801, lbox: 1.3320, lcls: 0.1753, dfl: 0.9728, cur_lr: 0.0020800000056624413
2025-09-24 17:49:28,763 [INFO] Epoch 9/10, Step 300/404, step time: 1025.97 ms
2025-09-24 17:51:11,359 [INFO] Epoch 9/10, Step 400/404, imgsize (640, 640), loss: 2.8075, lbox: 1.5971, lcls: 0.1995, dfl: 1.0109, cur_lr: 0.0020800000056624413
2025-09-24 17:51:11,367 [INFO] Epoch 9/10, Step 400/404, step time: 1026.03 ms
2025-09-24 17:51:16,330 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-9_404.ckpt
2025-09-24 17:51:16,331 [INFO] Epoch 9/10, epoch time: 6.92 min.
2025-09-24 17:52:58,913 [INFO] Epoch 10/10, Step 100/404, imgsize (640, 640), loss: 2.6278, lbox: 1.4529, lcls: 0.1860, dfl: 0.9889, cur_lr: 0.0010900000343099236
2025-09-24 17:52:58,921 [INFO] Epoch 10/10, Step 100/404, step time: 1025.90 ms
2025-09-24 17:54:41,521 [INFO] Epoch 10/10, Step 200/404, imgsize (640, 640), loss: 2.7550, lbox: 1.5724, lcls: 0.2083, dfl: 0.9742, cur_lr: 0.0010900000343099236
2025-09-24 17:54:41,529 [INFO] Epoch 10/10, Step 200/404, step time: 1026.08 ms
2025-09-24 17:56:24,125 [INFO] Epoch 10/10, Step 300/404, imgsize (640, 640), loss: 2.4470, lbox: 1.2448, lcls: 0.1758, dfl: 1.0263, cur_lr: 0.0010900000343099236
2025-09-24 17:56:24,133 [INFO] Epoch 10/10, Step 300/404, step time: 1026.03 ms
2025-09-24 17:58:06,727 [INFO] Epoch 10/10, Step 400/404, imgsize (640, 640), loss: 2.5783, lbox: 1.3733, lcls: 0.1848, dfl: 1.0202, cur_lr: 0.0010900000343099236
2025-09-24 17:58:06,736 [INFO] Epoch 10/10, Step 400/404, step time: 1026.02 ms
2025-09-24 17:58:11,744 [INFO] Saving model to ./runs/2025.09.24-16.47.11/weights/yolov8m-10_404.ckpt
2025-09-24 17:58:11,745 [INFO] Epoch 10/10, epoch time: 6.92 min.
2025-09-24 17:58:12,149 [INFO] End Train.
2025-09-24 17:58:12,561 [INFO] Training completed.

After training for 10 epochs, here are the NPU inference results on a test-set image:

2025-09-24 18:13:24,511 [WARNING] Parse Model, args: nearest, keep str type
2025-09-24 18:13:24,532 [WARNING] Parse Model, args: nearest, keep str type
2025-09-24 18:13:24,639 [INFO] number of network params, total: 25.896391M, trainable: 25.863252M
2025-09-24 18:13:29,405 [INFO] Load checkpoint from [/home/orangepi/workspace/mindyolo/runs/2025.09.24-16.47.11/weights/yolov8m-10_404.ckpt] success.
2025-09-24 18:13:53,915 [INFO] Predict result is: {'category_id': [4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 1, 4, 4, 5, 10, 4, 1, 4, 2, 4, 1, 5, 10, 4, 2, 4, 1], 'bbox': [[866.402, 359.922, 125.209, 179.961], [619.836, 379.246, 140.848, 229.434], [704.238, 192.678, 102.631, 112.359], [572.588, 189.689, 108.707, 103.76], [80.484, 471.75, 334.953, 243.844], [739.99, 15.987, 60.305, 60.944], [1179.242, 68.017, 143.637, 56.163], [1220.215, 154.843, 138.523, 76.782], [1217.559, 108.026, 140.516, 63.733], [822.475, 15.34, 56.744, 75.039], [621.438, 70.781, 19.938, 55.292], [1106.859, 128.463, 79.986, 95.99], [773.168, 90.047, 71.42, 95.293], [773.467, 88.951, 70.988, 95.924], [1122.158, 371.145, 48.12, 90.512], [1168.982, 2.274, 83.141, 77.081], [723.45, 65.277, 21.877, 51.017], [1145.906, 0.556, 76.467, 46.708], [672.513, 71.818, 25.857, 46.933], [488.816, 350.559, 107.844, 117.605], [672.778, 71.918, 26.172, 48.194], [1106.826, 128.612, 79.621, 96.239], [1058.831, 319.314, 35.087, 75.056], [1146.62, 0.365, 54.586, 48.643], [1124.963, 370.945, 42.359, 66.473], [1148.197, 1.046, 92.537, 51.581], [526.153, 87.349, 29.123, 37.91]], 'score': [0.93223, 0.92336, 0.90671, 0.90539, 0.84414, 0.83682, 0.83292, 0.75641, 0.74857, 0.74295, 0.72221, 0.63341, 0.62439, 0.5829, 0.50411, 0.48259, 0.42391, 0.42188, 0.42185, 0.36533, 0.29963, 0.29451, 0.29264, 0.28265, 0.26525, 0.2585, 0.25038]}
2025-09-24 18:13:53,915 [INFO] Speed: 24481.6/5.7/24487.3 ms inference/NMS/total per 640x640 image at batch-size 1;
2025-09-24 18:13:53,915 [INFO] Detect a image success.
2025-09-24 18:13:53,924 [INFO] Infer completed.

The training and inference code can be downloaded from the mindyolo repository: https://github.com/mindspore-lab/mindyolo
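The raw predict result above mixes high- and low-confidence boxes. A minimal post-processing sketch in plain Python (no MindSpore dependency): the dict layout follows the log above, while the 0.5 threshold and the helper name `filter_detections` are our own choices, not part of mindyolo.

```python
def filter_detections(result, score_thr=0.5):
    """Keep only detections whose score is >= score_thr.

    `result` follows the mindyolo predict-log format: parallel lists
    'category_id', 'bbox' ([x, y, w, h]) and 'score'.
    """
    kept = [
        {"category_id": c, "bbox": b, "score": s}
        for c, b, s in zip(result["category_id"], result["bbox"], result["score"])
        if s >= score_thr
    ]
    # highest-confidence detections first
    return sorted(kept, key=lambda d: d["score"], reverse=True)

if __name__ == "__main__":
    # a few entries from the predict log above
    sample = {
        "category_id": [4, 1, 10],
        "bbox": [[866.402, 359.922, 125.209, 179.961],
                 [621.438, 70.781, 19.938, 55.292],
                 [1058.831, 319.314, 35.087, 75.056]],
        "score": [0.93223, 0.72221, 0.29264],
    }
    for det in filter_detections(sample, 0.5):
        print(det["category_id"], det["score"])
```

With the 0.5 threshold, the two low-score duplicates near the bottom of the score list would be dropped while the confident person/cigarette boxes survive.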
-
How to upgrade CANN, PyTorch, and MindSpore on the OrangePi AI Studio Pro

1. Install CANN and PyTorch

First, in the Ascend resource download center, under hardware information choose product series: accelerator card, product model: Atlas 300V Pro video parsing card, and CANN version: 8.2.RC1, then download the CANN packages and obtain the PyTorch source. I am using the Ubuntu 22.04 test image from OrangePi with the AI environment pre-installed, so we only need to upgrade Ascend-cann-toolkit_8.2.RC1_linux-x86_64.run and Ascend-cann-kernels-310p_8.2.RC1_linux-x86_64.run, plus torch_npu-2.1.0.post13-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

First switch to the root user, refresh the package lists, and install g++-12:

sudo apt update
sudo apt install -y g++-12

Then change into the directory where the CANN packages were downloaded and run the following commands in order:

chmod +x ./Ascend-cann-toolkit_8.2.RC1_linux-x86_64.run
./Ascend-cann-toolkit_8.2.RC1_linux-x86_64.run --full --quiet
chmod +x ./Ascend-cann-kernels-310p_8.2.RC1_linux-x86_64.run
./Ascend-cann-kernels-310p_8.2.RC1_linux-x86_64.run --install --quiet
pip3 install torch_npu-2.1.0.post13-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

Run the following commands to verify that CANN and torch_npu installed correctly:

source /usr/local/Ascend/ascend-toolkit/set_env.sh
python3 -c "import torch;import torch_npu; a = torch.randn(3, 4).npu(); print(a + a);"

2. Upgrade MindSpore

Visit the MindSpore website, select the CANN 8.2.RC1 version we just installed, and choose the remaining options according to your device. Switch to the root user and run the install command:

sudo su
pip3 install mindspore==2.7.0 -i https://repo.mindspore.cn/pypi/simple --trusted-host repo.mindspore.cn --extra-index-url https://repo.huaweicloud.com/repository/pypi/simple

After installation, run the following to verify:

source /usr/local/Ascend/ascend-toolkit/set_env.sh
python3 -c "import mindspore;mindspore.set_context(device_target='Ascend');mindspore.run_check()"

If you see the output below, MindSpore was installed successfully:

[WARNING] ME(1621400:139701939115840,MainProcess):2025-09-24-10:46:21.978.000 [mindspore/context.py:1412] For 'context.set_context', the parameter 'device_target' will be deprecated and removed in a future version. Please use the api mindspore.set_device() instead.
MindSpore version: 2.7.0
[WARNING] GE_ADPT(1621400,7f0e18710640,python3):2025-09-24-10:46:23.323.570 [mindspore/ops/kernel/ascend/acl_ir/op_api_exec.cc:169] GetAscendDefaultCustomPath] Checking whether the so exists or if permission to access it is available: /usr/local/Ascend/ascend-toolkit/latest/opp/vendors/customize_vision/op_api/lib/libcust_opapi.so
The result of multiplication calculation is correct, MindSpore has been installed on platform [Ascend] successfully!

3. Summary

This article walked through the complete process of upgrading CANN, PyTorch, and MindSpore on the OrangePi AI Studio Pro board. Following these steps, developers can easily bring these key AI components up to the latest versions and get the most out of the platform's AI compute capability.
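The torch_npu wheel installed above is built for CPython 3.10 (the `cp310` tag). Before installing on a different image, you can check that a wheel's filename tags match your interpreter. This sketch parses the PEP 427 filename fields; the helper name is ours, not part of any Ascend tooling:

```python
def parse_wheel_filename(name):
    """Split a wheel filename into its PEP 427 fields:
    {dist}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl
    """
    if not name.endswith(".whl"):
        raise ValueError("not a wheel filename: " + name)
    parts = name[:-4].split("-")
    # the last three fields are always the python / abi / platform tags
    return {
        "dist": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }

if __name__ == "__main__":
    info = parse_wheel_filename(
        "torch_npu-2.1.0.post13-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl")
    # 'cp310' means the wheel targets CPython 3.10, matching the image's python3
    print(info["python"], info["platform"])
```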
-
Technical notes on deploying a text-generation LLM on the Ascend platform

1. Check the environment

1.1 Make sure the NPU devices are healthy

npu-smi info                                                 # run on each node to see the NPU card status
npu-smi info -l | grep Total                                 # run on each node to see the total card count and confirm all cards are attached
npu-smi info -t board -i 1 | egrep -i "software|firmware"    # check the driver and firmware versions

1.2 Make sure Docker is working

docker -v    # check whether docker is installed
yum install -y docker-engine.aarch64 docker-engine-selinux.noarch docker-runc.aarch64

1.3 Enable IP forwarding

vim /etc/sysctl.conf    # set net.ipv4.ip_forward=1
sysctl -p               # apply the change

2. Build the container

2.1 Pull the image

docker pull swr.cn-southwest-2.myhuaweicloud.com/ei_ascendcloud_devops/llm_inference:906_a2_20250821

This is the image used to run the LLM service.

2.2 Start the container

docker run -itd \
  --device=/dev/davinci0 \
  --device=/dev/davinci1 \
  --device=/dev/davinci2 \
  --device=/dev/davinci3 \
  --device=/dev/davinci4 \
  --device=/dev/davinci5 \
  --device=/dev/davinci6 \
  --device=/dev/davinci7 \
  -v /etc/localtime:/etc/localtime \
  -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
  -v /etc/ascend_install.info:/etc/ascend_install.info \
  --device=/dev/davinci_manager \
  --device=/dev/devmm_svm \
  --device=/dev/hisi_hdc \
  -v /var/log/npu/:/usr/slog \
  -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi \
  -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
  -v ${dir}:${container_model_path} \
  --net=host \
  --name ${container_name} \
  ${image_id} \
  /bin/bash

${container_name}: the container name, used later when entering the container; pick your own.
${image_id}: the docker image ID, found via docker images.

Example:

docker run -itd \
  --device=/dev/davinci0 \
  --device=/dev/davinci1 \
  --device=/dev/davinci2 \
  --device=/dev/davinci3 \
  --device=/dev/davinci4 \
  --device=/dev/davinci5 \
  --device=/dev/davinci6 \
  --device=/dev/davinci7 \
  -v /etc/localtime:/etc/localtime \
  -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
  -v /etc/ascend_install.info:/etc/ascend_install.info \
  --device=/dev/davinci_manager \
  --device=/dev/devmm_svm \
  --device=/dev/hisi_hdc \
  -v /var/log/npu/:/usr/slog \
  -v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi \
  -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
  -v /usr/local/data/model_list/model:/usr/local/data/model_list/model \
  --net=host \
  --name vllm-qwen \
  91c374f329e4 \
  /bin/bash

2.3 Prepare the container environment

Enter the container:

docker exec -it -u ma-user ${container_name} /bin/bash

Then set up the environment:

export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export VLLM_PLUGINS=ascend

# VPC subnet
# must be edited by hand (see the notes below); VPC_CIDR is the server's private-network CIDR
VPC_CIDR="192.168.0.0/16"
VPC_PREFIX=$(echo "$VPC_CIDR" | cut -d'/' -f1 | cut -d'.' -f1-2)
POD_INET_IP=$(ifconfig | grep -oP "(?<=inet\s)$VPC_PREFIX\.\d+\.\d+" | head -n 1)
POD_NETWORK_IFNAME=$(ifconfig | grep -B 1 "$POD_INET_IP" | head -n 1 | awk '{print $1}' | sed 's/://')
echo "POD_INET_IP: $POD_INET_IP"
echo "POD_NETWORK_IFNAME: $POD_NETWORK_IFNAME"

# use this NIC for communication
export GLOO_SOCKET_IFNAME=$POD_NETWORK_IFNAME
export TP_SOCKET_IFNAME=$POD_NETWORK_IFNAME
export HCCL_SOCKET_IFNAME=$POD_NETWORK_IFNAME
# required in multi-node setups
export RAY_EXPERIMENTAL_NOSET_ASCEND_RT_VISIBLE_DEVICES=1

# enable memory optimization
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True
# expand collective-communication orchestration on the device-side AI Vector Core units
export HCCL_OP_EXPANSION_MODE=AIV
# cards to use, adjust as needed
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
# core binding, adjust as needed
export CPU_AFFINITY_CONF=1
export LD_PRELOAD=/usr/local/lib/libjemalloc.so.2:${LD_PRELOAD}
# ascend-turbo-graph mode is enabled by default; select its startup plugin
export VLLM_PLUGINS=ascend_vllm
# for acl-graph or eager mode, select this plugin instead
# export VLLM_PLUGINS=ascend
# use the vllm v1 backend
export VLLM_USE_V1=1
# pin the vllm version
export VLLM_VERSION=0.9.0

export USE_MM_ALL_REDUCE_OP=1
export MM_ALL_REDUCE_OP_THRESHOLD=256

# the following environment variables must not be set
unset ENABLE_QWEN_HYPERDRIVE_OPT
unset ENABLE_QWEN_MICROBATCH
unset ENABLE_PHASE_AWARE_QKVO_QUANT
unset DISABLE_QWEN_DP_PROJ

source /home/ma-user/AscendCloud/AscendTurbo/set_env.bash

2.4 Launch the LLM API service

nohup python -m vllm.entrypoints.openai.api_server \
  --model /usr/local/data/model_list/model/QwQ-32B \
  --max-num-seqs=256 \
  --max-model-len=512 \
  --max-num-batched-tokens=512 \
  --tensor-parallel-size=4 \
  --block-size=128 \
  --host=192.168.0.127 \
  --port=18186 \
  --gpu-memory-utilization=0.95 \
  --trust-remote-code \
  --no-enable-prefix-caching \
  --additional-config='{"ascend_turbo_graph_config": {"enabled": true}, "ascend_scheduler_config": {"enabled": true}}' > QwQ-32B.log 2>&1 &

--model: path to the model weights
--host: the server's private IP, found via ifconfig
--port: the API port, configurable
QwQ-32B.log: the log file to write to, configurable

2.5 Verify the LLM API service

curl http://${docker_ip}:${port}/v1/completions \
  -H "Content-Type: application/json" \
  -d '{ "model": "${container_model_path}", "prompt": "hello", "max_tokens": 128, "temperature": 0 }'

Replace ${docker_ip} with the actual host IP, ${port} with the port configured in step 2.4, and ${container_model_path} with the model path.

Example request:

curl http://192.168.0.127:18186/v1/completions \
  -H "Content-Type: application/json" \
  -d '{ "model": "/usr/local/data/model_list/model/QwQ-32B", "prompt": "What is moon", "max_tokens": 128, "temperature": 0.5 }'

Example response:

{"id":"cmpl-e96e239e2a3b490da361622879eb9c2c","object":"text_completion","created":1757919227,"model":"/usr/local/data/model_list/model/QwQ-32B","choices":[{"index":0,"text":"light made of?\n\nWhat is moon made of?\n\nPlease tell me if those questions are the same.\nOkay, so I need to figure out what moonlight is made of and what the moon itself is made of. Let me start by breaking down each question.\n\nFirst, \"What is moonlight made of?\" Hmm, moonlight. I know that the moon doesn't produce its own light. So, moonlight must be reflected sunlight, right? Like, the sun shines on the moon, and then the moon reflects that light back to Earth. So, if that's the case, then moonlight is just sunlight that's been reflected","logprobs":null,"finish_reason":"length","stop_reason":null,"prompt_logprobs":null}],"usage":{"prompt_tokens":3,"total_tokens":131,"completion_tokens":128,"prompt_tokens_details":null},"kv_transfer_params":null}
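The curl check can also be scripted. A hedged sketch using only the Python standard library: the endpoint and model path follow the example above, the payload mirrors the curl body, and the helper names (`build_completion_request`, `extract_text`, `query`) are our own.

```python
import json
from urllib import request

API_URL = "http://192.168.0.127:18186/v1/completions"      # host/port from step 2.4
MODEL_PATH = "/usr/local/data/model_list/model/QwQ-32B"    # model path from step 2.4

def build_completion_request(model, prompt, max_tokens=128, temperature=0.5):
    """Build the JSON body for the OpenAI-compatible /v1/completions endpoint."""
    return {"model": model, "prompt": prompt,
            "max_tokens": max_tokens, "temperature": temperature}

def extract_text(response):
    """Pull the generated text out of a /v1/completions response dict."""
    return response["choices"][0]["text"]

def query(prompt, api_url=API_URL, model=MODEL_PATH):
    """Send one completion request; requires the service from 2.4 to be running."""
    body = build_completion_request(model, prompt)
    req = request.Request(api_url, data=json.dumps(body).encode(),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return extract_text(json.load(resp))
```

With the service up, `print(query("What is moon"))` should return the same kind of text seen in the example response above.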
-
cid:link_0
-
A machine in Guiyang, instance ID c768c7a7-9633-47d0-adcf-4ed17a252381, name notebook-c51a:

ERROR 08-25 09:20:47 [core.py:586] File "/vllm-workspace/LMCache-Ascend/lmcache_ascend/integration/vllm/vllm_v1_adapter.py", line 155, in init_lmcache_engine
ERROR 08-25 09:20:47 [core.py:586] engine = LMCacheEngineBuilder.get_or_create(
ERROR 08-25 09:20:47 [core.py:586] File "/vllm-workspace/LMCache/lmcache/v1/cache_engine.py", line 947, in get_or_create
ERROR 08-25 09:20:47 [core.py:586] memory_allocator = cls._Create_memory_allocator(config, metadata)
ERROR 08-25 09:20:47 [core.py:586] File "/vllm-workspace/LMCache-Ascend/lmcache_ascend/v1/cache_engine.py", line 21, in _ascend_create_memory_allocator
ERROR 08-25 09:20:47 [core.py:586] return AscendMixedMemoryAllocator(int(max_local_cpu_size * 1024**3))
ERROR 08-25 09:20:47 [core.py:586] File "/vllm-workspace/LMCache-Ascend/lmcache_ascend/v1/memory_management.py", line 69, in __init__
ERROR 08-25 09:20:47 [core.py:586] lmc_ops.host_register(self.buffer)
ERROR 08-25 09:20:47 [core.py:586] RuntimeError: Unable to pin host memory with error code: -1
ERROR 08-25 09:20:47 [core.py:586] Exception raised from halRegisterHostPtr at /vllm-workspace/LMCache-Ascend/csrc/managed_mem.cpp:109 (most recent call first):
ERROR 08-25 09:20:47 [core.py:586] frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0xb8 (0xfffc2cf2c908 in /usr/local/python3.10.17/lib/python3.10/site-packages/torch/lib/libc10.so)
ERROR 08-25 09:20:47 [core.py:586] frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x6c (0xfffc2cedb404 in /usr/local/python3.10.17/lib/python3.10/site-packages/torch/lib/libc10.so)
ERROR 08-25 09:20:47 [core.py:586] frame #2: <unknown function> + 0x1abf8 (0xfff9c407abf8 in /vllm-workspace/LMCache-Ascend/lmcache_ascend/c_ops.cpython-310-aarch64-linux-gnu.so)

Running LMCache-Ascend hit the error above. The root cause is that the amount of host memory that can be pinned is limited: the system's memory-lock (memlock) limit is too low, and inside the container there is no permission to run `ulimit -l unlimited` to raise it. The service configuration also cannot be changed to lift the limit.

The reference material below describes how to adjust the containerd service configuration to remove the memory-lock limit:

1. Edit the containerd service unit file, typically /usr/lib/systemd/system/containerd.service (the path may differ across systems; check with systemctl status containerd).
2. In the [Service] section of that file, add LimitMEMLOCK=infinity, which sets the memory-lock limit to unlimited.
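You can confirm from inside the container, without root, whether the change took effect: Python's `resource` module reads the same RLIMIT_MEMLOCK value that `ulimit -l` reports. A small sketch; the formatting helper is our own convention, not part of any Ascend or LMCache tooling:

```python
import resource

def format_memlock(value):
    """Render an RLIMIT_MEMLOCK value roughly the way `ulimit -l` would."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return "%.1f KiB" % (value / 1024)

def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK limits as readable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return format_memlock(soft), format_memlock(hard)

if __name__ == "__main__":
    soft, hard = memlock_limits()
    # with LimitMEMLOCK=infinity applied, both should print "unlimited"
    print("memlock soft:", soft, "hard:", hard)
```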
-
RuntimeError: Unable to pin host memory with error code: -1 · Issue #5 · LMCache/LMCache-Ascend

When running LMCache-Ascend, the error above appears. The usual fixes are to set LimitMEMLOCK when deploying the instance, or to raise the limit with ulimit -l. But the notebook has no root access, so the latter is not possible; and since docker run and docker-compose are not available either, the former is not possible. How can this memory-lock limit be resolved?
-
Here is the Dockerfile:

#
# Copyright (c) 2025 Huawei Technologies Co., Ltd. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

FROM quay.io/ascend/cann:8.2.rc1-910b-openeuler22.03-py3.11

# Set the user ma-user whose UID is 1000 and the user group ma-group whose GID is 100
USER root
RUN default_user=$(getent passwd 1000 | awk -F ':' '{print $1}') || echo "uid: 1000 does not exist" && \
    default_group=$(getent group 100 | awk -F ':' '{print $1}') || echo "gid: 100 does not exist" && \
    if [ ! -z ${default_user} ] && [ ${default_user} != "ma-user" ]; then \
        userdel -r ${default_user}; \
    fi && \
    if [ ! -z ${default_group} ] && [ ${default_group} != "ma-group" ]; then \
        groupdel -f ${default_group}; \
    fi && \
    groupadd -g 100 ma-group && useradd -d /home/ma-user -m -u 1000 -g 100 -s /bin/bash ma-user && \
    chmod -R 750 /home/ma-user

ARG PIP_INDEX_URL="https://mirrors.aliyun.com/pypi/simple"
ARG COMPILE_CUSTOM_KERNELS=1
ENV COMPILE_CUSTOM_KERNELS=${COMPILE_CUSTOM_KERNELS}

RUN yum update -y && \
    yum install -y python3-pip git vim wget net-tools gcc gcc-c++ make cmake numactl-devel && \
    rm -rf /var/cache/yum

RUN pip config set global.index-url ${PIP_INDEX_URL}

# Set pip source to a faster mirror
RUN pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

WORKDIR /workspace

COPY . /workspace/LMCache-Ascend/

# Install vLLM
ARG VLLM_REPO=https://githubfast.com/vllm-project/vllm.git
ARG VLLM_TAG=v0.9.2
RUN git clone --depth 1 $VLLM_REPO --branch $VLLM_TAG /workspace/vllm
# In x86, triton will be installed by vllm. But in Ascend, triton doesn't work correctly. we need to uninstall it.
RUN VLLM_TARGET_DEVICE="empty" python3 -m pip install -e /workspace/vllm/ --extra-index https://download.pytorch.org/whl/cpu/ --retries 5 --timeout 30 && \
    python3 -m pip uninstall -y triton

# Install vLLM-Ascend
ARG VLLM_ASCEND_REPO=https://githubfast.com/vllm-project/vllm-ascend.git
ARG VLLM_ASCEND_TAG=v0.9.2rc1
RUN git clone --depth 1 $VLLM_ASCEND_REPO --branch $VLLM_ASCEND_TAG /workspace/vllm-ascend
RUN cd /workspace/vllm-ascend && \
    git apply -p1 /workspace/LMCache-Ascend/docker/kv-connector-v1.diff
RUN export PIP_EXTRA_INDEX_URL=https://mirrors.huaweicloud.com/ascend/repos/pypi && \
    source /usr/local/Ascend/ascend-toolkit/set_env.sh && \
    source /usr/local/Ascend/nnal/atb/set_env.sh && \
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/Ascend/ascend-toolkit/latest/`uname -i`-linux/devlib && \
    python3 -m pip install -v -e /workspace/vllm-ascend/ --extra-index https://download.pytorch.org/whl/cpu/

# Install modelscope (for fast download) and ray (for multinode)
RUN python3 -m pip install modelscope ray

# Install LMCache
ARG LMCACHE_REPO=https://githubfast.com/LMCache/LMCache.git
ARG LMCACHE_TAG=v0.3.3
RUN git clone --depth 1 $LMCACHE_REPO --branch $LMCACHE_TAG /workspace/LMCache
# our build is based on arm64
RUN sed -i "s/^infinistore$/infinistore; platform_machine == 'x86_64'/" /workspace/LMCache/requirements/common.txt
# Install LMCache with retries and timeout
RUN export NO_CUDA_EXT=1 && python3 -m pip install -v -e /workspace/LMCache --retries 5 --timeout 30

# Install LMCache-Ascend
RUN cd /workspace/LMCache-Ascend && \
    source /usr/local/Ascend/ascend-toolkit/set_env.sh && \
    source /usr/local/Ascend/nnal/atb/set_env.sh && \
    export SOC_VERSION=ASCEND910B3 && \
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/Ascend/ascend-toolkit/latest/`uname -i`-linux/devlib && \
    python3 -m pip install -v --no-build-isolation -e . && \
    python3 -m pip cache purge

# Switch to user ma-user
USER ma-user

CMD ["/bin/bash"]

The image was registered with the options shown in the image management page, and the notebook was created with the parameters shown, but creating the notebook ultimately failed.
-
Live replay link: cid:link_0

This live session used the cloud development environment and full-stack toolchain for a deep dive into the Ascend and Kunpeng root-technology ecosystems. It covered the three course tracks of the Developer Space: the AI series (DeepSeek/MCP, including hands-on development with the MCP agent protocol), Kunpeng performance tuning, and MySQL in practice. It focused on the three Developer Space course series and a deep dive into the MCP protocol, and offered millions of free DeepSeek tokens.

The Developer Space course series targets individual, university, and enterprise developers. Courses are tailored to each audience and built on the Space's capabilities. Every course pairs its theory with a hands-on lab manual that runs inside the Developer Space, so you learn by doing and retain the material. Course link: cid:link_1 The courses build skills layer by layer around current hot technologies, with learning paths organized by technical domain and career direction, helping developers quickly pick up the skills they need.

Q: What is MCP?
A: MCP (Model Context Protocol) is an open standard released by Anthropic at the end of November 2024 that unifies how large language models (LLMs) communicate with external data sources and tools. MCP is based on JSON-RPC 2.0, uses a client/server architecture, and provides MCP Client and MCP Server SDKs in several languages (Java, TypeScript, Python, Kotlin).

Q: Is an MCP Server the same thing as the backend server we usually talk about?
A: They are related, but their roles differ. Think of the backend server as the kitchen: it cooks the food, runs the business logic, accesses the database, and calls algorithms. The MCP Server is the waiter standing at the kitchen door: it speaks the AI assistant's language, namely the MCP protocol, takes the AI's order, and translates it for the kitchen, i.e. the backend server.

Q: For a workflow that chains several tools (e.g. search for material, then summarize, then send an email), should I orchestrate the steps on the Client (AI assistant) side, or wrap the whole flow inside the MCP Server? What are the trade-offs?
A: Client-side orchestration has the advantage of high flexibility and strong explainability; the drawback is that it depends on the AI's capability and requires fairly strong AI reasoning.

Q: Is the free cloud-host quota in the Developer Space 180 hours in total, or 180 hours per year?
A: Currently it is 180 free hours per year.

Q: What do I need to build my own MCP Server? Is it hard?
A: It is not hard to get started! You only need: basics: some Python or Node.js programming; core environment: an AI assistant platform (such as Cursor, Cherry Studio, or Claude Desktop) to act as the MCP Client. Start with the simplest "weather query server" or "memo server": you can have your first example running in an hour or two, and the experience is great!

Huawei Cloud Developer Space lets developers try Huawei tools and resources with a low barrier to entry. It is a dedicated development space for developers worldwide, gathering development resources and tools for root technologies such as Ascend, HarmonyOS, Kunpeng, GaussDB, and openEuler. It aims to give every developer a cloud host, a set of development tools, and cloud storage, delivering an intelligent application-development experience for the AI era. With the built-in AI-native application engine, developers can generate an intelligent agent with one click, call MCP Server plugin capabilities, and quickly build personalized AI applications. Come and try it!
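As the answer above notes, MCP is built on JSON-RPC 2.0. A minimal sketch of what a client-to-server message looks like at that layer; the weather-tool name and arguments are illustrative, not from any real server:

```python
import json
from itertools import count

# monotonically increasing request ids, as JSON-RPC expects
_ids = count(1)

def jsonrpc_request(method, params):
    """Build a JSON-RPC 2.0 request message, the wire format MCP uses."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}

if __name__ == "__main__":
    # e.g. an MCP client asking a server to run a hypothetical weather-lookup tool
    msg = jsonrpc_request("tools/call",
                          {"name": "get_weather", "arguments": {"city": "Shenzhen"}})
    print(json.dumps(msg, ensure_ascii=False))
```

A real MCP exchange adds an initialization handshake and capability negotiation on top of messages like this, which is exactly what the MCP Client and Server SDKs handle for you.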
Recommended live streams
-
Huawei Cloud CodeArts: Mastering OpenClaw, raising crayfish online - 2026/03/11, Wednesday, 19:00-21:00
Liu Yu, Senior Engineer at Huawei Cloud / Tan Xin, Technical Expert at Huawei Cloud / Li Hailun, CEO of Shanghai Guizhuo Intelligent Technology Co., Ltd.
OpenClaw is taking the developer community by storm, and Huawei Cloud CodeArts has just released a Skill for it: with a single prompt, developers can deploy a fully functional "crayfish" agent. This live stream shows you how to get the most out of Huawei Cloud CodeArts and OpenClaw.
Replay in preparation
-
Huawei Cloud CodeArts: a power tool for application development in the AI era - 2026/03/18, Wednesday, 19:00-20:00
Tong Deli, Director of Developer Ecosystem Operations at Huawei Cloud / Yao Shengwei, Huawei Cloud HCDE developer expert
In this live stream, Huawei experts walk you through hands-on application development and show how the Huawei Cloud CodeArts coding agent turns your creative app ideas into working applications in the AI era. A Huawei Cloud HCDE developer expert will also show you how to use CodeArts with JiuwenClaw and make Xiaoyi your AI assistant.
Replay in preparation
-
Skill building x intelligent creation: boosting AI content production with Huawei Cloud CodeArts - 2026/03/25, Wednesday, 19:00-20:00
Yu Wei, Software Engineer at Huawei Cloud / Wan Shaoye (Wan Shao), Huawei Cloud HCDE developer expert
This live stream covers two hands-on cases: building your own knowledge-base Skill step by step with the Huawei Cloud CodeArts Skill-Creator, and using CodeArts to speed up OpenClaw novel writing, covering the whole AI-assisted pipeline from outline to finished draft. Technical depth plus OPC creation ideas, all explained in one session!
Replay in preparation