-
Live replay link: cid:link_0

This session offered a deep, hands-on look at the Ascend and Kunpeng root-technology ecosystems through the cloud development environment and full-stack toolchain. It covered the three course tracks of the Developer Space: the AI series (DeepSeek/MCP, including hands-on MCP agent-protocol development), Kunpeng performance tuning, and MySQL in practice. It also demystified the MCP protocol and gave away millions of free DeepSeek tokens.

The Developer Space curated course series targets individual, university, and enterprise developers. Built on the Developer Space's capabilities and tailored to each audience, every course pairs its theory with a hands-on lab manual, so learners experiment inside the Developer Space and learn by doing. Course catalog: cid:link_1. The courses track current hot technologies, deepen skills layer by layer, and lay out learning paths by technical domain and career direction, helping developers quickly pick up the skills they need.

Q: What is MCP?
A: MCP (Model Context Protocol) is an open standard released by Anthropic at the end of November 2024 that standardizes communication between large language models (LLMs) and external data sources and tools. MCP is built on JSON-RPC 2.0, uses a Client/Server architecture, and ships MCP Client and MCP Server SDKs in several languages (Java, TypeScript, Python, Kotlin).

Q: Is an MCP Server the same thing as the backend server we usually talk about?
A: They are related, but their roles differ. Think of the backend server as the "kitchen": it cooks the dishes, that is, executes business logic, queries databases, and calls algorithms. The MCP Server is the "waiter" standing at the kitchen door: it speaks the AI assistant's language (the MCP protocol), takes the AI's order, and translates it for the kitchen (the backend server).

Q: For a workflow that chains several tools (e.g. search for material -> summarize -> send an email), should I orchestrate the steps on the Client (AI assistant) side, or wrap the whole flow inside the MCP Server? What are the trade-offs?
A: Client-side orchestration is highly flexible and easy to interpret, but it depends on the model's capability: the AI needs fairly strong reasoning to plan the chain correctly.

Q: Is the free cloud-desktop quota in Developer Space 180 hours in total, or 180 free hours per year?
A: Currently it is 180 free hours per year.

Q: What do I need to build my own MCP Server? Is it hard?
A: Not hard at all! You only need: basic Python or Node.js programming skills, and an AI assistant platform (such as Cursor, Cherry Studio, or Claude Desktop) to act as the MCP Client. Start with the simplest possible "weather query" or "memo" server; you can get your first example running in an hour or two, and the experience is great!

Huawei Cloud Developer Space gives developers a low-barrier way to try Huawei's tools and resources. It is a dedicated workspace for developers worldwide that brings together the development resources and tools of Huawei's root technologies (Ascend, HarmonyOS, Kunpeng, GaussDB, openEuler) and aims to give every developer a cloud desktop, a development toolset, and cloud storage. For the AI era it integrates an AI-native application engine: developers can generate an intelligent Agent with one click, call MCP Server plugin capabilities, and quickly build personalized AI applications. Come and try it!
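Since MCP rides on JSON-RPC 2.0, a tool call on the wire is just a JSON envelope. The sketch below builds and parses such a request in plain Python; the `tools/call` method name follows the MCP specification, while the `get_weather` tool and its arguments are an invented example:

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request of the kind MCP uses for tool calls."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# A hypothetical weather tool, as an MCP Client would request it:
raw = make_tool_call(1, "get_weather", {"city": "Shenzhen"})

# The MCP Server parses the envelope and dispatches on method + tool name:
msg = json.loads(raw)
print(msg["method"], msg["params"]["name"])  # tools/call get_weather
```

The official SDKs hide this plumbing behind decorators and typed handlers, but every exchange between Client and Server reduces to messages of this shape.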
-
Operator error when running single-machine multi-NPU training with transformers + DeepSpeed; without DeepSpeed the model trains normally.

Model: Qwen2.5-VL 3B
Image: pytorch_2.1.0-cann_8.0.rc2-py_3.9-euler_2.10.7-aarch64-snt9b
Machine: 2x 910B2
Environment:
torch 2.4.0
torch-npu 2.4.0.post2
torchvision 0.19.0
transformers 4.51.3
deepspeed 0.16.5

Launch command:

```
ASCEND_LAUNCH_BLOCKING=1 accelerate launch --config_file deep_config.yaml engine.py
```

deep_config.yaml:

```yaml
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  path: ds_config.json
debug: false
gpu_ids: "0,1"
num_processes: 2
use_cpu: false
num_machines: 1
machine_rank: 0
same_network: true
rdzv_backend: static
main_training_function: main
main_process_port: 29503
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
```

ds_config.json (the DeepSpeed config referenced above):

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 1,
  "gradient_clipping": 1.0,
  "bf16": { "enabled": true },
  "fp16": { "enabled": false },
  "zero_optimization": {
    "stage": 2,
    "contiguous_gradients": true,
    "overlap_comm": true,
    "reduce_scatter": true,
    "reduce_bucket_size": 5e8,
    "allgather_bucket_size": 5e8
  },
  "optimizer": {
    "type": "AdamW",
    "params": {
      "lr": 5e-5,
      "betas": [0.9, 0.999],
      "eps": 1e-8,
      "weight_decay": 0.01
    }
  },
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr": 0.0,
      "warmup_max_lr": 5e-5,
      "warmup_num_steps": 0
    }
  },
  "activation_checkpointing": {
    "partition_activations": false,
    "contiguous_memory_optimization": false,
    "cpu_checkpointing": false
  },
  "wall_clock_breakdown": false
}
```

Error message (the loss is computed, then the backward pass fails):

loss ar: 59.6875, computed!
```
performing backward pass...
[rank1]: Traceback (most recent call last):
[rank1]:   File "/home/ma-user/work/DAR/engine.py", line 179, in <module>
[rank1]:     dar_trainer.train()
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/transformers/trainer.py", line 2245, in train
[rank1]:     return inner_training_loop(
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/transformers/trainer.py", line 2560, in _inner_training_loop
[rank1]:     tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
[rank1]:   File "/home/ma-user/work/DAR/dar_trainer.py", line 35, in training_step
[rank1]:     return self.consistency_training_step(model, inputs)
[rank1]:   File "/home/ma-user/work/DAR/dar_trainer.py", line 78, in consistency_training_step
[rank1]:     self.accelerator.backward(loss_ar)
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/accelerate/accelerator.py", line 2446, in backward
[rank1]:     self.deepspeed_engine_wrapped.backward(loss, **kwargs)
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/accelerate/utils/deepspeed.py", line 266, in backward
[rank1]:     self.engine.backward(loss, **kwargs)
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 20, in wrapped_fn
[rank1]:     ret_val = func(*args, **kwargs)
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 2187, in backward
[rank1]:     self._do_optimizer_backward(loss, retain_graph)
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 2133, in _do_optimizer_backward
[rank1]:     self.optimizer.backward(loss, retain_graph=retain_graph)
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/deepspeed/runtime/zero/stage_1_and_2.py", line 2089, in backward
[rank1]:     self.loss_scaler.backward(loss.float(), retain_graph=retain_graph)
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/deepspeed/runtime/fp16/loss_scaler.py", line 63, in backward
[rank1]:     scaled_loss.backward(retain_graph=retain_graph)
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/torch/_tensor.py", line 521, in backward
[rank1]:     torch.autograd.backward(
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/torch/autograd/__init__.py", line 289, in backward
[rank1]:     _engine_run_backward(
[rank1]:   File "/home/ma-user/anaconda3/envs/llamafactory/lib/python3.10/site-packages/torch/autograd/graph.py", line 768, in _engine_run_backward
[rank1]:     return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
[rank1]: RuntimeError: InnerRun:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:218 OPS function error: Conv3DBackpropFilter, error code is 500002
[rank1]: [ERROR] 2025-09-02-22:43:34 (PID:1305427, Device:1, RankID:1) ERR01100 OPS call acl api failed
[rank1]: [Error]: A GE error occurs in the system.
[rank1]: Rectify the fault based on the error information in the ascend log.
[rank1]: E69999: Inner Error!
[rank1]: E69999: [PID: 1305427] 2025-09-02-22:43:34.306.673 op[Conv3DBackpropFilter3], illegal format of x.[FUNC:Conv3DBackpropFilterInfer][FILE:nn_calculation_ops.cc][LINE:9783]
[rank1]: TraceBack (most recent call last):
[rank1]: Sessin_id 0 does not exist, graph_id 2[FUNC:GetJsonObject][FILE:analyzer.cc][LINE:155]
[rank1]: Param:graph_info is nullptr, check invalid[FUNC:DoAnalyze][FILE:analyzer.cc][LINE:253]
[rank1]: Param:graph_info is nullptr, check invalid[FUNC:SaveAnalyzerDataToFile][FILE:analyzer.cc][LINE:210]
[rank1]: Call InferShapeAndType for node:Conv3DBackpropFilter3(Conv3DBackpropFilter) failed[FUNC:Infer][FILE:infershape_pass.cc][LINE:117]
[rank1]: process pass InferShapePass on node:Conv3DBackpropFilter3 failed, ret:4294967295[FUNC:RunPassesOnNode][FILE:base_pass.cc][LINE:563]
[rank1]: build graph failed, graph id:2, ret:1343242270[FUNC:BuildModelWithGraphId][FILE:ge_generator.cc][LINE:1615]
[rank1]: [Build][SingleOpModel]call ge interface generator.BuildSingleOpModel failed. ge result = 1343242270[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
[rank1]: [Build][Op]Fail to build op model[FUNC:ReportInnerError][FILE:log_inner.cpp][LINE:145]
[rank1]: build op model failed, result = 500002[FUNC:ReportInnerError][FILE:log_inner.cpp][LINE:145]
```

The main process and rank 0 (PID 1305426, Device 0, RankID 0, timestamp 2025-09-02-22:43:34.341.828) report the identical traceback and GE error: op[Conv3DBackpropFilter3], illegal format of x; error code 500002.
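The GE message points at the Ascend device log for details. As a hedged sketch (environment variable names as documented for recent CANN releases; defaults may differ by version), the following raises CANN's log verbosity and prints logs to stdout before relaunching, then lists the default plog directory:

```shell
# Raise CANN log verbosity and mirror logs to stdout before relaunching training.
# ASCEND_GLOBAL_LOG_LEVEL: 0=DEBUG, 1=INFO, 2=WARNING, 3=ERROR.
export ASCEND_GLOBAL_LOG_LEVEL=1
export ASCEND_SLOG_PRINT_TO_STDOUT=1

# By default the plog files land under the home directory, one per process;
# look for the entries matching the failing PIDs (1305426 / 1305427 here).
[ -d "$HOME/ascend/log" ] && ls "$HOME/ascend/log" || true
```

The "illegal format of x" InferShape failure suggests comparing the tensor shapes/formats reaching Conv3DBackpropFilter with and without DeepSpeed in those logs.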
-
Pine Wood Nematode (Pine Wilt) Disease Detection

1. Data slicing
Wide-angle drone imagery has a high resolution (4000x3000). We first slice the manually annotated pine-wilt dataset, cutting each large image into tiles with several tile sizes (e.g. 1000x1000, 1500x1500, 2000x2000) and overlap ratios (e.g. 0%, 10%, 20%, 30%) before feeding them to the model for training.

2. Model training
Since its 2023 release, YOLOv8 has gone through many optimization iterations, and its architecture (e.g. the C2f module and dynamic label assignment) and training pipeline are now mature. Embedded devices rely on v8's lightweight design, and in medical detection v8's high recall has been clinically validated. Although YOLO12 and other newer versions surpass YOLOv8 on paper, v8's inference speed remains hard to replace, and it is still the version most widely deployed in industry. We train YOLOv8 on both the proportionally downscaled original images and the sliced pine-wilt detection dataset to improve generalization across object sizes. Each training iteration produces both the s and m model sizes, used respectively for live-video detection and automatic image annotation. Our models already run on both domestic Ascend and NVIDIA accelerator cards, support fully automated training jobs, and are automatically converted and quantized for each target chip.

3. Cloud annotation
Our model performs sliced detection and automatic annotation on the images and videos sent back by drones. Tile size and overlap ratio can be tuned per target size and class, enabling fine-grained detection on drone imagery.

4. Live-stream inference
Our AI live-stream inference pipeline runs its stages concurrently and is developed in Python combined with C++. Its modular design keeps the service efficient, and it can be deployed on RK3588 and Jetson-series boards. For the pine-wilt scenario, it currently supports real-time recognition of 9 classes of infected trees.

Reposted from the blog: https://bbs.huaweicloud.com/blogs/458003
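The slicing step above can be sketched in a few lines of Python. This helper is illustrative (the function name and parameters are ours, not from the project): it computes tile coordinates for a large image given a tile size and overlap ratio, clamping the last row/column of tiles to the image border so the whole frame is covered.

```python
def tile_coords(img_w, img_h, tile, overlap):
    """Return (x, y, w, h) tiles covering an img_w x img_h image.

    tile:    tile edge length in pixels (e.g. 1000)
    overlap: fraction of the tile shared with its neighbour (e.g. 0.2)
    """
    stride = max(1, int(tile * (1 - overlap)))
    xs = list(range(0, max(img_w - tile, 0) + 1, stride))
    ys = list(range(0, max(img_h - tile, 0) + 1, stride))
    # Ensure the right and bottom borders are always covered.
    if xs[-1] + tile < img_w:
        xs.append(img_w - tile)
    if ys[-1] + tile < img_h:
        ys.append(img_h - tile)
    return [(x, y, tile, tile) for y in ys for x in xs]

# A 4000x3000 drone frame cut into 1000x1000 tiles with 20% overlap:
tiles = tile_coords(4000, 3000, 1000, 0.2)
```

Each `(x, y, w, h)` tuple can then be used to crop the image and remap the annotation boxes into tile coordinates before training.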
-
Rumor has it that Huawei's Ascend NPUs are shifting toward a GPGPU architecture. What does everyone think? Feel free to speak your mind!
-
Huawei Cloud Developer Space ☁️ AI industrial quality inspection on an Ascend NPU

In this case study we use the free 🎉 Ascend 910B NPU ✨ in the AI Notebook of the Huawei Cloud Developer Space workbench ⚙️ to train a YOLO11 model, then apply the SAHI slicing-aided hyper inference framework to PCB defect detection.

1. Download the model and dataset 📦

First paste and run the following code in a Notebook cell to download and unpack the training data and model files used in this case:

```python
import os
import zipfile

if not os.path.exists('yolo11_train_ascend.zip'):
    os.system('wget -q https://orangepi-ascend.obs.cn-north-4.myhuaweicloud.com/yolo11_train_ascend.zip')

if not os.path.exists('yolo11_train_ascend'):
    zip_file = zipfile.ZipFile('yolo11_train_ascend.zip')
    zip_file.extractall()
    zip_file.close()
```

2. Install dependencies 🛠️

Install the packages YOLO11 depends on, plus the SAHI library, to set up the project's runtime environment:

```
!pip install ultralytics==8.3.160 ultralytics-thop==2.0.14 sahi==0.11.26 numpy==1.26.4
```

Key lines of the install output (the long dependency-resolution listing is omitted):

```
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://mirrors.huaweicloud.com/repository/pypi/simple
Collecting ultralytics==8.3.160
Collecting ultralytics-thop==2.0.14
Collecting sahi==0.11.26
Collecting numpy==1.26.4
...
Successfully built fire
Installing collected packages: terminaltables, termcolor, numpy, shapely, pybboxes, fire, ultralytics-thop, sahi, ultralytics
  Attempting uninstall: numpy
    Found existing installation: numpy 1.24.4
    Successfully uninstalled numpy-1.24.4
WARNING: The scripts ultralytics and yolo are installed in '/home/service/.local/bin' which is not on PATH.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
gradio 5.9.1 requires markupsafe~=2.0, but you have markupsafe 3.0.2 which is incompatible.
openmind 0.9.1 requires datasets<=2.21.0,>=2.18.0, but you have datasets 3.2.0 which is incompatible.
openmind 0.9.1 requires openmind-hub==0.9.0, but you have openmind-hub 0.9.1 which is incompatible.
openmind-datasets 0.7.1 requires datasets==2.18.0, but you have datasets 3.2.0 which is incompatible.
openmind-evaluate 0.7.0 requires datasets==2.18.0, but you have datasets 3.2.0 which is incompatible.
Successfully installed fire-0.7.0 numpy-1.26.4 pybboxes-0.1.6 sahi-0.11.26 shapely-2.1.1 termcolor-3.1.0 terminaltables-3.1.10 ultralytics-8.3.160 ultralytics-thop-2.0.14
```

3. Modify the configuration file 📝

Specify the dataset path, class names, and other information in the configuration file used for training:

```python
%%writefile yolo11_train_ascend/pcb.yaml
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: /opt/huawei/edu-apaas/src/init/yolo11_train_ascend/pcb_sliced  # dataset root dir (absolute path)
train: images/train  # train images (relative to 'path')
val: images/val  # val images (relative to 'path')
test:  # test images (optional)

# Classes
names:
  0: mouse_bite
  1: open_circuit
  2: short
  3: spur
  4: spurious_copper
```

```
Writing yolo11_train_ascend/pcb.yaml
```

4. Download the Arial.ttf font 🖋️

To keep the font download from interrupting training later, fetch the font file in advance and copy it to /home/service/.config/Ultralytics:

```
!wget https://orangepi-ascend.obs.cn-north-4.myhuaweicloud.com/Arial.ttf
!mkdir -p /home/service/.config/Ultralytics
!cp Arial.ttf /home/service/.config/Ultralytics/Arial.ttf
```

```
--2025-06-28 05:55:59--  https://pcb-sahi-public.obs.cn-southwest-2.myhuaweicloud.com/Arial.ttf
Resolving pcb-sahi-public.obs.cn-southwest-2.myhuaweicloud.com ... 100.125.6.3, 100.125.7.3, 100.125.6.131
Connecting to pcb-sahi-public.obs.cn-southwest-2.myhuaweicloud.com|100.125.6.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 773236 (755K) [application/x-font-ttf]
Saving to: 'Arial.ttf'
Arial.ttf  100%[===================>] 755.11K  --.-KB/s  in 0.004s
2025-06-28 05:55:59 (188 MB/s) - 'Arial.ttf' saved [773236/773236]
```

5. Model training 🧠🔥

We start from the yolo11n.pt pretrained model and accelerate training on the Ascend NPU, training for 10 epochs at a 640x640 image size, with 8 dataloader workers and 32 images per batch:

```python
%cd yolo11_train_ascend
import torch
import torch_npu
from torch_npu.contrib import transfer_to_npu
from ultralytics import YOLO

# Load a model
model = YOLO('yolo11n.pt')  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data='pcb.yaml', epochs=10, imgsz=640, workers=8, batch=32)
%cd ..
```

Importing torch_npu with transfer_to_npu prints a series of migration warnings: the torch.cuda.* and torch.cuda.amp.* APIs (and the device parameters of many tensor-creation functions) are transparently redirected to their torch.npu.* equivalents, the distributed backend is set to hccl, and torch.jit.script is disabled. Training then starts on the NPU:

```
Creating new Ultralytics Settings v0.0.6 file ✅
View Ultralytics Settings with 'yolo settings' or at '/home/service/.config/Ultralytics/settings.json'
[W compiler_depend.ts:623] Warning: expandable_segments currently defaults to false. You can enable this feature by `export PYTORCH_NPU_ALLOC_CONF = expandable_segments:True`. (function operator())
Ultralytics 8.3.160 🚀 Python-3.10.15 torch-2.1.0 CUDA:0 (Ascend910B3, 62432MiB)
engine/trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=32, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=pcb.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolo11n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=train, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/train, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=True, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Overriding model.yaml nc=80 with nc=5

                   from  n    params  module                                       arguments
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]
  2                  -1  1      6640  ultralytics.nn.modules.block.C3k2            [32, 64, 1, False, 0.25]
  3                  -1  1     36992  ultralytics.nn.modules.conv.Conv             [64, 64, 3, 2]
  4                  -1  1     26080  ultralytics.nn.modules.block.C3k2            [64, 128, 1, False, 0.25]
  5                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]
  6                  -1  1     87040  ultralytics.nn.modules.block.C3k2            [128, 128, 1, True]
  7                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128, 256, 3, 2]
  8                  -1  1    346112  ultralytics.nn.modules.block.C3k2            [256, 256, 1, True]
  9                  -1  1    164608  ultralytics.nn.modules.block.SPPF            [256, 256, 5]
 10                  -1  1    249728  ultralytics.nn.modules.block.C2PSA           [256, 256, 1]
 11                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
 12             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 13                  -1  1    111296  ultralytics.nn.modules.block.C3k2            [384, 128, 1, False]
 14                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']
 15             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 16                  -1  1     32096  ultralytics.nn.modules.block.C3k2            [256, 64, 1, False]
 17                  -1  1     36992  ultralytics.nn.modules.conv.Conv             [64, 64, 3, 2]
 18            [-1, 13]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 19                  -1  1     86720  ultralytics.nn.modules.block.C3k2            [192, 128, 1, False]
 20                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]
 21            [-1, 10]  1         0  ultralytics.nn.modules.conv.Concat           [1]
 22                  -1  1    378880  ultralytics.nn.modules.block.C3k2            [384, 256, 1, True]
 23        [16, 19, 22]  1
```
431647 ultralytics.nn.modules.head.Detect [5, [64, 128, 256]] /home/service/.local/lib/python3.10/site-packages/torch_npu/utils/storage.py:38: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage() if self.device.type != 'cpu': YOLO11n summary: 181 layers, 2,590,815 parameters, 2,590,799 gradients, 6.4 GFLOPs Transferred 448/499 items from pretrained weights Freezing layer 'model.23.dfl.conv.weight' AMP: running Automatic Mixed Precision (AMP) checks... [W compiler_depend.ts:51] Warning: CAUTION: The operator 'torchvision::nms' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback) AMP: checks passed ✅ train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 620.7±42.3 MB/s, size: 454.2 KB) train: Scanning /opt/huawei/edu-apaas/src/init/yolo11_train_ascend/pcb_sliced/labels/train... 4646 images, 0 backgrounds, 0 corrupt: 100%|██████████| 4646/4646 [00:05<00:00, 848.20it/s] train: New cache created: /opt/huawei/edu-apaas/src/init/yolo11_train_ascend/pcb_sliced/labels/train.cache val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 471.4±135.5 MB/s, size: 448.2 KB) val: Scanning /opt/huawei/edu-apaas/src/init/yolo11_train_ascend/pcb_sliced/labels/val... 422 images, 0 backgrounds, 0 corrupt: 100%|██████████| 422/422 [00:00<00:00, 520.44it/s] val: New cache created: /opt/huawei/edu-apaas/src/init/yolo11_train_ascend/pcb_sliced/labels/val.cache Plotting labels to runs/detect/train/labels.jpg... optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.001111, momentum=0.9) with parameter groups 81 weight(decay=0.0), 88 weight(decay=0.0005), 87 bias(decay=0.0) Image sizes 640 train, 640 val Using 8 dataloader workers Logging results to runs/detect/train Starting training for 10 epochs... Closing dataloader mosaic Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size 0%| | 0/146 [00:00<?, ?it/s] . /home/service/.local/lib/python3.10/site-packages/ultralytics/utils/tal.py:274: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at build/CMakeFiles/torch_npu.dir/compiler_depend.ts:74.) target_scores = torch.where(fg_scores_mask > 0, target_scores, 0) [W compiler_depend.ts:103] Warning: Non finite check and unscale on NPU device! (function operator()) 1/10 7.77G 2.238 5.333 1.761 8 640: 100%|██████████| 146/146 [01:31<00:00, 1.60it/s] Class Images Instances Box(P R mAP50 mAP50-95): 0%| | 0/7 [00:00<?, ?it/s] ..... 
Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 7/7 [00:58<00:00, 8.39s/it] all 422 604 0.39 0.0656 0.0888 0.023 Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size 2/10 8.2G 1.876 2.724 1.462 8 640: 100%|██████████| 146/146 [01:16<00:00, 1.92it/s] Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 7/7 [00:04<00:00, 1.58it/s] all 422 604 0.451 0.238 0.214 0.0639 Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size 3/10 8.2G 1.825 1.912 1.445 8 640: 100%|██████████| 146/146 [01:12<00:00, 2.00it/s] Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 7/7 [00:04<00:00, 1.64it/s] all 422 604 0.339 0.291 0.244 0.0742 Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size 4/10 8.2G 1.748 1.571 1.398 4 640: 100%|██████████| 146/146 [01:12<00:00, 2.02it/s] Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 7/7 [00:04<00:00, 1.58it/s] all 422 604 0.409 0.361 0.335 0.117 Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size 5/10 8.2G 1.703 1.343 1.372 6 640: 100%|██████████| 146/146 [01:11<00:00, 2.05it/s] Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 7/7 [00:04<00:00, 1.66it/s] all 422 604 0.442 0.34 0.321 0.118 Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size 6/10 8.2G 1.673 1.26 1.343 5 640: 100%|██████████| 146/146 [01:11<00:00, 2.03it/s] Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 7/7 [00:04<00:00, 1.62it/s] all 422 604 0.605 0.49 0.53 0.224 Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size 7/10 8.2G 1.614 1.145 1.316 6 640: 100%|██████████| 146/146 [01:12<00:00, 2.00it/s] Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 7/7 [00:04<00:00, 1.58it/s] all 422 604 0.595 0.542 0.525 0.206 Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size 8/10 8.2G 1.578 1.067 1.294 7 640: 100%|██████████| 146/146 [01:11<00:00, 2.03it/s] Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 7/7 [00:04<00:00, 1.73it/s] all 422 604 0.754 0.629 
0.685 0.307 Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size 9/10 8.21G 1.551 1.009 1.275 8 640: 100%|██████████| 146/146 [01:11<00:00, 2.04it/s] Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 7/7 [00:04<00:00, 1.57it/s] all 422 604 0.782 0.618 0.703 0.315 Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size 10/10 8.21G 1.5 0.9621 1.255 6 640: 100%|██████████| 146/146 [01:12<00:00, 2.02it/s] Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 7/7 [00:03<00:00, 1.83it/s] all 422 604 0.8 0.661 0.732 0.354 10 epochs completed in 0.236 hours. Optimizer stripped from runs/detect/train/weights/last.pt, 5.5MB Optimizer stripped from runs/detect/train/weights/best.pt, 5.5MB Validating runs/detect/train/weights/best.pt... Ultralytics 8.3.160 🚀 Python-3.10.15 torch-2.1.0 CUDA:0 (Ascend910B3, 62432MiB) YOLO11n summary (fused): 100 layers, 2,583,127 parameters, 0 gradients, 6.3 GFLOPs ... Class Images Instances Box(P R mAP50 mAP50-95): 0%| | 0/7 [00:00<?, ?it/s] . Class Images Instances Box(P R mAP50 mAP50-95): 100%|██████████| 7/7 [00:07<00:00, 1.14s/it] all 422 604 0.799 0.663 0.732 0.355 mouse_bite 107 169 0.806 0.785 0.829 0.4 open_circuit 73 101 0.656 0.471 0.492 0.219 short 69 87 0.889 0.54 0.701 0.314 spur 95 134 0.864 0.714 0.76 0.342 spurious_copper 95 113 0.782 0.805 0.88 0.5 Speed: 0.1ms preprocess, 8.3ms inference, 0.0ms loss, 2.5ms postprocess per image Results saved to runs/detect/train /opt/huawei/edu-apaas/src/init /home/service/.local/lib/python3.10/site-packages/IPython/core/magics/osm.py:417: UserWarning: This is now an optional IPython functionality, setting dhist requires you to install the `pickleshare` library. self.shell.db['dhist'] = compress_dhist(dhist)[-100:] 模型训练好后,可以在runs/detect/train目录下查看训练结果,例如损失函数的变化曲线、mAP等评价指标📈💪。6. 
Sliced-image detection ✂️🔍

Finally, we use the SAHI framework to run sliced inference on high-resolution PCB images, detecting PCB defect classes more precisely.

```python
import torch
import torch_npu
from torch_npu.contrib import transfer_to_npu  # patches CUDA calls over to the NPU
from sahi.predict import get_sliced_prediction
from sahi import AutoDetectionModel
from PIL import Image

detection_model = AutoDetectionModel.from_pretrained(
    model_type='ultralytics',
    model_path="yolo11_train_ascend/runs/detect/train/weights/best.pt",
    confidence_threshold=0.4,
    device="cuda:0"  # transfer_to_npu maps this to the NPU
)
```

Here we use a sliding-window detection 🔍 technique: the original image is split into 640x640 sub-images 🖼️ with a configurable overlap, each sub-image is predicted separately, and all detection results are merged 🛠️.

```python
image_path = "https://orangepi-ascend.obs.cn-north-4.myhuaweicloud.com/001.bmp"
result = get_sliced_prediction(
    image_path,
    detection_model,
    slice_height=640,
    slice_width=640,
    overlap_height_ratio=0.1,
    overlap_width_ratio=0.1,
    perform_standard_pred=False,
    postprocess_class_agnostic=True,
    postprocess_match_threshold=0.1,
)
result.export_visuals(export_dir="output/", file_name="sliced_result")
Image.open("output/sliced_result.png")
```

Performing prediction on 24 slices.

As the visualization shows, the model accurately predicts the location, class, and confidence of each PCB defect 😄

7. Summary 📌

This case used the Ascend 910B NPU in Huawei Cloud Developer Space 💡 to train a YOLO11 model and detect PCB defects, combined with SAHI for efficient sliced inference 🚀. The AI Notebook in Developer Space 💻 works out of the box; come and try it! 🤗

---- Reposted from blog: https://bbs.huaweicloud.com/blogs/455280
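The tiling described above (640x640 windows with 10% overlap) can be sketched independently of SAHI. The helper below is our own illustration of how such a slice grid can be computed; SAHI's actual implementation may differ in details such as edge handling.

```python
def slice_boxes(img_w, img_h, slice_w=640, slice_h=640,
                overlap_w_ratio=0.1, overlap_h_ratio=0.1):
    """Return (x1, y1, x2, y2) windows covering the image with the given overlap."""
    step_w = int(slice_w * (1 - overlap_w_ratio))  # horizontal stride between windows
    step_h = int(slice_h * (1 - overlap_h_ratio))  # vertical stride between windows
    boxes = []
    y = 0
    while True:
        y2 = min(y + slice_h, img_h)
        x = 0
        while True:
            x2 = min(x + slice_w, img_w)
            # shift the last window back so every slice keeps the full size
            boxes.append((max(0, x2 - slice_w), max(0, y2 - slice_h), x2, y2))
            if x2 >= img_w:
                break
            x += step_w
        if y2 >= img_h:
            break
        y += step_h
    return boxes

# e.g. a 3000x2000 image yields a 6x4 grid of 640x640 slices at 10% overlap
print(len(slice_boxes(3000, 2000)))
```

Each slice is predicted independently, and the per-slice boxes are shifted back into full-image coordinates before the overlap-aware merge step.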
-
I. Background

Qwen2-VL-7B-Instruct is a multimodal large model in the Tongyi Qianwen series with strong vision and language understanding. It stays relatively compact while delivering excellent visual understanding and language generation, making it one of the stronger choices in Chinese multimodal AI today. Huawei Developer Space provides built-in Ascend NPU resources, with two hours of free usage per developer per day. This walkthrough covers the full flow of deploying Qwen2-VL-7B-Instruct for image understanding in a Developer Space Notebook.

II. Environment setup and model deployment

First, open the ModelScope community in a browser to locate the Qwen2-VL-7B-Instruct model files. Then enter the Huawei Developer Space Notebook, open a terminal, and download the model:

git clone https://www.modelscope.cn/Qwen/Qwen2-VL-7B-Instruct.git

After the model finishes downloading, install the required packages:

pip install qwen-vl-utils
pip install --upgrade transformers peft diffusers accelerate

Once the packages are installed, open the model directory to get its path. Click the "+" in the top-left corner to start a new code cell, then paste the following code, replacing the model path with your own:

```python
import os
import torch

# NPU memory optimization
os.environ["PYTORCH_NPU_ALLOC_CONF"] = "expandable_segments:True"

# Patch torch.compiler for compatibility
if not hasattr(torch.compiler, 'is_compiling'):
    torch.compiler.is_compiling = lambda: False

import torch_npu
from modelscope import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

# Step 1: confirm the NPU is available
assert torch.npu.is_available(), "NPU not available"

# Step 2: load the model (bfloat16)
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "/opt/huawei/edu-apaas/src/init/model/Qwen2-VL-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map=None,
    trust_remote_code=True
)
model = model.eval()  # disable training mode
model = model.to("npu:0")

# Step 3: cap the number of image tokens (🔥 the key step)
min_pixels = 256 * 28 * 28
max_pixels = 1280 * 28 * 28
processor = AutoProcessor.from_pretrained(
    "/opt/huawei/edu-apaas/src/init/model/Qwen2-VL-7B-Instruct",
    trust_remote_code=True,
    use_fast=False,
    min_pixels=min_pixels,
    max_pixels=max_pixels
)

# Step 4: build the input
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "/opt/huawei/edu-apaas/src/init/model/Qwen2-VL-7B-Instruct/fed651d4f97246c4_big.jpg",
            },
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)
inputs = inputs.to("npu:0")

# Step 5: inference
with torch.no_grad():
    generated_ids = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print("Output:", output_text)
```

After pasting the code, upload the image you want the model to interpret to the Notebook and replace the image path in the code. Run the cell, and the output will describe the uploaded image. With that, the Qwen2-VL-7B-Instruct deployment is complete.
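The min_pixels/max_pixels settings above bound how many vision tokens an image consumes. A rough sketch of the budget math follows; the assumption (not stated in the original post) is that Qwen2-VL's processor rescales the image into the pixel budget and then maps roughly one 28x28 patch to one vision token, and the helper name is ours.

```python
import math

def estimate_vision_tokens(width, height,
                           min_pixels=256 * 28 * 28,
                           max_pixels=1280 * 28 * 28,
                           patch=28):
    """Rough estimate of Qwen2-VL vision tokens after the processor's rescale."""
    pixels = width * height
    if pixels > max_pixels:
        scale = math.sqrt(max_pixels / pixels)   # shrink into the budget
    elif pixels < min_pixels:
        scale = math.sqrt(min_pixels / pixels)   # enlarge up to the floor
    else:
        scale = 1.0
    w_patches = max(1, math.floor(width * scale / patch))
    h_patches = max(1, math.floor(height * scale / patch))
    return w_patches * h_patches

print(estimate_vision_tokens(1920, 1080))  # a 1080p photo stays within the 1280-token cap
```

This is why the cap matters on a memory-constrained NPU: without max_pixels, a large photo could expand into several thousand vision tokens and exhaust device memory during prefill.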
-
This livestream focused on education innovation and industry-education integration driven by X+AI, drawing more than 7,200 live viewers and over 6,000 cumulative social-media plays; enterprise developers made up 56% of this session's audience.
-
I. Background and problem

In AI model development, differences in training-platform hardware architecture pose significant challenges for model migration and performance optimization. Mainstream PyTorch was originally designed for NVIDIA GPUs, and its code depends heavily on the CUDA ecosystem (torch.cuda.* interfaces, the nn.DataParallel parallel mode, etc.). With the rise of domestic compute platforms, there are a growing number of methods for migrating such models to domestic NPU platforms while preserving training stability and performance. The core problems in migration include:
- Hardware architecture differences: GPUs and NPUs differ fundamentally in memory management (GPU memory vs. NPU memory allocation strategies), compute units (SIMD vs. tensor cores), and communication mechanisms (CUDA vs. HCCL);
- Interface adaptation complexity: native PyTorch interfaces must be replaced layer by layer with NPU-adapted ones (e.g. npu() instead of cuda()), and some operators need custom development;
- Distributed-training compatibility: the multi-card communication backend must switch from NCCL to HCCL, and the distributed logic must be validated for stability on the NPU platform;
- Performance bottleneck localization: hyperparameters (batch size, learning rate) need NPU-specific tuning, and post-migration accuracy and runtime deltas must be quantified.

II. How it works

Operator-level replacement:
- Interface mapping: specify the device with torch.device("npu"), replace tensor.cuda() with tensor.npu() case by case, and change the distributed interface from init_process_group(backend="nccl") to backend="hccl".
- Operator adaptation: for CUDA operators without a direct mapping, use the official extension library (torch_npu) to provide equivalent functionality.

Environment configuration:
- Environment-variable isolation: point ASCEND_HOME, LD_LIBRARY_PATH, and related variables at the NPU driver paths so PyTorch loads the NPU runtime dependencies correctly.
- Resource allocation: for single-card training, adjust batch size to the NPU memory capacity; for multi-card scenarios, wrap the model in DistributedDataParallel and rely on HCCL for communication efficiency.

Distributed-training optimization:
- Communication backend switch: HCCL (Huawei's high-performance communication library) replaces NCCL, supporting more efficient multi-card data synchronization and gradient aggregation.
- Linear speedup validation: verify HCCL's communication-overhead control through multi-card parallel training and confirm the speedup approaches the theoretical linear curve.

III. Main technical work

1. Operator switching and interface replacement

Background: the original model was developed with PyTorch + CUDA and relies on many CUDA interfaces and operators, such as torch.cuda.*, nn.DataParallel(), and torch.utils.checkpoint.

Work done (manual migration, replacing key operators one by one):
- Replace device = torch.device("cuda") with device = torch.device("npu");
- Replace all device-related calls, e.g. tensor.cuda() → tensor.npu();
- Replace the distributed communication interface: torch.distributed.init_process_group(backend="nccl") → backend="hccl";
- For operators without an automatic mapping, develop custom replacements or find alternatives, for example equivalent operators in the Ascend torch_npu extension library, or replacement operators from the Ascend community.

2. Environment variables and runtime configuration

Set the environment variables so PyTorch can correctly detect and schedule the NPU devices:

```bash
export ASCEND_HOME=/usr/local/Ascend/latest
export PATH=${ASCEND_HOME}/compiler/bin:${ASCEND_HOME}/tools/bin:${PATH}
export PYTHONPATH=${ASCEND_HOME}/pyACL/python/site-packages/acl:$PYTHONPATH
export LD_LIBRARY_PATH=${ASCEND_HOME}/lib64:${ASCEND_HOME}/runtime/lib64:${LD_LIBRARY_PATH}
```

Configure the launch script to target the NPU:

```python
device = torch.device("npu")
model.to(device)
```

For multi-card training, add:

```python
torch.distributed.init_process_group(backend='hccl', world_size=world_size, rank=rank)
```

3. Single-card NPU training validation

Steps: import the original training script into the Ascend environment; replace all CUDA interfaces with NPU interfaces; set batch size, learning rate, and other hyperparameters to fit the NPU memory limit; verify that the loss curve, accuracy, and convergence speed are normal; export model checkpoints and compare against the GPU version to confirm accuracy consistency.

Performance:

| Metric | GPU (V100) | NPU (Ascend 910) |
| Per-step training time | 180 ms | 205 ms |
| Final accuracy | 87.2% | 87.0% |

Note: the NPU incurs a compilation delay on first execution; subsequent iterations drop to about 170 ms.

4. Single-machine multi-card NPU distributed training

Architecture changes: wrap the model in DistributedDataParallel mode:

```python
model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])
```

and switch the communication backend to HCCL:

```python
torch.distributed.init_process_group(backend='hccl')
```

Multi-card launch: use torchrun or a custom script to start multi-card training:

```bash
torchrun --nproc_per_node=8 train.py
```

Performance test results:

| Cards | Total training time (per epoch) | Speedup (vs. single card) |
| 1 | 1h20m | 1x |
| 2 | 42m | 1.9x |
| 4 | 23m | 3.5x |
| 8 | 13m | 6.1x |

Note: the speedup is close to linear, indicating communication overhead is well controlled and HCCL performs well.

IV. Troubleshooting and optimization

Typical problems and solutions:

| Problem | Solution |
| Unsupported CUDA operator | Check the Ascend official documentation for an operator mapping, or use the torch_npu extension library |
| Multi-card communication failure | Use hccl_tools to check communication-group setup and confirm the rank configuration |
| High memory usage | Reduce batch size, enable mixed-precision training (amp), disable redundant logging |
| Initialization failure or device not visible | Check that environment variables, driver version, and firmware version match |

Tuning suggestions: use the NPU's memory-reuse mechanism to reduce memory footprint; use the Ascend Profiler to locate training bottlenecks (operator time, communication overhead). General migration methodology: build a CUDA-CANN operator mapping table to speed up interface replacement; validate in stages (single card → multi-card → performance tuning) to reduce migration risk.

V. Results

Successfully migrated a PyTorch model from the GPU platform to the Ascend NPU platform end to end; achieved a stable single-card NPU training flow; enabled single-machine multi-card distributed training with good scalability and stability; reached essentially the expected performance while preserving functional consistency; and accumulated hands-on experience with PyTorch training on NPUs as a foundation for later large-scale migrations.
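The interface replacements described above can often be centralized in one place instead of scattered through the code. Below is a minimal sketch of that idea; the helper names (`pick_device`, `comm_backend`) are ours, not from the original write-up, and a real version would additionally call `torch.npu.is_available()` / `torch.cuda.is_available()` at runtime.

```python
import importlib.util

def pick_device():
    """Prefer NPU (Ascend), then CUDA, then CPU, based on which stack is importable."""
    if importlib.util.find_spec("torch_npu") is not None:
        return "npu"   # torch_npu present only on Ascend machines
    if importlib.util.find_spec("torch") is not None:
        return "cuda"  # real code would also check torch.cuda.is_available()
    return "cpu"

def comm_backend(device_type):
    """Map a device type to its distributed communication backend."""
    return {"npu": "hccl", "cuda": "nccl"}.get(device_type, "gloo")
```

With this, the training script calls `model.to(pick_device())` and `init_process_group(backend=comm_backend(...))` once, and the NCCL-to-HCCL switch becomes a single lookup rather than a scatter of edits.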
-
From 0 to 1, unlock large-model application development: these 3 learning paths will get you building intelligent applications! [Learning link] cid:link_0

Don't let the AI wave wash you up on the beach; here is your large-model study guide! In an era where AI applications spring up like mushrooms, do you also want to stand at the technology frontier, only to have your enthusiasm dampened by not knowing where to start? Don't worry: the large-model learning paths, carefully crafted by Huawei Cloud Academy's technical experts, are a tailor-made guide for you.

Following these 3 learning paths, you will experience:

Large-model application development path: deconstruct core large-model technology from 0 to 1. This path is like a master key, taking you from zero through the full workflow of large-model application development. Imagine building, with your own hands, an intelligent assistant serving millions of users, or a personalized recommendation system for education or healthcare, making large models a practical tool for real problems. Whether you are a developer or an AI enthusiast, you can master the hard skills of building intelligent applications here and bring your ideas to life with large models!

Five steps from beginner to expert: 1. Understand AI and large models; 2. Use large models; 3. Deploy and integrate models; 4. Build large-model applications; 5. Fine-tune models.

RAG development path: build your first large-model application. The RAG path is a sharp tool against the large-model "hallucination" problem. Systematically study vector-database construction, semantic-retrieval optimization, and dynamic knowledge-fusion strategies, and you will gain the ability to build precise intelligent Q&A and professional knowledge-base systems. When your RAG work makes a company's knowledge management efficient and its customer-service bot noticeably "smarter", you are the hero behind the industry's change.

Four steps from beginner to expert: 1. First look at RAG; 2. Master key RAG techniques; 3. Build a RAG application; 4. Optimize RAG performance.

AI Agent development path: master the core skills of AI Agent development. The AI Agent path is full of the future, exploring the world of autonomous agent decision-making. Study task planning, multi-agent collaboration, and other frontier techniques, then build an autonomous office assistant or an automated-operations agent with your own hands.

Five steps from principles to practice: 1. First look at AI Agents; 2. Common AI Agent architectures; 3. Deep dive into common AI Agent tools; 4. Agent tool-integration optimization: MCP; 5. Build an AI Agent application.

Even better, we have prepared a "deluxe learning package"! Developer Space: a cloud growth space built for developers, preloaded with Huawei core-technology tools and resources, one-stop services for continuous exploration, and a complete large-model learning library: labs, courses, cases, and Huawei Cloud Developer Certification all in one place, with authoritative certificates to accelerate your skills.

Stop hesitating, stop watching from the sidelines! Rather than waiting passively in the AI wave, take the initiative: start the large-model learning path, unlock your AI future, and become the next technologist to change the world. Click the link and begin this journey full of surprises and challenges!
-
While others in the HarmonyOS ecosystem already earn 30k+ RMB a month from freelance work, are you still grinding away in Java? As large models reshape IT roles and traditional ops jobs disappear in batches, Shenzhen Human Resources and Social Security once again serves up a survival-skills feast in new-generation information technologies: xinchuang (IT application innovation), HarmonyOS, Kylin OS, and more:

• Xinchuang intelligent computing and large-model technology course
• Open-source Gauss (openGauss) database technology course
• HarmonyOS native application development course
• OpenHarmony device application technology course
• Kylin-OS-based xinchuang base-software adaptation, migration, and operations course

[Theory lectures, live hands-on practice, team challenges] No more dry theory stacking: industry experts serve as instructors, and 5 free courses help you seize the high ground of information technology and level all the way up! Most importantly, everything is completely free! Open the door to growth and transformation without spending a cent. What are the requirements, and how do you register? Read on!

I. Eligibility

Applicants should have an industry or academic background related to new-generation electronic information application technology and meet one of the following:
1. Registered Shenzhen residents;
2. People currently paying social insurance in Shenzhen;
3. People registered as unemployed in Shenzhen;
4. Shenzhen graduates within two years of leaving school who have completed real-name registration as unemployed with the city's public employment services;
5. Graduating-year students (graduating between January 1 and December 31, 2025) of Shenzhen universities, or Shenzhen-registered students at universities elsewhere (including senior-technician and pre-technician classes at technician colleges).

Notes: (1) Each person may attend only one project-based training per year. If you attended a 2024 project-based training but completed less than 50% of the required hours, you are unfortunately not eligible for the 2025 project-based training. (2) Within the same year, new-type enterprise apprenticeship training, student apprenticeship training, and technician-trainee apprenticeship training may each be taken only once, and none may be combined with project-based training.

Wait, there's more: an extra subsidy! If you meet one of the following:
• Shenzhen residents with employment difficulties
• Members of Shenzhen zero-employment households
• Employed persons with disabilities in Shenzhen
• Members of Shenzhen urban minimum-living-allowance households
• Rural trainees among Shenzhen "two-after" youth within two years of graduation
• People lifted out of poverty who are seeking work in Shenzhen
you can not only take the courses for free but also receive a 500 RMB living subsidy 💴!

Now let's look at what you'll learn!

II. Course content and registration

New-generation electronic information application technology project-based training. Guidance: Shenzhen Human Resources and Social Security Bureau. Organizer: Shenzhen Vocational Skills Training Guidance Center. Host: Shenzhen Polytechnic University.

1. Xinchuang intelligent computing and large-model technology (6 days, ~300 seats). Main content: DeepSeek model setup and optimization on the Ascend platform; private-scenario large-model deployment on Huawei Cloud Ascend compute; autonomous training of private large models based on Ascend and DeepSeek; innovative large-model applications in e-commerce. Registration QR code and QQ exchange group available.

2. Open-source Gauss database technology (6 days, ~250 seats). Main content: Gauss database installation and object-management hands-on practice; scenario-based database labs; database AI strategies and techniques; data security management and protection. Registration QR code and QQ exchange group available.

3. HarmonyOS native application development (6 days, ~250 seats). Main content: building a training cloud platform on the ArkTS UI framework; a native office check-in system; a real-time social app on the Next version with DeepSeek-powered chat Q&A; a music-recommendation app using HarmonyOS service widgets; a native health-monitoring app. Registration QR code and QQ exchange group available.

4. OpenHarmony device application technology (6 days, ~250 seats). Main content: OpenHarmony setup and configuration; OpenHarmony device-driver development and integration; HAL-layer development on OpenHarmony; smart-home hardware and software development on OpenHarmony. Registration QR code and QQ exchange group available.

5. Kylin-OS-based xinchuang base-software adaptation, migration, and operations (6 days, ~250 seats). Main content: management and use of Kylin Desktop OS V10; adaptation-testing fundamentals and hardware/software adaptation-testing skills. Registration QR code and QQ exchange group available.

III. Growth and rewards
1. Master practical skills, raise your professional competitiveness, and improve your career path;
2. Meet the attendance requirement and pass the assessment to receive the Shenzhen Vocational Skills Upgrading Training Certificate;
3. Optionally sit for authoritative industry certifications (exam fees not included): HCIA-AI, HCIA-openGauss, HarmonyOS Application Developer, OpenAtom OpenHarmony talent certification, KYCA, KYCP.

IV. Class schedule
- July 29 to September 30: weekday classes (Monday to Saturday) and weekend classes (single day, Saturday or Sunday)
- July 29 to October 20: weekend classes (single day, Saturday or Sunday)

V. Contact
Teacher Huang: 13528095312 (same number on WeChat); Teacher Zhou: 0755-26019607. Hours: workdays 9:00-18:00; at other times, contact the QQ-group staff.

VI. Venue
Shenzhen Polytechnic University, Xilihu Campus (Information Building). Green travel suggested: 800 m walk from Exit F of Xili Station on Shenzhen Metro Line 5; bus stop "Shenzhen Polytechnic University (Xilihu Campus)", served by routes M197, M182, M176, M492, Peak Express 59, 325, M535, M217, 67, 326, and others.

Don't hesitate: seize this rare opportunity and transform yourself through this free public training. Register now and start your growth journey!
-
团队名称格式得分精度得分西北智联161153CEATRG0149.5hid_t7v_sdh548dhkkz0148.5hid_cwfo0xxj8regp6r0146鸿蒙极客队162143.48BUPT-ParCIS160136.34notrickno154136.08武汉船院计算机2307#2153133.49ECNU_ELRM139132.15全都对队162131.56judgeyang98131myf gogogo98129.11二进制萝卜培育中心98127擎狮0126纳算力克大工坊105125.45不玊之客劣等兵朴昌罗桂夏0123奇点0122.73点子王0122.45破晓者155120.4想去研究大模型108119.9bupt735162117.47Decoder-Only161116123462115[object Object]0114.88浙安院云计算63112.54挑战杯揭榜挂帅华为0111.75说人话0111.11昇腾推理智速引擎0109.08试试0107.1蒜鸟你搞不赢队0106.55ken0102.35hid_b2ydyl88e3z7tqc160102.02智在必得098.79yangs_wdxw12096.88TEMP12096.88这对吗096.03拳头花可火091.64西北文科大学队13891武汉船院计算机2307090.71昇腾芯链088.33hid_77fv2kg9-fgvjfg087.81三角矩阵085.99随便起一个13585.08永宁永胜9883.93123--083.79PACKPACK11378.32华东理工大学AIMC实验室074.49昇腾智推大模型072.53CodeWisdom070.54GT-ejdkd068.7Create3267扬花落满肩064.14马桶蹲累了160.97急急急060.11AAA建材王哥058璃月医科大学孤云阁校区16257.22lab3083851.42fengerhu14946被资本做局041.24智枢拓界025.66hid_x7qejp3bft91lsd16012.02hid_hyh--co6xfj6nhk010challenger X07A-team02hw035532519500.4CCD队00.4没机基队00
-
I built an image inference service on a 310P; the yolov5l model runs at 100 fps. Is this the card's performance ceiling?
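Whether 100 fps is the card's ceiling depends on what the number includes: image decode, resize, and NMS often dominate end-to-end latency, and batch size matters. A small generic timing harness (our own sketch, not tied to any Ascend API) can help separate pure-inference throughput from end-to-end throughput:

```python
import time

def throughput(fn, inputs, warmup=10):
    """Frames per second of fn over inputs, excluding a warmup phase."""
    for x in inputs[:warmup]:
        fn(x)                       # warm up caches / one-time graph compilation
    start = time.perf_counter()
    for x in inputs:
        fn(x)
    return len(inputs) / (time.perf_counter() - start)

# Usage sketch: compare throughput(infer, preprocessed_frames) against
# throughput(lambda f: infer(preprocess(f)), raw_frames) to see how much of
# the 100 fps budget is spent outside the NPU itself.
```

If pure inference measures far above 100 fps, the bottleneck is in the pre/post-processing pipeline rather than the card.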
-
截止7月16日14:30前,各团队初赛A榜注:仅最高精度分对应的格式得分,如最高精度得分相同,以最早提交的最高精度得分为准。团队名称格式得分精度得分西北智联161153hid_t7v_sdh548dhkkz0148.5hid_cwfo0xxj8regp6r0146鸿蒙极客队162143.48BUPT-ParCIS160136.34CEATRG0135.5武汉船院计算机2307#2153133.49ECNU_ELRM139132.15全都对队162131.56judgeyang98131myf gogogo98129.11二进制萝卜培育中心98127擎狮0126纳算力克大工坊105125.45不玊之客劣等兵朴昌罗桂夏0123奇点0122.73点子王0122.45破晓者155120.4bupt735162117.47Decoder-Only161116123462115浙安院云计算63112.54试试0107.1ken0102.35hid_b2ydyl88e3z7tqc160102.02蒜鸟你搞不赢队0101.19智在必得098.79说人话098.77这对吗096.03TEMP095.17拳头花可火091.64武汉船院计算机2307090.71挑战杯揭榜挂帅华为088.8hid_77fv2kg9-fgvjfg087.81随便起一个13585.08昇腾芯链084.74yangs_wdxw082.65三角矩阵081.22PACKPACK11378.32永宁永胜075.39华东理工大学AIMC实验室074.49昇腾智推大模型072.53CodeWisdom070.54GT-ejdkd068.7急急急060.11璃月医科大学孤云阁校区16257.22想去研究大模型056.65马桶蹲累了056.05fengerhu14946扬花落满肩118.43Create017.11hid_x7qejp3bft91lsd16012.02智枢拓界011.59hid_hyh--co6xfj6nhk010challenger X07A-team02hw035532519500.4CCD队00.4没机基队00
-
截止7月9日14点00分前,各团队初赛A榜注:仅最高精度分对应的格式得分,如最高精度得分相同,以最早提交的最高精度得分为准。团队名称格式得分精度得分hid_t7v_sdh548dhkkz0148.5hid_cwfo0xxj8regp6r0146鸿蒙极客队162143.48CEATRG0135.5武汉船院计算机2307#2153133.49ECNU_ELRM144131.34擎狮0126纳算力克大工坊105125.45奇点0122.73点子王0122.45破晓者155120.4bupt735162117.47西北智联133117Decoder-Only161116123462115judgeyang0114.07浙安院云计算63112.54二进制萝卜培育中心0112.06BUPT-ParCIS76108.52myf gogogo0107.23试试0107.1不玊之客劣等兵朴昌罗桂夏0106.4ken0102.35hid_b2ydyl88e3z7tqc160102.02智在必得098.79这对吗096.03蒜鸟你搞不赢队092.13拳头花可火091.64武汉船院计算机2307090.71说人话089.05hid_77fv2kg9-fgvjfg087.81TEMP085.96yangs_wdxw082.65昇腾芯链082.39永宁永胜075.39华东理工大学AIMC实验室074.49挑战杯揭榜挂帅华为073.73CodeWisdom070.54GT-ejdkd068.7急急急060.11璃月医科大学孤云阁校区16257.22想去研究大模型056.65全都对队4846fengerhu14946扬花落满肩118.43Create017.11hid_x7qejp3bft91lsd16012.02hid_hyh--co6xfj6nhk010智枢拓界09.56challenger X07A-team02hw035532519500.4没机基队00
-
Learn artificial intelligence systematically and master hardcore AI skills: your walkthrough is here!!! The Ascend AI zone's carefully built six-step progressive learning path is now live, giving learners a comprehensive, systematic route to master key technologies and drive AI innovation and application. Whether you are a beginner or an experienced developer, you can find a direction that fits you in this path, moving from entry level to mastery along a clear route, step by step up the AI mountain!

I. Artificial intelligence: Python, the master key to the AI world. Python is the "lingua franca" of AI and one of its most efficient tools: it lets you focus on AI algorithms and business logic rather than wrestling with complicated syntax. Whether you are starting machine learning or building complex large-model applications, Python is an indispensable foundation. Step one starts with hands-on Python: from basic syntax to advanced programming techniques to real project practice, building core Python knowledge and practical skill.
① Hands-on Python ② Python data processing ③ Practical AI libraries ④ Hands-on machine learning ⑤ Hands-on deep learning

II. Ascend fundamentals: entering Ascend AI. In this stage learners formally enter the Ascend AI domain. This module introduces the full-stack technology of Ascend AI hardware: it dissects the core modules of the Ascend architecture, explains the performance characteristics and scenario-fit strategies of the 910/310 and other AI processors, and walks through deployment of the Atlas full-stack products, building chip-level tuning capability for Ascend AI solution development. It also covers the fundamentals of the CANN (Compute Architecture for Neural Networks) heterogeneous computing architecture: learners master high-performance application development with AscendCL, gain a deep understanding of GE graph-engine optimization, and become fluent with core tools such as ATC, AOE, and AMCT… systematically building high-performance operator development skills for Ascend AI processors.
① Ascend hardware ② CANN heterogeneous computing architecture ③ Ascend operator development

III. Ascend model development: mastering the core of model building. This module dives into model development on the Ascend platform. It systematically builds full-stack PyTorch development skills on Ascend, covering environment setup, model migration, performance optimization, and accuracy tuning end to end, and provides a seamless migration stack that deeply integrates the MindSpore framework with Ascend hardware.
① Ascend PyTorch development ② Ascend model-development toolchain ③ Ascend PyTorch third-party libraries ④ Classic PyTorch tasks on Ascend ⑤ Ascend MindSpore development ⑥ Ascend MindSpore migration development

IV. Large-model development: exploring Ascend's large-model frameworks. In this module, learners study the principles, architectures, training methods, and efficient inference of large models, focusing on the core Ascend training technology in the MindSpeed framework along three axes: parallel-architecture innovation, compute acceleration, and extreme memory compression.
① Ascend large-model training frameworks ② Ascend large-model inference frameworks

V. Application development: bringing AI technology to the ground. Application development is the key step that turns AI technology into real value. In this module, learners build efficient interaction with large language models: from prompt-engineering basics to advanced use, mastering model behavior control, task decomposition, and multi-turn dialogue orchestration through precise prompt design, improving both output quality and reliability. The module focuses on core LangChain practice to achieve automated decomposition and orchestrated execution of complex tasks, details how classic models such as Pangu-Pro-MoE and DeepSeek are applied to prompt engineering, code writing, information extraction, and Agent building, and develops intelligent-application skills with the LlamaIndex framework.
① Prompt engineering ② LangChain large-model application development ③ Classic large-model application development ④ LlamaIndex large-model applications

VI. AI4S: extending AI into new domains. AI4S (AI for Science) is the application of AI to scientific research, bringing new methods and ideas. In this module, learners deeply fuse AI with frontier life-science technology, focusing on intelligent modeling of biological sequences to provide a data-driven research paradigm for precision medicine and bioengineering. It also focuses on AI applications and frontiers in earth science, training learners to solve earth-system problems with machine learning and deep learning. Through this module, learners widen their view of AI applications and build technical support for future innovation in scientific research.
① AI and life science ② AI and math/physics solvers ③ AI and earth science

In addition, the Ascend AI zone's learning paths cover other popular tracks such as classic large-model application development, using AI libraries, and hands-on machine learning. Through hands-on operation and case studies, learners gain a deeper understanding and command of the technology. The Ascend AI zone offers a comprehensive, systematic, step-by-step learning framework that helps learners master key Ascend technologies and drive AI innovation and application. Whether you hope to start a career in AI or pursue breakthroughs and innovation, you can find your own growth path here. Join the Ascend AI learning journey, click here >> cid:link_0, and explore the unlimited possibilities of the AI world!
Recommended livestreams
-
New CodeArts skills, new AI productivity: from automatic video generation to open-source project analysis. 2026/04/08 Wednesday 19:00-21:00
Tong Deli, Director of Developer Ecosystem Operations, Huawei Cloud / He Wenqiang, head of AI-driven efficiency at a drone company
This Huawei Cloud CodeArts Skill hands-on session focuses on two AI development scenarios: through live teaching, you will build a Skill that auto-generates videos with AI coding and implement intelligent knowledge extraction from popular GitHub open-source projects, mastering the full Skill development workflow step by step and using AI to boost R&D efficiency and content productivity.
Replay available -
Huawei Cloud CodeArts: full-feature hands-on with a zero-code intelligent stock-decision platform. 2026/04/18 Saturday 10:00-12:00
Qin Quande, researcher at the ChinaSoft International Education Excellence Research Institute, Huawei Cloud gold-medal lecturer, cloud-native technology expert
The project uses the Tushare API to fetch real-time market data, applies a Transformer model for time-series prediction and up/down analysis, and integrates the DeepSeek API for intelligent interpretation. It also leverages the code-agent capability of Huawei Cloud CodeArts to push code to a cloud repository in one click, establishing an efficient, collaborative new paradigm for team development. Developers can get started quickly and build, from scratch, a complete product for stock screening, intelligent analysis, and risk control.
Replay available -
Huawei Cloud CodeArts fully upgraded: multi-session parallelism and multi-agent collaboration. 2026/05/08 Friday 19:00-21:00
Wang Yinan, Huawei Cloud CodeArts product expert; Zhang Jiaran, Huawei Cloud CodeArts engineer; Hu Qi, Huawei Cloud HCDE; Cheng Shijie, Huawei Cloud HCDG
The April release of Huawei Cloud CodeArts is a major upgrade. This livestream interprets the April product features in depth, combining feature walkthroughs, live demos, real-world cases, and design innovation to showcase CodeArts' capabilities in multi-session parallelism and multi-agent collaboration, helping developers work more efficiently.
Live now