• [Technical Article] UIE in Practice: Entity Extraction (Ride-Hailing Data and Waybills)
Project link (can be forked and used directly): PaddleNLP UIE in Practice: Entity Extraction (Ride-Hailing Data and Waybills)

0. Background

This project demonstrates how to fine-tune a model on a small number of labeled samples so that it quickly and accurately extracts fields such as destination, departure point, time, and ride fare from waybill and ride-hailing text, producing structured information. This helps logistics practitioners extract useful information efficiently and lowers the cost of form filling for customers.

Datasets:

waybill.jsonl is the waybill dataset:

    {"id": 57, "text": "昌胜远黑龙江省哈尔滨市南岗区宽桥街28号18618391296", "relations": [], "entities": [{"id": 111, "start_offset": 0, "end_offset": 3, "label": "姓名"}, {"id": 112, "start_offset": 3, "end_offset": 7, "label": "省份"}, {"id": 113, "start_offset": 7, "end_offset": 11, "label": "城市"}, {"id": 114, "start_offset": 11, "end_offset": 14, "label": "县区"}, {"id": 115, "start_offset": 14, "end_offset": 20, "label": "详细地址"}, {"id": 116, "start_offset": 20, "end_offset": 31, "label": "电话"}]}
    {"id": 58, "text": "易颖18500308469山东省烟台市莱阳市富水南路1号", "relations": [], "entities": [{"id": 118, "start_offset": 0, "end_offset": 2, "label": "姓名"}, {"id": 119, "start_offset": 2, "end_offset": 13, "label": "电话"}, {"id": 120, "start_offset": 13, "end_offset": 16, "label": "省份"}, {"id": 121, "start_offset": 16, "end_offset": 19, "label": "城市"}, {"id": 122, "start_offset": 19, "end_offset": 22, "label": "县区"}, {"id": 123, "start_offset": 22, "end_offset": 28, "label": "详细地址"}]}

doccano_ext.jsonl is the ride-hailing dataset:

    {"id": 1, "text": "昨天晚上十点加班打车回家58元", "relations": [], "entities": [{"id": 0, "start_offset": 0, "end_offset": 6, "label": "时间"}, {"id": 1, "start_offset": 11, "end_offset": 12, "label": "目的地"}, {"id": 2, "start_offset": 12, "end_offset": 14, "label": "费用"}]}
    {"id": 2, "text": "三月三号早上12点46加班,到公司54", "relations": [], "entities": [{"id": 3, "start_offset": 0, "end_offset": 11, "label": "时间"}, {"id": 4, "start_offset": 15, "end_offset": 17, "label": "目的地"}, {"id": 5, "start_offset": 17, "end_offset": 19, "label": "费用"}]}
    {"id": 3, "text": "8月31号十一点零四工作加班五十块钱", "relations": [], "entities": [{"id": 6, "start_offset": 0, "end_offset": 10, "label": "时间"}, {"id": 7, "start_offset": 14, "end_offset": 16, "label": "费用"}]}
    {"id": 4, "text": "5月17号晚上10点35分加班打车回家,36块五", "relations": [], "entities": [{"id": 8, "start_offset": 0, "end_offset": 13, "label": "时间"}, {"id": 1, "start_offset": 18, "end_offset": 19, "label": "目的地"}, {"id": 9, "start_offset": 20, "end_offset": 24, "label": "费用"}]}
    {"id": 5, "text": "2009年1月份通讯费一百元", "relations": [], "entities": [{"id": 10, "start_offset": 0, "end_offset": 7, "label": "时间"}, {"id": 11, "start_offset": 11, "end_offset": 13, "label": "费用"}]}

Result preview

Input:

    城市内交通费7月5日金额114广州至佛山
    从百度大厦到龙泽苑东区打车费二十元
    上海虹桥高铁到杭州时间是9月24日费用是73元
    上周末坐动车从北京到上海花费五十块五毛
    昨天北京飞上海话费一百元

Output:

    {"出发地": [{"text": "广州", "start": 15, "end": 17, "probability": 0.9073772252165782}], "目的地": [{"text": "佛山", "start": 18, "end": 20, "probability": 0.9927365183877761}], "时间": [{"text": "7月5日", "start": 6, "end": 10, "probability": 0.9978010396512218}]}
    {"出发地": [{"text": "百度大厦", "start": 1, "end": 5, "probability": 0.968825147409472}], "目的地": [{"text": "龙泽苑东区", "start": 6, "end": 11, "probability": 0.9877913072493669}]}
    {"目的地": [{"text": "杭州", "start": 7, "end": 9, "probability": 0.9929172180094881}], "时间": [{"text": "9月24日", "start": 12, "end": 17, "probability": 0.9953342057701597}]}
    {"出发地": [{"text": "北京", "start": 7, "end": 9, "probability": 0.973048366717471}], "目的地": [{"text": "上海", "start": 10, "end": 12, "probability": 0.988486130309397}], "时间": [{"text": "上周末", "start": 0, "end": 3, "probability": 0.9977407699595275}]}
    {"出发地": [{"text": "北京", "start": 2, "end": 4, "probability": 0.974188953533556}], "目的地": [{"text": "上海", "start": 5, "end": 7, "probability": 0.9928200521486445}], "时间": [{"text": "昨天", "start": 0, "end": 2, "probability": 0.9731559534465504}]}

1. Loading the datasets (waybill data and ride-hailing data)

doccano_file: the annotation file exported from doccano.
save_dir: directory where the training data is saved; defaults to the data directory.
negative_ratio: maximum ratio of negative examples. This parameter only applies to extraction tasks; constructing a suitable number of negative examples can improve model performance. The number of negatives depends on the actual number of labels: maximum number of negatives = negative_ratio * number of positives. This parameter only applies to the training set and defaults to 5. To keep the evaluation metrics accurate, the validation and test sets are built with all negative examples by default.
splits: proportions of the training, validation, and test sets when splitting the data. The default [0.8, 0.1, 0.1] splits the data 8:1:1.
task_type: the task type; extraction and classification tasks are available.
options: class labels for a classification task; only applies to classification tasks. Defaults to ["正向", "负向"].
prompt_prefix: prompt prefix for a classification task; only applies to classification tasks. Defaults to "情感倾向".
is_shuffle: whether to shuffle the dataset randomly; defaults to True.
seed: random seed; defaults to 1000.
separator: separator between the entity category/aspect and the classification label; only applies to entity-level or aspect-level classification tasks. Defaults to "##".

    !python doccano.py \
        --doccano_file ./data/doccano_ext.jsonl \
        --task_type 'ext' \
        --save_dir ./data \
        --splits 0.8 0.1 0.1 \
        --negative_ratio 5

    [2022-07-14 11:34:26,474] [ INFO] - Converting doccano data...
    100%|████████████████████████████████████████| 40/40 [00:00<00:00, 42560.16it/s]
    [2022-07-14 11:34:26,477] [ INFO] - Adding negative samples for first stage prompt...
    100%|███████████████████████████████████████| 40/40 [00:00<00:00, 161009.75it/s]
    [2022-07-14 11:34:26,478] [ INFO] - Converting doccano data...
    100%|██████████████████████████████████████████| 5/5 [00:00<00:00, 21754.69it/s]
    [2022-07-14 11:34:26,479] [ INFO] - Adding negative samples for first stage prompt...
    100%|██████████████████████████████████████████| 5/5 [00:00<00:00, 44057.82it/s]
    [2022-07-14 11:34:26,479] [ INFO] - Converting doccano data...
    100%|██████████████████████████████████████████| 5/5 [00:00<00:00, 26181.67it/s]
    [2022-07-14 11:34:26,480] [ INFO] - Adding negative samples for first stage prompt...
    100%|██████████████████████████████████████████| 5/5 [00:00<00:00, 45689.59it/s]
    [2022-07-14 11:34:26,482] [ INFO] - Save 160 examples to ./data/train.txt.
    [2022-07-14 11:34:26,482] [ INFO] - Save 20 examples to ./data/dev.txt.
    [2022-07-14 11:34:26,482] [ INFO] - Save 20 examples to ./data/test.txt.
    [2022-07-14 11:34:26,482] [ INFO] - Finished!
    It takes 0.01 seconds

Sample of the generated output:

    {"content": "上海到北京机票1320元", "result_list": [{"text": "上海", "start": 0, "end": 2}], "prompt": "出发地"}
    {"content": "上海到北京机票1320元", "result_list": [{"text": "北京", "start": 3, "end": 5}], "prompt": "目的地"}
    {"content": "上海到北京机票1320元", "result_list": [{"text": "1320", "start": 7, "end": 11}], "prompt": "费用"}
    {"content": "上海虹桥到杭州东站高铁g7555共73元时间是10月14日", "result_list": [{"text": "上海虹桥", "start": 0, "end": 4}], "prompt": "出发地"}
    {"content": "上海虹桥到杭州东站高铁g7555共73元时间是10月14日", "result_list": [{"text": "杭州东站", "start": 5, "end": 9}], "prompt": "目的地"}
    {"content": "上海虹桥到杭州东站高铁g7555共73元时间是10月14日", "result_list": [{"text": "73", "start": 17, "end": 19}], "prompt": "费用"}
    {"content": "上海虹桥到杭州东站高铁g7555共73元时间是10月14日", "result_list": [{"text": "10月14日", "start": 23, "end": 29}], "prompt": "时间"}
    {"content": "昨天晚上十点加班打车回家58元", "result_list": [{"text": "昨天晚上十点", "start": 0, "end": 6}], "prompt": "时间"}
    {"content": "昨天晚上十点加班打车回家58元", "result_list": [{"text": "家", "start": 11, "end": 12}], "prompt": "目的地"}
    {"content": "昨天晚上十点加班打车回家58元", "result_list": [{"text": "58", "start": 12, "end": 14}], "prompt": "费用"}
    {"content": "2月20号从南山到光明二十元", "result_list": [{"text": "2月20号", "start": 0, "end": 5}], "prompt": "时间"}

2. Model training

    !python finetune.py \
        --train_path "./data/train.txt" \
        --dev_path "./data/dev.txt" \
        --save_dir "./checkpoint" \
        --learning_rate 1e-5 \
        --batch_size 8 \
        --max_seq_len 512 \
        --num_epochs 100 \
        --model "uie-base" \
        --seed 1000 \
        --logging_steps 10 \
        --valid_steps 50 \
        --device "gpu"

Part of the training output (the full output is collapsed):

    [2022-07-12 15:09:47,643] [ INFO] - global step 250, epoch: 13, loss: 0.00045, speed: 3.90 step/s
    [2022-07-12 15:09:47,910] [ INFO] - Evaluation precision: 1.00000, recall: 1.00000, F1: 1.00000
    [2022-07-12 15:09:50,399] [ INFO] - global step 260, epoch: 13, loss: 0.00043, speed: 4.02 step/s
    [2022-07-12 15:09:52,966] [ INFO] - global step 270, epoch: 14, loss: 0.00042, speed: 3.90 step/s
    [2022-07-12 15:09:55,464] [ INFO] - global step 280, epoch: 14, loss: 0.00040, speed: 4.00 step/s
    [2022-07-12 15:09:58,028] [ INFO] - global step 290, epoch: 15, loss: 0.00039, speed: 3.90 step/s
    [2022-07-12 15:10:00,516] [ INFO] - global step 300, epoch: 15, loss: 0.00038, speed: 4.02 step/s
    [2022-07-12 15:10:00,781] [ INFO] - Evaluation precision: 1.00000, recall: 1.00000, F1: 1.00000
    [2022-07-12 15:10:03,348] [ INFO] - global step 310, epoch: 16, loss: 0.00036, speed: 3.90 step/s
    [2022-07-12 15:10:05,836] [ INFO] - global step 320, epoch: 16, loss: 0.00035, speed: 4.02 step/s
    [2022-07-12 15:10:08,393] [ INFO] - global step 330, epoch: 17, loss: 0.00034, speed: 3.91 step/s
    [2022-07-12 15:10:10,888] [ INFO] - global step 340, epoch: 17, loss: 0.00033, speed: 4.01 step/s

A GPU environment is recommended; otherwise training may run out of memory. On CPU, you can change model to uie-tiny and lower batch_size accordingly.

To improve accuracy, set --num_epochs higher and train for longer.

Configurable parameters:

**train_path:** path to the training set file.
**dev_path:** path to the validation set file.
**save_dir:** directory where the model is saved; defaults to ./checkpoint.
**learning_rate:** learning rate; defaults to 1e-5.
**batch_size:** batch size; adjust it to fit your GPU memory and lower it if you run out of memory. Defaults to 16.
**max_seq_len:** maximum text segment length; inputs longer than this are split automatically. Defaults to 512.
**num_epochs:** number of training epochs; defaults to 100.
**model:** the model to fine-tune; uie-base and uie-tiny are available, and the default is uie-base.
**seed:** random seed; defaults to 1000.
**logging_steps:** number of steps between log messages; defaults to 10.
**valid_steps:** number of steps between evaluations; defaults to 100.
**device:** device used for training; either cpu or gpu.

3. Model evaluation

    !python evaluate.py \
        --model_path ./checkpoint/model_best \
        --test_path ./data/test.txt \
        --batch_size 16 \
        --max_seq_len 512

    [2022-07-11 13:41:23,831] [ INFO] - -----------------------------
    [2022-07-11 13:41:23,831] [ INFO] - Class Name: all_classes
    [2022-07-11 13:41:23,832] [ INFO] - Evaluation Precision: 1.00000 | Recall: 1.00000 | F1: 1.00000
    [2022-07-11 13:41:35,024] [ INFO] - -----------------------------
    [2022-07-11 13:41:35,024] [ INFO] - Class Name: 出发地
    [2022-07-11 13:41:35,024] [ INFO] - Evaluation Precision: 1.00000 | Recall: 1.00000 | F1: 1.00000
    [2022-07-11 13:41:35,139] [ INFO] - -----------------------------
    [2022-07-11 13:41:35,139] [ INFO] - Class Name: 目的地
    [2022-07-11 13:41:35,139] [ INFO] - Evaluation Precision: 1.00000 | Recall: 1.00000 | F1: 1.00000
    [2022-07-11 13:41:35,246] [ INFO] - -----------------------------
    [2022-07-11 13:41:35,246] [ INFO] - Class Name: 费用
    [2022-07-11 13:41:35,246] [ INFO] - Evaluation Precision: 1.00000 | Recall: 1.00000 | F1: 1.00000
    [2022-07-11 13:41:35,313] [ INFO] - -----------------------------
    [2022-07-11 13:41:35,313] [ INFO] - Class Name: 时间
    [2022-07-11 13:41:35,313] [ INFO] - Evaluation Precision: 1.00000 | Recall: 1.00000 | F1: 1.00000

model_path: directory of the model to evaluate; it must contain the weight file model_state.pdparams and the configuration file model_config.json.
test_path: the test set file used for evaluation.
batch_size: batch size; adjust it to fit your machine. Defaults to 16.
max_seq_len: maximum text segment length; inputs longer than this are split automatically. Defaults to 512.
model: the model to use; uie-base, uie-medium, uie-mini, uie-micro, and uie-nano are available, and the default is uie-base.
debug: whether to enable debug mode, which evaluates each positive class separately. This mode is intended only for model debugging and is off by default.

4. Prediction

    from pprint import pprint
    import json
    from paddlenlp import Taskflow

    def openreadtxt(file_name):
        """Read a UTF-8 text file and return its non-empty lines."""
        with open(file_name, 'r', encoding='UTF-8') as file:
            # Strip the trailing newline from each row and skip blank rows
            return [row.strip() for row in file if row.strip()]

    data_input = openreadtxt('./input/nlp.txt')
    schema = ['出发地', '目的地', '时间']  # departure point, destination, time
    few_ie = Taskflow('information_extraction', schema=schema, batch_size=1, task_path='./checkpoint/model_best')
    results = few_ie(data_input)

    # "w+" creates the file if it does not exist and overwrites any existing content
    with open("./output/test.txt", "w+", encoding='UTF-8') as f:
        for result in results:
            # json.dumps escapes non-ASCII characters by default;
            # ensure_ascii=False writes the Chinese text as-is
            line = json.dumps(result, ensure_ascii=False)
            f.write(line + "\n")

    print("Results exported")

Input file:

    城市内交通费7月5日金额114广州至佛山
    从百度大厦到龙泽苑东区打车费二十元
    上海虹桥高铁到杭州时间是9月24日费用是73元
    上周末坐动车从北京到上海花费五十块五毛
    昨天北京飞上海话费一百元

Output:

    {"出发地": [{"text": "广州", "start": 15, "end": 17, "probability": 0.9073772252165782}], "目的地": [{"text": "佛山", "start": 18, "end": 20, "probability": 0.9927365183877761}], "时间": [{"text": "7月5日", "start": 6, "end": 10, "probability": 0.9978010396512218}]}
    {"出发地": [{"text": "百度大厦", "start": 1, "end": 5, "probability": 0.968825147409472}], "目的地": [{"text": "龙泽苑东区", "start": 6, "end": 11, "probability": 0.9877913072493669}]}
    {"目的地": [{"text": "杭州", "start": 7, "end": 9, "probability": 0.9929172180094881}], "时间": [{"text": "9月24日", "start": 12, "end": 17, "probability": 0.9953342057701597}]}
    {"出发地": [{"text": "北京", "start": 7, "end": 9, "probability": 0.973048366717471}], "目的地": [{"text": "上海", "start": 10, "end": 12, "probability": 0.988486130309397}], "时间": [{"text": "上周末", "start": 0, "end": 3, "probability": 0.9977407699595275}]}
    {"出发地": [{"text": "北京", "start": 2, "end": 4, "probability": 0.974188953533556}], "目的地": [{"text": "上海", "start": 5, "end": 7, "probability": 0.9928200521486445}], "时间": [{"text": "昨天", "start": 0, "end": 2, "probability": 0.9731559534465504}]}

5. Visualization

For detailed VisualDL documentation, see: cid:link_1. For the concrete implementation, refer to the project code; the core step is to add and initialize a log writer.

[Figure: VisualDL result display]

6. Tip: getting Paddle open-source datasets

Dataset website: cid:link_0

| Dataset | Description | How to load |
| --- | --- | --- |
| CoLA | Single-sentence classification, binary: is the sentence grammatically acceptable | paddlenlp.datasets.load_dataset('glue','cola') |
| SST-2 | Single-sentence classification, binary: sentiment polarity of the sentence | paddlenlp.datasets.load_dataset('glue','sst-2') |
| MRPC | Sentence-pair matching, binary: do the two sentences have the same meaning | paddlenlp.datasets.load_dataset('glue','mrpc') |
| STSB | Sentence-pair similarity, scored from 0 to 5 | paddlenlp.datasets.load_dataset('glue','sts-b') |
| QQP | Binary classification: are the two sentences equivalent or not | paddlenlp.datasets.load_dataset('glue','qqp') |
| MNLI | Sentence pairs made of a premise and a hypothesis, related by entailment, contradiction, or neutral; three-class classification | paddlenlp.datasets.load_dataset('glue','mnli') |
| QNLI | Binary classification: does the sentence entail the answer to the question or not | paddlenlp.datasets.load_dataset('glue','qnli') |
| RTE | Binary classification: do sentence 1 and sentence 2 entail each other | paddlenlp.datasets.load_dataset('glue','rte') |
| WNLI | Binary classification: are the two sentences related or not | paddlenlp.datasets.load_dataset('glue','wnli') |
| LCQMC | A Large-scale Chinese Question Matching Corpus, a semantic matching dataset | paddlenlp.datasets.load_dataset('lcqmc') |

The API provided by paddlenlp makes loading data straightforward. If you want to save a dataset locally, the following code writes it to a file.

    # Load the ChnSentiCorp Chinese review sentiment analysis dataset
    import json
    from paddlenlp.datasets import load_dataset

    train_ds, dev_ds, test_ds = load_dataset("chnsenticorp", splits=["train", "dev", "test"])

    # "w+" creates the file if it does not exist and overwrites any existing content
    with open("./output/test2.txt", "w+", encoding='UTF-8') as f:
        for result in test_ds:
            # ensure_ascii=False writes the Chinese text as-is instead of ASCII escapes
            line = json.dumps(result, ensure_ascii=False)
            f.write(line + "\n")

7. Summary

UIE (Universal Information Extraction): Yaojie Lu et al. proposed UIE, a unified framework for universal information extraction, at ACL 2022. The framework models entity extraction, relation extraction, event extraction, sentiment analysis, and other tasks in a unified way, and gives the different tasks good transfer and generalization ability. Drawing on the method in that paper, PaddleNLP trained and open-sourced UIE, the first Chinese universal information extraction model, based on the knowledge-enhanced pre-trained model ERNIE 3.0. The model supports key information extraction without restricting the industry domain or the extraction targets, enables fast zero-shot cold start, and has strong few-shot fine-tuning ability for quickly adapting to specific extraction targets.

Advantages of UIE

Easy to use: users define extraction targets in natural language, and the model extracts the corresponding information from the input text without training. It works out of the box and covers a wide range of information extraction needs.

Lower cost, higher efficiency: earlier information extraction techniques needed large amounts of annotated data to perform well. Open-domain information extraction supports zero-shot or few-shot extraction, which greatly reduces the dependence on annotated data, cuts repeated work during development, and lowers cost while improving results.

Leading results: open-domain information extraction performs well across many scenarios and tasks.

In this post I shared UIE mainly through the entity extraction case, refining the open-source paddlenlp example, for instance by adding result visualization and file-based input and output, to make the demo project more complete. Annotation remains the main pain point; my blog describes one way to address it. My blog: cid:link_3
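The relationship between a doccano record (section 0) and the prompt-style samples that doccano.py writes (section 1) can be sketched as follows. This is a minimal illustration, not the actual doccano.py implementation: the function name `doccano_to_prompt_samples` is hypothetical, and negative-sample construction and dataset splitting are left out. It only shows that each label becomes a `prompt` and each annotated span is cut out of `text` by its offsets.

```python
import json
from collections import defaultdict


def doccano_to_prompt_samples(record):
    """Group a doccano record's entities by label and emit one sample per label.

    Hypothetical helper for illustration; positive samples only.
    """
    text = record["text"]
    spans_by_label = defaultdict(list)
    for entity in record["entities"]:
        start, end = entity["start_offset"], entity["end_offset"]
        # Offsets are character indices into text, with an exclusive end
        spans_by_label[entity["label"]].append(
            {"text": text[start:end], "start": start, "end": end}
        )
    return [
        {"content": text, "result_list": spans, "prompt": label}
        for label, spans in spans_by_label.items()
    ]


# Record taken from doccano_ext.jsonl as shown in section 0
record = {
    "id": 1,
    "text": "昨天晚上十点加班打车回家58元",
    "relations": [],
    "entities": [
        {"id": 0, "start_offset": 0, "end_offset": 6, "label": "时间"},
        {"id": 1, "start_offset": 11, "end_offset": 12, "label": "目的地"},
        {"id": 2, "start_offset": 12, "end_offset": 14, "label": "费用"},
    ],
}

samples = doccano_to_prompt_samples(record)
for sample in samples:
    print(json.dumps(sample, ensure_ascii=False))
```

Slicing `text[start:end]` is also a quick way to sanity-check annotations before training: if the slice does not match the intended span, the offsets in the exported file are off.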
  • [Technical Article] Object Detection: Faster R-CNN
Object Detection: Faster R-CNN

Object detection is an important research area in computer vision, widely used in crowd detection, pedestrian tracking, autonomous driving, medical imaging, and other fields. Unlike plain image classification, object detection aims to identify the targets in an image precisely, including both the location and the class of each object, so it can serve more high-level vision scenarios. In autonomous driving, for example, vehicles, pedestrians, and traffic signs must be recognized and located in camera images so that the driving strategy can be decided from that data.

The previous case study focused on YOLO (You Only Look Once), a one-stage object detection algorithm. This case study introduces a two-stage algorithm, Faster R-CNN, which splits object detection into two tasks: detecting target regions and recognizing their classes.

Notes:

Framework used in this case: Pytorch-1.0.0
Hardware used in this case: 8 vCPU + 64 GiB + 1 x Tesla V100-PCIE-32GB
Entering the runtime environment: follow the link to AI Gallery and click the Run in ModelArts button to enter the ModelArts runtime environment. To use a GPU, switch to it in the workspace panel on the right side of the ModelArts JupyterLab interface.
Running the code: click the triangular run button in the menu bar at the top of the page, or press Ctrl+Enter, to run the code in each cell.
JupyterLab usage: see the "ModelArts JupyterLab User Guide".
Troubleshooting: see "Solutions to Common ModelArts JupyterLab Problems".

1. Data preparation

First, download the required code and data to the Notebook. This case trains the model on the PASCAL VOC 2007 dataset, which covers 20 object classes.

    import os
    from modelarts.session import Session
    sess = Session()

    if sess.region_name == 'cn-north-1':
        bucket_path = "modelarts-labs/notebook/DL_object_detection_faster/fasterrcnn.tar.gz"
    elif sess.region_name == 'cn-north-4':
        bucket_path = "modelarts-labs-bj4/notebook/DL_object_detection_faster/fasterrcnn.tar.gz"
    else:
        print("Please switch the region to cn-north-1 (Beijing 1) or cn-north-4 (Beijing 4)")

    if not os.path.exists('./experiments'):
        sess.download_data(bucket_path=bucket_path, path="./fasterrcnn.tar.gz")

    if os.path.exists('./fasterrcnn.tar.gz'):
        # Extract the archive
        os.system("tar -xf ./fasterrcnn.tar.gz")
        # Remove the archive
        os.system("rm -r ./fasterrcnn.tar.gz")

2. Installing and importing dependencies

    !pip install pycocotools==2.0.0
    !pip install torchvision==0.4.0
    !pip install protobuf==3.9.0

    Requirement already satisfied: pycocotools==2.0.0 in /home/ma-user/anaconda3/envs/Pytorch-1.0.0/lib/python3.6/site-packages
    You are using pip version 9.0.1, however version 20.3.3 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.
    Requirement already satisfied: torchvision==0.4.0 in /home/ma-user/anaconda3/envs/Pytorch-1.0.0/lib/python3.6/site-packages
    Requirement already satisfied: torch==1.2.0 in /home/ma-user/anaconda3/envs/Pytorch-1.0.0/lib/python3.6/site-packages (from torchvision==0.4.0)
    Requirement already satisfied: pillow>=4.1.1 in /home/ma-user/anaconda3/envs/Pytorch-1.0.0/lib/python3.6/site-packages (from torchvision==0.4.0)
    Requirement already satisfied: numpy in /home/ma-user/anaconda3/envs/Pytorch-1.0.0/lib/python3.6/site-packages (from torchvision==0.4.0)
    Requirement already satisfied: six in /home/ma-user/anaconda3/envs/Pytorch-1.0.0/lib/python3.6/site-packages (from torchvision==0.4.0)
    You are using pip version 9.0.1, however version 20.3.3 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.
    Requirement already satisfied: protobuf==3.9.0 in /home/ma-user/anaconda3/envs/Pytorch-1.0.0/lib/python3.6/site-packages
    Requirement already satisfied: setuptools in /home/ma-user/anaconda3/envs/Pytorch-1.0.0/lib/python3.6/site-packages (from protobuf==3.9.0)
    Requirement already satisfied: six>=1.9 in /home/ma-user/anaconda3/envs/Pytorch-1.0.0/lib/python3.6/site-packages (from protobuf==3.9.0)
    You are using pip version 9.0.1, however version 20.3.3 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.

    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function

    import tools._init_paths
    %matplotlib inline

    import tensorboardX as tb

    import datasets.imdb  # combined_roidb below refers to datasets.imdb.imdb
    from datasets.factory import get_imdb
    from model.train_val import get_training_roidb, train_net
    from model.config import cfg, cfg_from_file, cfg_from_list, get_output_dir, get_output_tb_dir
    import roi_data_layer.roidb as rdl_roidb
    from roi_data_layer.layer import RoIDataLayer
    import utils.timer

    import pickle

    import torch
    import torch.optim as optim
    from nets.vgg16 import vgg16
    import numpy as np
    import os
    import sys
    import glob
    import time

3. Building the neural network

3.1 Training hyperparameters

To shorten training time, we train on top of a pre-trained model. Here VGG16 is used as the backbone network of Faster R-CNN.

    imdb_name = "voc_2007_trainval"
    imdbval_name = "voc_2007_test"

    # Location of the pre-trained model
    weight = "./data/imagenet_weights/vgg16.pth"
    # Number of training iterations
    max_iters = 100
    # Location of the model cfg file
    cfg_file = './experiments/cfgs/vgg16.yml'
    set_cfgs = None

    if cfg_file is not None:
        cfg_from_file(cfg_file)
    if set_cfgs is not None:
        cfg_from_list(set_cfgs)

    print('Using config:')
    print(cfg)

    Using config:
    {'TRAIN': {'LEARNING_RATE': 0.001, 'MOMENTUM': 0.9, 'WEIGHT_DECAY': 0.0001, 'GAMMA': 0.1, 'STEPSIZE': [30000], 'DISPLAY': 20, 'DOUBLE_BIAS': True, 'TRUNCATED': False, 'BIAS_DECAY': False, 'USE_GT': False, 'ASPECT_GROUPING': False, 'SNAPSHOT_KEPT': 3, 'SUMMARY_INTERVAL': 180, 'SCALES': [600], 'MAX_SIZE': 1000, 'IMS_PER_BATCH': 1, 'BATCH_SIZE': 256, 'FG_FRACTION': 0.25, 'FG_THRESH': 0.5, 'BG_THRESH_HI': 0.5, 'BG_THRESH_LO': 0.0, 'USE_FLIPPED': True, 'BBOX_REG': True, 'BBOX_THRESH': 0.5, 'SNAPSHOT_ITERS': 5000, 'SNAPSHOT_PREFIX': 'vgg16_faster_rcnn', 'BBOX_NORMALIZE_TARGETS': True, 'BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0], 'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True, 'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0], 'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2], 'PROPOSAL_METHOD': 'gt', 'HAS_RPN': True, 'RPN_POSITIVE_OVERLAP': 0.7, 'RPN_NEGATIVE_OVERLAP': 0.3, 'RPN_CLOBBER_POSITIVES': False, 'RPN_FG_FRACTION': 0.5, 'RPN_BATCHSIZE': 256, 'RPN_NMS_THRESH': 0.7, 'RPN_PRE_NMS_TOP_N': 12000, 'RPN_POST_NMS_TOP_N': 2000, 'RPN_BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0], 'RPN_POSITIVE_WEIGHT': -1.0, 'USE_ALL_GT': True}, 'TEST': {'SCALES': [600], 'MAX_SIZE': 1000, 'NMS': 0.3, 'SVM': False, 'BBOX_REG': True, 'HAS_RPN': True, 'PROPOSAL_METHOD': 'gt', 'RPN_NMS_THRESH': 0.7, 'RPN_PRE_NMS_TOP_N': 6000, 'RPN_POST_NMS_TOP_N': 300, 'MODE': 'nms', 'RPN_TOP_N': 5000}, 'RESNET': {'MAX_POOL': False, 'FIXED_BLOCKS': 1}, 'MOBILENET': {'REGU_DEPTH': False, 'FIXED_LAYERS': 5, 'WEIGHT_DECAY': 4e-05, 'DEPTH_MULTIPLIER': 1.0}, 'PIXEL_MEANS': array([[[102.9801, 115.9465, 122.7717]]]), 'RNG_SEED': 3, 'ROOT_DIR': '/home/ma-user/work', 'DATA_DIR': '/home/ma-user/work/data', 'MATLAB': 'matlab', 'EXP_DIR': 'vgg16', 'USE_GPU_NMS': True, 'POOLING_MODE': 'align', 'POOLING_SIZE': 7, 'ANCHOR_SCALES': [8, 16, 32], 'ANCHOR_RATIOS': [0.5, 1, 2], 'RPN_CHANNELS': 512}

3.2 Defining the dataset loading function

The dataset annotations are in PASCAL VOC format.

    def combined_roidb(imdb_names):

        def get_roidb(imdb_name):
            # Load the dataset
            imdb = get_imdb(imdb_name)
            print('Loaded dataset `{:s}` for training'.format(imdb.name))
            # Use ground truth as the proposal method
            imdb.set_proposal_method(cfg.TRAIN.PROPOSAL_METHOD)
            print('Set proposal method: {:s}'.format(cfg.TRAIN.PROPOSAL_METHOD))
            roidb = get_training_roidb(imdb)
            return roidb

        roidbs = [get_roidb(s) for s in imdb_names.split('+')]
        roidb = roidbs[0]
        if len(roidbs) > 1:
            for r in roidbs[1:]:
                roidb.extend(r)
            tmp = get_imdb(imdb_names.split('+')[1])
            imdb = datasets.imdb.imdb(imdb_names, tmp.classes)
        else:
            imdb = get_imdb(imdb_names)
        return imdb, roidb

3.3 Setting the training parameters

    np.random.seed(cfg.RNG_SEED)

    # Load the training dataset
    imdb, roidb = combined_roidb(imdb_name)
    print('{:d} roidb entries'.format(len(roidb)))

    # Set the output directory
    output_dir = get_output_dir(imdb, None)
    print('Output will be saved to `{:s}`'.format(output_dir))

    # Set the log directory
    tb_dir = get_output_tb_dir(imdb, None)
    print('TensorFlow summaries will be saved to `{:s}`'.format(tb_dir))

    # Load the validation dataset (without horizontally flipped copies)
    orgflip = cfg.TRAIN.USE_FLIPPED
    cfg.TRAIN.USE_FLIPPED = False
    _, valroidb = combined_roidb(imdbval_name)
    print('{:d} validation roidb entries'.format(len(valroidb)))
    cfg.TRAIN.USE_FLIPPED = orgflip

    # Create the backbone network
    # This case uses VGG16; other architectures such as ResNet can be tried as well
    net = vgg16()

    Loaded dataset `voc_2007_trainval` for training
    Set proposal method: gt
    Appending horizontally-flipped training examples...
    voc_2007_trainval gt roidb loaded from /home/ma-user/work/data/cache/voc_2007_trainval_gt_roidb.pkl
    done
    Preparing training data...
    done
    10022 roidb entries
    Output will be saved to `/home/ma-user/work/output/vgg16/voc_2007_trainval/default`
    TensorFlow summaries will be saved to `/home/ma-user/work/tensorboard/vgg16/voc_2007_trainval/default`
    Loaded dataset `voc_2007_test` for training
    Set proposal method: gt
    Preparing training data...
    voc_2007_test gt roidb loaded from /home/ma-user/work/data/cache/voc_2007_test_gt_roidb.pkl
    done
    4952 validation roidb entries

    from model.train_val import filter_roidb, SolverWrapper

    # Filter the ROIs and drop invalid entries
    roidb = filter_roidb(roidb)
    valroidb = filter_roidb(valroidb)

    sw = SolverWrapper(
        net,
        imdb,
        roidb,
        valroidb,
        output_dir,
        tb_dir,
        pretrained_model=weight)

    print('Solving...')

    Filtered 0 roidb entries: 10022 -> 10022
    Filtered 0 roidb entries: 4952 -> 4952
    Solving...

    # Show all attributes of the solver
    sw.__dict__.keys()

    dict_keys(['net', 'imdb', 'roidb', 'valroidb', 'output_dir', 'tbdir', 'tbvaldir', 'pretrained_model'])

    # sw.net is the backbone network
    print(sw.net)

    vgg16()

3.4 Defining the network structure

The network is built with PyTorch. For implementation details, see the source code in the corresponding folders.

    # Build the network: add the ROI data layers
    sw.data_layer = RoIDataLayer(sw.roidb, sw.imdb.num_classes)
    sw.data_layer_val = RoIDataLayer(sw.valroidb, sw.imdb.num_classes, random=True)

    # Build the network: add the ROI and classifier parts on top of VGG16
    lr, train_op = sw.construct_graph()

    # Look for previous snapshots
    lsf, nfiles, sfiles = sw.find_previous()

    # Snapshots allow training to resume from a checkpoint;
    # if one exists it is loaded and training continues from it
    if lsf == 0:
        lr, last_snapshot_iter, stepsizes, np_paths, ss_paths = sw.initialize()
    else:
        lr, last_snapshot_iter, stepsizes, np_paths, ss_paths = sw.restore(str(sfiles[-1]), str(nfiles[-1]))
    iter = last_snapshot_iter + 1
    last_summary_time = time.time()
    # Continue training from the previous state
    stepsizes.append(max_iters)
    stepsizes.reverse()
    next_stepsize = stepsizes.pop()
    # Switch the network to training mode
    print("Network structure:")
    sw.net.train()
    sw.net.to(sw.net._device)

    Restoring model snapshots from /home/ma-user/work/output/vgg16/voc_2007_trainval/default/vgg16_faster_rcnn_iter_100.pth
    Restored.
    Network structure:
    vgg16(
      (vgg): VGG(
        (features): Sequential(
          (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): ReLU(inplace=True)
          (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (3): ReLU(inplace=True)
          (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
          (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (6): ReLU(inplace=True)
          (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (8): ReLU(inplace=True)
          (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
          (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (11): ReLU(inplace=True)
          (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (13): ReLU(inplace=True)
          (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (15): ReLU(inplace=True)
          (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
          (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (18): ReLU(inplace=True)
          (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (20): ReLU(inplace=True)
          (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (22): ReLU(inplace=True)
          (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
          (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (25): ReLU(inplace=True)
          (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (27): ReLU(inplace=True)
          (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (29): ReLU(inplace=True)
          (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
        )
        (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
        (classifier): Sequential(
          (0): Linear(in_features=25088, out_features=4096, bias=True)
          (1): ReLU(inplace=True)
          (2): Dropout(p=0.5, inplace=False)
          (3): Linear(in_features=4096, out_features=4096, bias=True)
          (4): ReLU(inplace=True)
          (5): Dropout(p=0.5, inplace=False)
        )
      )
      (rpn_net): Conv2d(512, 512, kernel_size=[3, 3], stride=(1, 1), padding=(1, 1))
      (rpn_cls_score_net): Conv2d(512, 18, kernel_size=[1, 1], stride=(1, 1))
      (rpn_bbox_pred_net): Conv2d(512, 36, kernel_size=[1, 1], stride=(1, 1))
      (cls_score_net): Linear(in_features=4096, out_features=21, bias=True)
      (bbox_pred_net): Linear(in_features=4096, out_features=84, bias=True)
    )

3.5 Training

    while iter < max_iters + 1:
        if iter == next_stepsize + 1:
            # Save a snapshot at the step boundary and decay the learning rate.
            # Note: scale_lr is not imported in this notebook and must be in scope
            # for this branch; with max_iters = 100 and STEPSIZE = [30000] the
            # branch is never reached.
            sw.snapshot(iter)
            lr *= cfg.TRAIN.GAMMA
            scale_lr(sw.optimizer, cfg.TRAIN.GAMMA)
            next_stepsize = stepsizes.pop()

        utils.timer.timer.tic()
        # Pass the data through the ROI data layer (forward computation)
        blobs = sw.data_layer.forward()

        now = time.time()
        if iter == 1 or now - last_summary_time > cfg.TRAIN.SUMMARY_INTERVAL:
            # Compute the losses and run one training step, with summaries
            rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, total_loss, summary = \
                sw.net.train_step_with_summary(blobs, sw.optimizer)
            for _sum in summary:
                sw.writer.add_summary(_sum, float(iter))
            # Run the validation data layer
            blobs_val = sw.data_layer_val.forward()
            summary_val = sw.net.get_summary(blobs_val)
            for _sum in summary_val:
                sw.valwriter.add_summary(_sum, float(iter))
            last_summary_time = now
        else:
            rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, total_loss = \
                sw.net.train_step(blobs, sw.optimizer)
        utils.timer.timer.toc()

        if iter % (cfg.TRAIN.DISPLAY) == 0:
            print('iter: %d / %d, total loss: %.6f\n >>> rpn_loss_cls: %.6f\n '
                  '>>> rpn_loss_box: %.6f\n >>> loss_cls: %.6f\n >>> loss_box: %.6f\n >>> lr: %f' % \
                  (iter, max_iters, total_loss, rpn_loss_cls, rpn_loss_box, loss_cls, loss_box, lr))
            print('speed: {:.3f}s / iter'.format(
                utils.timer.timer.average_time()))

        # Save a snapshot
        if iter % cfg.TRAIN.SNAPSHOT_ITERS == 0:
            last_snapshot_iter = iter
            ss_path, np_path = sw.snapshot(iter)
            np_paths.append(np_path)
            ss_paths.append(ss_path)

            # Remove surplus snapshots
            if len(np_paths) > cfg.TRAIN.SNAPSHOT_KEPT:
                sw.remove_snapshot(np_paths, ss_paths)

        iter += 1

    if last_snapshot_iter != iter - 1:
        sw.snapshot(iter - 1)

    sw.writer.close()
    sw.valwriter.close()

4. Testing

In this part we run inference with the trained model.

    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function

    %matplotlib inline

    # Add lib to the import path
    import tools._init_paths

    from model.config import cfg
    from model.test import im_detect
    from torchvision.ops import nms

    from utils.timer import Timer
    import matplotlib.pyplot as plt
    import numpy as np
    import os, cv2
    import argparse

    from nets.vgg16 import vgg16
    from nets.resnet_v1 import resnetv1
    from model.bbox_transform import clip_boxes, bbox_transform_inv

    import torch

4.1 Parameter definitions

    # PASCAL VOC classes
    CLASSES = ('__background__',
               'aeroplane', 'bicycle', 'bird', 'boat',
               'bottle', 'bus', 'car', 'cat', 'chair',
               'cow', 'diningtable', 'dog', 'horse',
               'motorbike', 'person', 'pottedplant',
               'sheep', 'sofa', 'train', 'tvmonitor')

    # Model file name patterns
    NETS = {'vgg16': ('vgg16_faster_rcnn_iter_%d.pth',), 'res101': ('res101_faster_rcnn_iter_%d.pth',)}
    # Dataset names
    DATASETS = {'pascal_voc': ('voc_2007_trainval',), 'pascal_voc_0712': ('voc_2007_trainval+voc_2012_trainval',)}

4.2 Drawing the results

Draw the predicted labels and bounding boxes on the original image.

    def vis_detections(im, class_dets, thresh=0.5):
        """Draw detected bounding boxes."""
        im = im[:, :, (2, 1, 0)]  # BGR (OpenCV) to RGB (matplotlib)
        fig, ax = plt.subplots(figsize=(12, 12))
        ax.imshow(im, aspect='equal')
        for class_name in class_dets:
            dets = class_dets[class_name]
            # Keep only detections whose score reaches the threshold
            inds = np.where(dets[:, -1] >= thresh)[0]
            if len(inds) == 0:
                continue

            for i in inds:
                bbox = dets[i, :4]
                score = dets[i, -1]

                ax.add_patch(
                    plt.Rectangle((bbox[0], bbox[1]),
                                  bbox[2] - bbox[0],
                                  bbox[3] - bbox[1], fill=False,
                                  edgecolor='red', linewidth=3.5)
                )
                ax.text(bbox[0], bbox[1] - 2,
                        '{:s} {:.3f}'.format(class_name, score),
                        bbox=dict(facecolor='blue', alpha=0.5),
                        fontsize=14, color='white')

        plt.axis('off')
        plt.tight_layout()
        plt.draw()

4.3 Preparing the test images

The test images go in the test folder. Two images are provided for testing; you can also upload your own test data with the notebook's upload button. Note that the test data must be images and must be placed in the test folder.

    test_file = "./test"

4.4 Model inference

Here we load a model that was trained in advance; you can also choose the model trained in this case.

    import cv2
    from utils.timer import Timer
    from model.test import im_detect
    from torchvision.ops import nms

    cfg.TEST.HAS_RPN = True  # Use RPN for proposals

    # Model location
    # Here we load a model trained for 110000 iterations; you can point this to your own trained model
    saved_model = "./models/vgg16-voc0712/vgg16_faster_rcnn_iter_110000.pth"
    print('trying to load weights from ', saved_model)

    # Load the backbone
    net = vgg16()
    # Build the network
    net.create_architecture(21, tag='default', anchor_scales=[8, 16, 32])
    # Load the weights
    net.load_state_dict(torch.load(saved_model, map_location=lambda storage, loc: storage))

    net.eval()
    # Move the network to the inference device
    net.to(net._device)

    print('Loaded network {:s}'.format(saved_model))

    for file in os.listdir(test_file):
        # Skip hidden "._" resource files
        if not file.startswith("._"):
            file_path = os.path.join(test_file, file)
            print(file_path)
            # Read the test image
            im = cv2.imread(file_path)

            # Create a timer
            timer = Timer()
            timer.tic()
            # Detect the ROIs in the image
            scores, boxes = im_detect(net, im)
            print(scores.shape, boxes.shape)
            timer.toc()
            print('Detection took {:.3f}s for {:d} object proposals'.format(timer.total_time(), boxes.shape[0]))

            # Thresholds
            CONF_THRESH = 0.7
            NMS_THRESH = 0.3

            cls_dets = {}

            # Non-maximum suppression (NMS) to filter the bounding boxes
            for cls_ind, cls in enumerate(CLASSES[1:]):
                cls_ind += 1  # skip background
                cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]
                cls_scores = scores[:, cls_ind]
                dets = np.hstack((cls_boxes,
                                  cls_scores[:, np.newaxis])).astype(np.float32)
                keep = nms(torch.from_numpy(cls_boxes), torch.from_numpy(cls_scores), NMS_THRESH)
                dets = dets[keep.numpy(), :]

                if len(dets) > 0:
                    if cls in cls_dets:
                        cls_dets[cls] = np.vstack([cls_dets[cls], dets])
                    else:
                        cls_dets[cls] = dets
            vis_detections(im, cls_dets, thresh=CONF_THRESH)

            plt.show()

    trying to load weights from  ./models/vgg16-voc0712/vgg16_faster_rcnn_iter_110000.pth
    Loaded network ./models/vgg16-voc0712/vgg16_faster_rcnn_iter_110000.pth
    ./test/test_image_1.jpg
    (300, 21) (300, 84)
    Detection took 0.042s for 300 object proposals
    ./test/test_image_0.jpg
    (300, 21) (300, 84)
    Detection took 0.039s for 300 object proposals
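The per-class NMS step in section 4.4 relies on `torchvision.ops.nms`. To show what that step does, here is a minimal NumPy sketch of greedy non-maximum suppression. It is an illustration under stated assumptions, not the torchvision implementation: the helper names `iou_one_to_many` and `greedy_nms` are hypothetical, boxes are assumed to be in (x1, y1, x2, y2) format, and a box is dropped when its IoU with an already kept box exceeds the threshold.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Width and height of the intersection are clipped at zero for disjoint boxes
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def greedy_nms(boxes, scores, iou_thresh):
    """Return indices of kept boxes, ordered by decreasing score."""
    order = np.argsort(-scores)  # highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou_one_to_many(boxes[best], boxes[rest])
        # Drop every remaining box that overlaps the kept box too much
        order = rest[overlaps <= iou_thresh]
    return keep


# Two heavily overlapping boxes and one separate box (made-up values)
boxes = np.array([[10, 10, 110, 110],
                  [12, 12, 112, 112],
                  [200, 200, 300, 300]], dtype=np.float32)
scores = np.array([0.9, 0.8, 0.95], dtype=np.float32)

kept = greedy_nms(boxes, scores, iou_thresh=0.3)
print(kept)
```

With the threshold of 0.3 used as NMS_THRESH in section 4.4, the lower-scoring of the two overlapping boxes is suppressed and the separate box is kept. In the tutorial, the score threshold CONF_THRESH is applied afterwards, inside vis_detections.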
  • [Technical Article] Instance Segmentation: Mask R-CNN Model
Instance Segmentation: Mask R-CNN Model

In this case we learn how to train and test the instance segmentation model Mask R-CNN. In computer vision, instance segmentation is the task of identifying each individual object instance in an image and labeling every instance at the pixel level. Instance segmentation is widely used in autonomous driving, medical imaging, high-precision GIS recognition, 3D modeling assistance, and other fields.

This case briefly introduces Mask R-CNN, a classic model in instance segmentation, and uses Matterport's open-source Mask R-CNN implementation to show how to train a Mask R-CNN model on Huawei Cloud ModelArts.

See also: detailed explanation of the Mask R-CNN model.

Notes:

Framework used in this case: TensorFlow-1.13.1
Hardware used in this case: 8 vCPU + 64 GiB + 1 x Tesla V100-PCIE-32GB
Entering the runtime environment: follow the link to AI Gallery and click the Run in ModelArts button to enter the ModelArts runtime environment. To use a GPU, switch to it in the workspace panel on the right side of the ModelArts JupyterLab interface.
Running the code: click the triangular run button in the menu bar at the top of the page, or press Ctrl+Enter, to run the code in each cell.
JupyterLab usage: see the "ModelArts JupyterLab User Guide".
Troubleshooting: see "Solutions to Common ModelArts JupyterLab Problems".

1. Installing and importing packages

    !pip install pycocotools==2.0.0

    Collecting pycocotools==2.0.0
      Downloading http://repo.myhuaweicloud.com/repository/pypi/packages/96/84/9a07b1095fd8555ba3f3d519517c8743c2554a245f9476e5e39869f948d2/pycocotools-2.0.0.tar.gz (1.5MB)
        100% |████████████████████████████████| 1.5MB 52.3MB/s ta 0:00:01
    Building wheels for collected packages: pycocotools
      Running setup.py bdist_wheel for pycocotools ... done
      Stored in directory: /home/ma-user/.cache/pip/wheels/63/72/9e/bac3d3e23f6b04351d200fa892351da57f0e68c7aeec0b1b08
    Successfully built pycocotools
    Installing collected packages: pycocotools
    Successfully installed pycocotools-2.0.0
    You are using pip version 9.0.1, however version 21.0.1 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.

    !pip install imgaug==0.2.9

    Collecting imgaug==0.2.9
      Downloading http://repo.myhuaweicloud.com/repository/pypi/packages/17/a9/36de8c0e1ffb2d86f871cac60e5caa910cbbdb5f4741df5ef856c47f4445/imgaug-0.2.9-py2.py3-none-any.whl (753kB)
        100% |████████████████████████████████| 757kB 83.4MB/s ta 0:00:01
    Requirement already satisfied: numpy>=1.15.0 in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from imgaug==0.2.9)
    Requirement already satisfied: opencv-python in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from imgaug==0.2.9)
    Requirement already satisfied: matplotlib in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from imgaug==0.2.9)
    Requirement already satisfied: Pillow in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from imgaug==0.2.9)
    Requirement already satisfied: scikit-image>=0.11.0 in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from imgaug==0.2.9)
    Requirement already satisfied: six in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from imgaug==0.2.9)
    Collecting Shapely (from imgaug==0.2.9)
      Downloading http://repo.myhuaweicloud.com/repository/pypi/packages/9d/18/557d4f55453fe00f59807b111cc7b39ce53594e13ada88e16738fb4ff7fb/Shapely-1.7.1-cp36-cp36m-manylinux1_x86_64.whl (1.0MB)
        100% |████████████████████████████████| 1.0MB 40.5MB/s ta 0:00:01
    Requirement already satisfied: imageio in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from imgaug==0.2.9)
    Requirement already satisfied: scipy in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from imgaug==0.2.9)
    Requirement already satisfied: python-dateutil>=2.1 in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from
matplotlib->imgaug==0.2.9) Requirement already satisfied: pytz in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from matplotlib->imgaug==0.2.9) Requirement already satisfied: kiwisolver>=1.0.1 in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from matplotlib->imgaug==0.2.9) Requirement already satisfied: cycler>=0.10 in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from matplotlib->imgaug==0.2.9) Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from matplotlib->imgaug==0.2.9) Requirement already satisfied: cloudpickle>=0.2.1 in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from scikit-image>=0.11.0->imgaug==0.2.9) Requirement already satisfied: PyWavelets>=0.4.0 in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from scikit-image>=0.11.0->imgaug==0.2.9) Requirement already satisfied: networkx>=1.8 in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from scikit-image>=0.11.0->imgaug==0.2.9) Requirement already satisfied: decorator>=4.1.0 in /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages (from networkx>=1.8->scikit-image>=0.11.0->imgaug==0.2.9) Installing collected packages: Shapely, imgaug Found existing installation: imgaug 0.2.6 Uninstalling imgaug-0.2.6: Successfully uninstalled imgaug-0.2.6 Successfully installed Shapely-1.7.1 imgaug-0.2.9 You are using pip version 9.0.1, however version 21.0.1 is available. 
You should consider upgrading via the 'pip install --upgrade pip' command.

2. Download the required code and data

    import os
    from modelarts.session import Session

    session = Session()
    if session.region_name == 'cn-north-1':
        bucket_path = "modelarts-labs/end2end/mask_rcnn/instance_segmentation.tar.gz"
    elif session.region_name == 'cn-north-4':
        bucket_path = "modelarts-labs-bj4/end2end/mask_rcnn/instance_segmentation.tar.gz"
    else:
        # No bucket is defined for other regions, so skip the download below
        bucket_path = None
        print("Please switch the region to Beijing 1 (cn-north-1) or Beijing 4 (cn-north-4)")

    if bucket_path and not os.path.exists('./src/mrcnn'):
        session.download_data(bucket_path=bucket_path, path='./instance_segmentation.tar.gz')

    if os.path.exists('./instance_segmentation.tar.gz'):
        # Extract the resource package with tar
        os.system("tar zxf ./instance_segmentation.tar.gz")
        # Remove the archive
        os.system("rm ./instance_segmentation.tar.gz")

Successfully download file modelarts-labs-bj4/end2end/mask_rcnn/instance_segmentation.tar.gz from OBS to local ./instance_segmentation.tar.gz

3. Mask R-CNN model training

3.1 Step 1: import the required Python libraries and prepare the pretrained model

    import sys
    import random
    import math
    import re
    import time
    import numpy as np
    import cv2
    import matplotlib
    import matplotlib.pyplot as plt

    from src.mrcnn.config import Config
    from src.mrcnn import utils
    import src.mrcnn.model as modellib
    from src.mrcnn import visualize
    from src.mrcnn.model import log

    %matplotlib inline

    # Directory to save logs and trained model
    MODEL_DIR = "logs"

    # Local path to trained weights file
    COCO_MODEL_PATH = "data/mask_rcnn_coco.h5"

/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)]) /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint16 = np.dtype([("qint16", np.int16, 1)]) /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_quint16 = np.dtype([("quint16", np.uint16, 1)]) /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint32 = np.dtype([("qint32", np.int32, 1)]) /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. 
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Using TensorFlow backend.

3.2 Step 2: generate the configuration

We define MyTrainConfig, a subclass of Config, and set the relevant parameters. The key ones are:

- NAME: a unique name for the Config
- NUM_CLASSES: the number of classes; COCO has 80 object classes plus background
- IMAGE_MIN_DIM and IMAGE_MAX_DIM: the minimum and maximum image dimensions; both are set to 1024 here, so every image is resized to 1024x1024
- TRAIN_ROIS_PER_IMAGE: the number of RoIs trained per image
- STEPS_PER_EPOCH and VALIDATION_STEPS: the number of steps per epoch for training and validation; fewer steps speed up training but lower detection accuracy

    class MyTrainConfig(Config):
        # Recognizable name
        NAME = "my_train"

        # Number of GPUs and images per GPU; adjust to your hardware (reference: Nvidia Tesla P100)
        GPU_COUNT = 1
        IMAGES_PER_GPU = 1

        # Number of classes: COCO has 80 object classes plus background
        NUM_CLASSES = 1 + 80  # background + 80 object classes

        # Images are uniformly resized to 1024; this can be reduced further if needed
        IMAGE_MIN_DIM = 1024
        IMAGE_MAX_DIM = 1024

        # Smaller anchors can be used for RoI detection when the images are small.
        # Left commented out here, so the default anchor scales apply.
        # RPN_ANCHOR_SCALES = (8, 16, 32, 64, 128)  # anchor side in pixels

        # Number of RoIs trained per image. For small images with few objects this
        # can be lowered, since fewer anchors already cover the objects roughly.
        TRAIN_ROIS_PER_IMAGE = 200

        # Number of training steps per epoch
        STEPS_PER_EPOCH = 100

        # Number of validation steps per epoch
        VALIDATION_STEPS = 20

    config = MyTrainConfig()
    config.display()

Configurations:
BACKBONE                       resnet101
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     1
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.7
DETECTION_NMS_THRESHOLD        0.3
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 1
IMAGE_CHANNEL_COUNT            3
IMAGE_MAX_DIM                  1024
IMAGE_META_SIZE                93
IMAGE_MIN_DIM                  1024
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [1024 1024 3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               100
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           my_train
NUM_CLASSES                    81
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
PRE_NMS_LIMIT                  6000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.7
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                100
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           200
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               20
WEIGHT_DECAY                   0.0001

3.3 Step 3: prepare the dataset

We use the ready-made CocoDataset class to build the training set and the validation set.

    from src.mrcnn.coco import CocoDataset

    COCO_DIR = 'data'

    # Build the training set
    dataset_train = CocoDataset()
    dataset_train.load_coco(COCO_DIR, "train")  # load the training data
    dataset_train.prepare()

loading annotations into memory...
Done (t=0.04s)
creating index...
index created!

    # Build the validation set
    dataset_val = CocoDataset()
    dataset_val.load_coco(COCO_DIR, "val")  # load the validation data
    dataset_val.prepare()

loading annotations into memory...
Done (t=0.17s)
creating index...
index created!

4. Create the model

4.1 Step 1: create the model object in "training" mode, to be trained on the dataset prepared above

    model = modellib.MaskRCNN(mode="training", config=config, model_dir=MODEL_DIR)

WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
[DEBUG] <__main__.MyTrainConfig object at 0x7f9b6edc7c50>
[DEBUG] Tensor("rpn_class/concat:0", shape=(?, ?, 2), dtype=float32) Tensor("rpn_bbox_1/concat:0", shape=(?, ?, 4), dtype=float32) <tf.Variable 'anchors/Variable:0' shape=(1, 261888, 4) dtype=float32_ref>

4.2 Step 2: load the weights of the pretrained model

    model.load_weights(COCO_MODEL_PATH, by_name=True)

Next, we start from the pretrained model and train it on the COCO-format dataset prepared above.

5. Train the model

A Keras model is built from named layers, and the layers argument of the model's train method selects which layers to train. It accepts the following preset values:

- heads: train only the classification, mask, and bbox regression branches of the head network
- all: all layers
- 3+: train ResNet stage 3 and later stages
- 4+: train ResNet stage 4 and later stages
- 5+: train ResNet stage 5 and later stages

The layers argument also accepts a regular expression that selects layers by name. Call model.keras_model.summary() to list the layer names, then specify the layers you want to train.

The step below trains all layers for 1 epoch, which takes about 4 minutes.

    model.train(dataset_train, dataset_val,
                learning_rate=config.LEARNING_RATE,
                epochs=1,
                layers='all')

    model_savepath = 'my_mrcnn_model.h5'
    model.keras_model.save_weights(model_savepath)

Starting at epoch 0. LR=0.001
Checkpoint Path: logs/my_train20210309T1458/mask_rcnn_my_train_{epoch:04d}.h5
WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py:110: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/keras/engine/training_generator.py:47: UserWarning: Using a generator with `use_multiprocessing=True` and multiple workers may duplicate your data. Please consider using the`keras.utils.Sequence class.
  UserWarning('Using a generator with `use_multiprocessing=True`'
Epoch 1/1
100/100 [==============================] - 111s 1s/step - loss: 0.4283 - rpn_class_loss: 0.0090 - rpn_bbox_loss: 0.0787 - mrcnn_class_loss: 0.0627 - mrcnn_bbox_loss: 0.0758 - mrcnn_mask_loss: 0.2021 - val_loss: 0.4290 - val_rpn_class_loss: 0.0100 - val_rpn_bbox_loss: 0.1086 - val_mrcnn_class_loss: 0.0920 - val_mrcnn_bbox_loss: 0.0539 - val_mrcnn_mask_loss: 0.1645

6. Detect objects in images with Mask R-CNN

6.1 Step 1: define InferenceConfig and create the model object in "inference" mode

    class InferenceConfig(MyTrainConfig):
        GPU_COUNT = 1
        IMAGES_PER_GPU = 1

    inference_config = InferenceConfig()

    inference_model = modellib.MaskRCNN(mode="inference",
                                        config=inference_config,
                                        model_dir=MODEL_DIR)

[DEBUG] <__main__.InferenceConfig object at 0x7f9681f59710>
[DEBUG] Tensor("rpn_class_1/concat:0", shape=(?, ?, 2), dtype=float32) Tensor("rpn_bbox_3/concat:0", shape=(?, ?, 4), dtype=float32) Tensor("input_anchors:0", shape=(?, ?, 4), dtype=float32)
WARNING:tensorflow:From /home/ma-user/work/case_dev/mask_rcnn/src/mrcnn/model.py:772: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.

Load the model weights we just produced:

    # Load the weights of the model we trained ourselves
    print("Loading weights from ", model_savepath)
    inference_model.load_weights(model_savepath, by_name=True)

Loading weights from my_mrcnn_model.h5

6.2 Step 2: pick a random image from the validation set, run prediction, and display the result

    # Pick a random image for testing
    image_id = random.choice(dataset_val.image_ids)

    original_image, image_meta, gt_class_id, gt_bbox, gt_mask =\
        modellib.load_image_gt(dataset_val, inference_config,
                               image_id, use_mini_mask=False)

    log("original_image", original_image)
    log("image_meta", image_meta)
    log("gt_class_id", gt_class_id)
    log("gt_bbox", gt_bbox)
    log("gt_mask", gt_mask)

    det_instances_savepath = 'random.det_instances.jpg'
    visualize.display_instances(original_image, gt_bbox, gt_mask, gt_class_id,
                                dataset_train.class_names, figsize=(8, 8),
                                save_path=det_instances_savepath)

original_image           shape: (1024, 1024, 3)       min:    0.00000  max:  255.00000  uint8
image_meta               shape: (93,)                 min:    0.00000  max: 1024.00000  float64
gt_class_id              shape: (17,)                 min:    1.00000  max:   74.00000  int32
gt_bbox                  shape: (17, 4)               min:    1.00000  max: 1024.00000  int32
gt_mask                  shape: (1024, 1024, 17)      min:    0.00000  max:    1.00000  bool

    # Helper that sets up the rows and columns of matplotlib subplots
    def get_ax(rows=1, cols=1, size=8):
        _, ax = plt.subplots(rows, cols, figsize=(size*cols, size*rows))
        return ax

    results = inference_model.detect([original_image], verbose=1)

    r = results[0]
    prediction_savepath = 'random.prediction.jpg'
    visualize.display_instances(original_image, r['rois'], r['masks'], r['class_ids'],
                                dataset_val.class_names, r['scores'], ax=get_ax(),
                                save_path=prediction_savepath)

Processing 1 images
image                    shape: (1024, 1024, 3)       min:    0.00000  max:  255.00000  uint8
molded_images            shape: (1, 1024, 1024, 3)    min: -123.70000  max:  151.10000  float64
image_metas              shape: (1, 93)               min:    0.00000  max: 1024.00000  int64
anchors                  shape: (1, 261888, 4)        min:   -0.35390  max:    1.29134  float32

6.3 Step 3: test other images

The data/val2014 directory contains many test images. Change the file name on the right-hand side of the test_path variable in the code below to try a different image and see how well it is predicted.

    test_path = './data/val2014/COCO_val2014_000000019176.jpg'

    import skimage.io

    image = skimage.io.imread(test_path)

    results = inference_model.detect([image], verbose=1)

    r = results[0]
    prediction_savepath = 'self.prediction.jpg'
    visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                                dataset_val.class_names, r['scores'], ax=get_ax(),
                                save_path=prediction_savepath)

Processing 1 images
image                    shape: (480, 640, 3)         min:    0.00000  max:  255.00000  uint8
molded_images            shape: (1, 1024, 1024, 3)    min: -123.70000  max:  151.10000  float64
image_metas              shape: (1, 93)               min:    0.00000  max: 1024.00000  float64
anchors                  shape: (1, 261888, 4)        min:   -0.35390  max:    1.29134  float32

7. Evaluate the model

In this step we run a simple evaluation of the model we trained by computing its mAP (mean Average Precision).

    # Compute VOC-style mAP at IoU=0.5.
    # This example evaluates only 10 images; using more images gives a more reliable estimate.
    image_ids = np.random.choice(dataset_val.image_ids, 10)
    APs = []
    for image_id in image_ids:
        # Load image and ground truth data
        image, image_meta, gt_class_id, gt_bbox, gt_mask =\
            modellib.load_image_gt(dataset_val, inference_config,
                                   image_id, use_mini_mask=False)
        molded_images = np.expand_dims(modellib.mold_image(image, inference_config), 0)
        # Run object detection
        results = inference_model.detect([image], verbose=0)
        r = results[0]
        # Compute AP
        AP, precisions, recalls, overlaps =\
            utils.compute_ap(gt_bbox, gt_class_id, gt_mask,
                             r["rois"], r["class_ids"], r["scores"], r['masks'])
        APs.append(AP)

    print("mAP: ", np.mean(APs))

mAP: 0.6203394930987131

This concludes the case.
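To show what the per-image AP in the evaluation step measures, here is a minimal NumPy sketch of a VOC-style average precision computed from a list of detections. It is an illustration under stated assumptions, not the body of utils.compute_ap: the function name average_precision and the sample inputs are hypothetical, and the step that matches each detection to a ground-truth instance at IoU >= 0.5 is assumed to have been done already.

```python
import numpy as np


def average_precision(is_true_positive, num_gt):
    """AP for one image, given detections sorted by descending score.

    is_true_positive[i] is True when detection i was matched to a not-yet-used
    ground-truth instance of the same class with IoU >= 0.5 (matching not shown).
    """
    tp = np.asarray(is_true_positive, dtype=bool)
    if num_gt == 0:
        return 0.0
    cum_tp = np.cumsum(tp)
    precisions = cum_tp / (np.arange(len(tp)) + 1)
    recalls = cum_tp / num_gt
    # Pad so the curve starts at recall 0 and ends at recall 1
    precisions = np.concatenate([[0.0], precisions, [0.0]])
    recalls = np.concatenate([[0.0], recalls, [1.0]])
    # Make precision monotonically non-increasing from left to right
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangle areas at every point where recall changes
    idx = np.where(recalls[1:] != recalls[:-1])[0] + 1
    return float(np.sum((recalls[idx] - recalls[idx - 1]) * precisions[idx]))


# Hypothetical image A: 3 detections, the second is a false positive, 2 ground-truth objects
ap_a = average_precision([True, False, True], num_gt=2)
# Hypothetical image B: both detections are correct
ap_b = average_precision([True, True], num_gt=2)
mean_ap = float(np.mean([ap_a, ap_b]))
print(ap_a, ap_b, mean_ap)
```

Averaging the per-image AP values, as the loop above does with np.mean(APs), gives the reported mAP. With only 10 sampled images the estimate can vary noticeably between runs.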
  • [Technical Tips] Rebar Counting with Computer Vision
    Counting rebar with a camera

Case overview

Construction sites across China use large quantities of rebar every year. When a truckload of rebar arrives on site, workers have to take inventory, usually by counting the bars one by one by hand, which is slow and laborious. To make this more efficient, the industry has proposed photographing the rebar and using an AI algorithm to detect the number of bars in the picture. Practice has shown that this approach is both accurate and far more efficient.

This case uses an object detection method. The AI model is trained on 250 manually annotated images for 25 minutes, after which it can detect the rebar cross-sections in an image and so count the bars.

Notes

- Recommended AI framework for this case: Pytorch-1.0.0;
- How to enter the runtime environment: follow the link to AI Gallery and click the Run in ModelArts button to enter the ModelArts runtime environment. If you need a GPU, see the "ModelArts JupyterLab Hardware Specification Guide" for how to switch hardware specifications;
- If this is your first time using JupyterLab, see the "ModelArts JupyterLab User Guide";
- If you hit errors while using JupyterLab, see "ModelArts JupyterLab FAQ and Troubleshooting".

Steps

1. Start of the rebar counting case: download the code and dataset

    import os
    if not os.path.exists('./rebar_count'):
        print('Downloading code and datasets...')
        os.system("wget -N https://modelarts-labs-bj4-v2.obs.cn-north-4.myhuaweicloud.com/notebook/DL_rebar_count/rebar_count_code.zip")
        os.system("wget -N https://cnnorth4-modelhub-datasets-obsfs-sfnua.obs.cn-north-4.myhuaweicloud.com/content/c2c1853f-d6a6-4c9d-ac0e-203d4c304c88/NkxX5K/dataset/rebar_count_datasets.zip")
        os.system("unzip rebar_count_code.zip; rm rebar_count_code.zip")
        os.system("unzip -q rebar_count_datasets.zip; rm rebar_count_datasets.zip")
        os.system("mv rebar_count_code rebar_count; mv rebar_count_datasets rebar_count/datasets")
        if os.path.exists('./rebar_count'):
            print('Download code and datasets success')
        else:
            print('Download code and datasets failed, please check the download url is valid or not.')
    else:
        print('./rebar_count already exists')

./rebar_count already exists

2. Load the required Python modules

    import os
    import sys
    sys.path.insert(0, './rebar_count/src')
    import cv2
    import time
    import random
    import torch
    import numpy as np
    from PIL import Image, ImageDraw
    import xml.etree.ElementTree as ET
    from datetime import datetime
    from collections import OrderedDict
    import torch.optim as optim
    import torch.utils.data as data
    import torch.backends.cudnn as cudnn
    from data import VOCroot, VOC_Config, AnnotationTransform, VOCDetection, detection_collate, BaseTransform, preproc
    from models.RFB_Net_vgg import build_net
    from layers.modules import MultiBoxLoss
    from layers.functions import Detect, PriorBox
    from utils.visualize import *
    from utils.nms_wrapper import nms
    from utils.timer import Timer
    import matplotlib.pyplot as plt
    %matplotlib inline

    ROOT_DIR = os.getcwd()
    seed = 0
    cudnn.benchmark = False
    cudnn.deterministic = True
    torch.manual_seed(seed)  # set the random seed for the CPU
    torch.cuda.manual_seed_all(seed)  # set the random seed for all GPUs
    random.seed(seed)
    np.random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)  # set the hash seed

3. Look at a sample of the training data

    def read_xml(xml_path):
        '''Read the annotations from an XML label file'''
        tree = ET.parse(xml_path)
        root = tree.getroot()
        boxes = []
        labels = []
        for element in root.findall('object'):
            label = element.find('name').text
            if label == 'steel':
                bndbox = element.find('bndbox')
                xmin = bndbox.find('xmin').text
                ymin = bndbox.find('ymin').text
                xmax = bndbox.find('xmax').text
                ymax = bndbox.find('ymax').text
                boxes.append([xmin, ymin, xmax, ymax])
                labels.append(label)
        return np.array(boxes, dtype=np.float64), labels

4. Display the original image and the annotation boxes

    train_img_dir = './rebar_count/datasets/VOC2007/JPEGImages'
    train_xml_dir = './rebar_count/datasets/VOC2007/Annotations'
    files = os.listdir(train_img_dir)
    files.sort()
    for index, file_name in enumerate(files[:2]):
        img_path = os.path.join(train_img_dir, file_name)
        xml_path = os.path.join(train_xml_dir, file_name.split('.jpg')[0] + '.xml')
        boxes, labels = read_xml(xml_path)
        img = Image.open(img_path)

        # Scale the image so its longer side is 2048 pixels, and scale the boxes to match
        resize_scale = 2048.0 / max(img.size)
        img = img.resize((int(img.size[0] * resize_scale), int(img.size[1] * resize_scale)))
        boxes *= resize_scale

        plt.figure(figsize=(img.size[0]/100.0, img.size[1]/100.0))
        plt.subplot(2,1,1)
        plt.imshow(img)
        img = img.convert('RGB')
        img = np.array(img)
        img = img.copy()
        for box in boxes:
            # The builtin int replaces the np.int alias, which newer NumPy versions no longer provide
            xmin, ymin, xmax, ymax = box.astype(int)
            cv2.rectangle(img, (xmin, ymin), (xmax, ymax), (0, 255, 0), thickness=3)
        plt.subplot(2,1,2)
        plt.imshow(img)
    plt.show()

5. Define the training hyperparameters and the paths for saving models and logs

    # Training hyperparameters
    num_classes = 2  # the dataset has a single label, steel; with background that makes 2 classes
    max_epoch = 25  # default is 1; values above 20 give better training results
    batch_size = 4
    ngpu = 1
    initial_lr = 0.01
    img_dim = 416  # model input image size
    train_sets = [('2007', 'trainval')]  # training set
    cfg = VOC_Config
    rgb_means = (104, 117, 123)  # channel means of the ImageNet dataset

    save_folder = './rebar_count/model_snapshots'  # where trained models are saved
    if not os.path.exists(save_folder):
        os.mkdir(save_folder)

    log_path = os.path.join('./rebar_count/logs', datetime.now().isoformat())  # where logs are saved
    if not os.path.exists(log_path):
        os.makedirs(log_path)

6. Build the model and define the optimizer and loss function

    net = build_net('train', img_dim, num_classes=num_classes)

    if ngpu > 1:
        net = torch.nn.DataParallel(net)

    net.cuda()  # the code in this case can only be trained on a GPU
    cudnn.benchmark = True  # note: this overrides the cudnn.benchmark = False set in step 2

    optimizer = optim.SGD(net.parameters(), lr=initial_lr,
                          momentum=0.9, weight_decay=0)  # optimizer

    criterion = MultiBoxLoss(num_classes,
                             overlap_thresh=0.4,
                             prior_for_matching=True,
                             bkg_label=0,
                             neg_mining=True,
                             neg_pos=3,
                             neg_overlap=0.3,
                             encode_target=False)  # loss function

    priorbox = PriorBox(cfg)
    with torch.no_grad():
        priors = priorbox.forward()
        priors = priors.cuda()

7. Define the learning-rate adjustment function

    def adjust_learning_rate(optimizer, gamma, epoch, step_index, iteration, epoch_size):
        """
        Learning-rate schedule: linear warm-up from 1e-8 to initial_lr over the
        first 10 epochs, then step decay by a factor of gamma per step_index
        """
        if epoch < 11:
            lr = 1e-8 + (initial_lr-1e-8) * iteration / (epoch_size * 10)
        else:
            lr = initial_lr * (gamma ** (step_index))
        for param_group in optimizer.param_groups:
            param_group['lr'] = lr
        return lr

8. Define the training function

    def train():
        """
        Training function: prints a log line every 10 iterations and,
        after 20 epochs, saves the model once per epoch
        """
        net.train()

        loc_loss = 0
        conf_loss = 0
        epoch = 0
        print('Loading dataset...')

        dataset = VOCDetection(VOCroot, train_sets, preproc(img_dim, rgb_means, p=0.0), AnnotationTransform())

        epoch_size = len(dataset) // batch_size
        max_iter = max_epoch * epoch_size

        stepvalues = (25 * epoch_size, 35 * epoch_size)
        step_index = 0
        start_iter = 0

        lr = initial_lr
        for iteration in range(start_iter, max_iter):
            if iteration % epoch_size == 0:
                if epoch > 20:
                    torch.save(net.state_dict(), os.path.join(save_folder, 'epoch_' +
                               repr(epoch).zfill(3) + '_loss_'+ '%.4f' % loss.item() + '.pth'))
                batch_iterator = iter(data.DataLoader(dataset, batch_size,
                                                      shuffle=True, num_workers=1, collate_fn=detection_collate))
                loc_loss = 0
                conf_loss = 0
                epoch += 1

            load_t0 = time.time()
            if iteration in stepvalues:
                step_index += 1
            lr = adjust_learning_rate(optimizer, 0.2, epoch, step_index, iteration, epoch_size)

            # Tensors are moved to the GPU directly; the deprecated Variable wrapper
            # is not needed in PyTorch 0.4 and later
            images, targets = next(batch_iterator)
            images = images.cuda()
            targets = [anno.cuda() for anno in targets]

            # forward
            t0 = time.time()
            out = net(images)
            # backprop
            optimizer.zero_grad()
            loss_l, loss_c = criterion(out, priors, targets)
            loss = loss_l + loss_c
            loss.backward()
            optimizer.step()
            t1 = time.time()
            loc_loss += loss_l.item()
            conf_loss += loss_c.item()
            load_t1 = time.time()

            if iteration % 10 == 0:
                print('Epoch:' + repr(epoch) + ' || epochiter: ' + repr(iteration % epoch_size) + '/' + repr(epoch_size)
                      + '|| Totel iter ' +
                      repr(iteration) + ' || L: %.4f C: %.4f||' % (
                    loss_l.item(),loss_c.item()) +
                    'Batch time: %.4f sec. ||' % (load_t1 - load_t0) + 'LR: %.8f' % (lr))

        # Save the final model after the last iteration
        torch.save(net.state_dict(), os.path.join(save_folder, 'epoch_' +
                   repr(epoch).zfill(3) + '_loss_'+ '%.4f' % loss.item() + '.pth'))

9. Start training; each epoch takes about 60 seconds

    t1 = time.time()
    print('Start training: %d epochs in total, about 60 seconds per epoch' % max_epoch)
    train()
    print('training cost %.2f s' % (time.time() - t1))

Start training: 25 epochs in total, about 60 seconds per epoch
Loading dataset...
Epoch:1 || epochiter: 0/50|| Totel iter 0 || L: 3.7043 C: 3.7730||Batch time: 2.6931 sec. ||LR: 0.00000001
Epoch:1 || epochiter: 10/50|| Totel iter 10 || L: 3.1277 C: 3.1485||Batch time: 1.3692 sec. ||LR: 0.00020001
Epoch:1 || epochiter: 20/50|| Totel iter 20 || L: 3.3249 C: 2.4864||Batch time: 0.7837 sec. ||LR: 0.00040001
Epoch:1 || epochiter: 30/50|| Totel iter 30 || L: 2.8867 C: 2.4690||Batch time: 1.5246 sec. ||LR: 0.00060001
Epoch:1 || epochiter: 40/50|| Totel iter 40 || L: 2.6481 C: 2.1631||Batch time: 1.4777 sec. ||LR: 0.00080001
Epoch:2 || epochiter: 0/50|| Totel iter 50 || L: 3.0177 C: 2.1672||Batch time: 1.5618 sec. ||LR: 0.00100001
Epoch:2 || epochiter: 10/50|| Totel iter 60 || L: 1.9024 C: 1.8743||Batch time: 1.2920 sec. 
||LR: 0.00120001 Epoch:2 || epochiter: 20/50|| Totel iter 70 || L: 1.5299 C: 1.7229||Batch time: 1.3726 sec. ||LR: 0.00140001 Epoch:2 || epochiter: 30/50|| Totel iter 80 || L: 1.7592 C: 1.8066||Batch time: 0.9840 sec. ||LR: 0.00160001 Epoch:2 || epochiter: 40/50|| Totel iter 90 || L: 1.4430 C: 1.7445||Batch time: 1.5012 sec. ||LR: 0.00180001 Epoch:3 || epochiter: 0/50|| Totel iter 100 || L: 1.3402 C: 1.5614||Batch time: 1.3830 sec. ||LR: 0.00200001 Epoch:3 || epochiter: 10/50|| Totel iter 110 || L: 1.2771 C: 1.7149||Batch time: 1.4420 sec. ||LR: 0.00220001 Epoch:3 || epochiter: 20/50|| Totel iter 120 || L: 2.1052 C: 2.3860||Batch time: 1.0122 sec. ||LR: 0.00240001 Epoch:3 || epochiter: 30/50|| Totel iter 130 || L: 1.3969 C: 2.0087||Batch time: 1.2500 sec. ||LR: 0.00260001 Epoch:3 || epochiter: 40/50|| Totel iter 140 || L: 1.1426 C: 1.3518||Batch time: 1.3625 sec. ||LR: 0.00280001 Epoch:4 || epochiter: 0/50|| Totel iter 150 || L: 1.3851 C: 1.3837||Batch time: 1.3933 sec. ||LR: 0.00300001 Epoch:4 || epochiter: 10/50|| Totel iter 160 || L: 0.8790 C: 1.0304||Batch time: 1.0430 sec. ||LR: 0.00320001 Epoch:4 || epochiter: 20/50|| Totel iter 170 || L: 1.1230 C: 1.2439||Batch time: 1.0029 sec. ||LR: 0.00340001 Epoch:4 || epochiter: 30/50|| Totel iter 180 || L: 1.0097 C: 1.1061||Batch time: 1.5267 sec. ||LR: 0.00360001 Epoch:4 || epochiter: 40/50|| Totel iter 190 || L: 0.8008 C: 1.0768||Batch time: 1.1727 sec. ||LR: 0.00380001 Epoch:5 || epochiter: 0/50|| Totel iter 200 || L: 1.0015 C: 1.1481||Batch time: 1.3881 sec. ||LR: 0.00400001 Epoch:5 || epochiter: 10/50|| Totel iter 210 || L: 0.9171 C: 1.1305||Batch time: 1.2255 sec. ||LR: 0.00420001 Epoch:5 || epochiter: 20/50|| Totel iter 220 || L: 0.9460 C: 1.0200||Batch time: 1.0095 sec. ||LR: 0.00440001 Epoch:5 || epochiter: 30/50|| Totel iter 230 || L: 0.8780 C: 1.1776||Batch time: 1.3224 sec. ||LR: 0.00460001 Epoch:5 || epochiter: 40/50|| Totel iter 240 || L: 0.8082 C: 0.8878||Batch time: 1.0734 sec. 
||LR: 0.00480001 Epoch:6 || epochiter: 0/50|| Totel iter 250 || L: 0.7907 C: 0.9508||Batch time: 1.2835 sec. ||LR: 0.00500001 Epoch:6 || epochiter: 10/50|| Totel iter 260 || L: 0.6690 C: 0.8685||Batch time: 1.4887 sec. ||LR: 0.00520000 Epoch:6 || epochiter: 20/50|| Totel iter 270 || L: 1.1006 C: 0.9525||Batch time: 1.3324 sec. ||LR: 0.00540000 Epoch:6 || epochiter: 30/50|| Totel iter 280 || L: 0.9483 C: 1.0393||Batch time: 1.3198 sec. ||LR: 0.00560000 Epoch:6 || epochiter: 40/50|| Totel iter 290 || L: 0.8986 C: 1.0833||Batch time: 1.3434 sec. ||LR: 0.00580000 Epoch:7 || epochiter: 0/50|| Totel iter 300 || L: 0.8187 C: 0.9676||Batch time: 1.4531 sec. ||LR: 0.00600000 Epoch:7 || epochiter: 10/50|| Totel iter 310 || L: 0.6827 C: 0.9837||Batch time: 0.9223 sec. ||LR: 0.00620000 Epoch:7 || epochiter: 20/50|| Totel iter 320 || L: 0.7325 C: 0.8995||Batch time: 0.9585 sec. ||LR: 0.00640000 Epoch:7 || epochiter: 30/50|| Totel iter 330 || L: 0.9895 C: 1.0482||Batch time: 1.2272 sec. ||LR: 0.00660000 Epoch:7 || epochiter: 40/50|| Totel iter 340 || L: 0.5824 C: 0.8616||Batch time: 1.1445 sec. ||LR: 0.00680000 Epoch:8 || epochiter: 0/50|| Totel iter 350 || L: 1.1853 C: 1.2745||Batch time: 1.5200 sec. ||LR: 0.00700000 Epoch:8 || epochiter: 10/50|| Totel iter 360 || L: 0.7265 C: 1.1777||Batch time: 0.7649 sec. ||LR: 0.00720000 Epoch:8 || epochiter: 20/50|| Totel iter 370 || L: 0.7457 C: 0.8613||Batch time: 1.5218 sec. ||LR: 0.00740000 Epoch:8 || epochiter: 30/50|| Totel iter 380 || L: 0.5295 C: 0.9103||Batch time: 1.2653 sec. ||LR: 0.00760000 Epoch:8 || epochiter: 40/50|| Totel iter 390 || L: 0.7083 C: 1.0060||Batch time: 1.1069 sec. ||LR: 0.00780000 Epoch:9 || epochiter: 0/50|| Totel iter 400 || L: 0.6398 C: 0.9866||Batch time: 1.5802 sec. ||LR: 0.00800000 Epoch:9 || epochiter: 10/50|| Totel iter 410 || L: 0.5987 C: 0.8167||Batch time: 1.0675 sec. ||LR: 0.00820000 Epoch:9 || epochiter: 20/50|| Totel iter 420 || L: 0.5751 C: 0.7944||Batch time: 0.7669 sec. 
||LR: 0.00840000 Epoch:9 || epochiter: 30/50|| Totel iter 430 || L: 0.7229 C: 1.0396||Batch time: 1.3895 sec. ||LR: 0.00860000 Epoch:9 || epochiter: 40/50|| Totel iter 440 || L: 0.5569 C: 0.9122||Batch time: 0.8300 sec. ||LR: 0.00880000 Epoch:10 || epochiter: 0/50|| Totel iter 450 || L: 0.6908 C: 0.9928||Batch time: 1.4029 sec. ||LR: 0.00900000 Epoch:10 || epochiter: 10/50|| Totel iter 460 || L: 0.6851 C: 0.8068||Batch time: 1.2804 sec. ||LR: 0.00920000 Epoch:10 || epochiter: 20/50|| Totel iter 470 || L: 0.6783 C: 0.8511||Batch time: 1.7469 sec. ||LR: 0.00940000 Epoch:10 || epochiter: 30/50|| Totel iter 480 || L: 0.7962 C: 0.8040||Batch time: 1.6116 sec. ||LR: 0.00960000 Epoch:10 || epochiter: 40/50|| Totel iter 490 || L: 0.7782 C: 0.9469||Batch time: 1.1979 sec. ||LR: 0.00980000 Epoch:11 || epochiter: 0/50|| Totel iter 500 || L: 0.8902 C: 0.8956||Batch time: 1.8625 sec. ||LR: 0.01000000 Epoch:11 || epochiter: 10/50|| Totel iter 510 || L: 0.8532 C: 0.9259||Batch time: 1.2692 sec. ||LR: 0.01000000 Epoch:11 || epochiter: 20/50|| Totel iter 520 || L: 0.7917 C: 0.7990||Batch time: 1.7494 sec. ||LR: 0.01000000 Epoch:11 || epochiter: 30/50|| Totel iter 530 || L: 0.9688 C: 1.2376||Batch time: 1.1547 sec. ||LR: 0.01000000 Epoch:11 || epochiter: 40/50|| Totel iter 540 || L: 0.7030 C: 0.8440||Batch time: 1.1588 sec. ||LR: 0.01000000 Epoch:12 || epochiter: 0/50|| Totel iter 550 || L: 0.6580 C: 0.8380||Batch time: 1.2196 sec. ||LR: 0.01000000 Epoch:12 || epochiter: 10/50|| Totel iter 560 || L: 0.7978 C: 0.8454||Batch time: 1.1011 sec. ||LR: 0.01000000 Epoch:12 || epochiter: 20/50|| Totel iter 570 || L: 0.6071 C: 0.8394||Batch time: 0.7146 sec. ||LR: 0.01000000 Epoch:12 || epochiter: 30/50|| Totel iter 580 || L: 0.4787 C: 0.6888||Batch time: 1.2482 sec. ||LR: 0.01000000 Epoch:12 || epochiter: 40/50|| Totel iter 590 || L: 0.6505 C: 0.8412||Batch time: 1.1304 sec. ||LR: 0.01000000 Epoch:13 || epochiter: 0/50|| Totel iter 600 || L: 0.6316 C: 0.8319||Batch time: 1.4268 sec. 
||LR: 0.01000000 Epoch:13 || epochiter: 10/50|| Totel iter 610 || L: 0.6693 C: 0.7822||Batch time: 1.2204 sec. ||LR: 0.01000000 Epoch:13 || epochiter: 20/50|| Totel iter 620 || L: 0.6773 C: 0.9631||Batch time: 1.2477 sec. ||LR: 0.01000000 Epoch:13 || epochiter: 30/50|| Totel iter 630 || L: 0.4851 C: 0.8346||Batch time: 1.2228 sec. ||LR: 0.01000000 Epoch:13 || epochiter: 40/50|| Totel iter 640 || L: 0.7247 C: 0.9392||Batch time: 1.2318 sec. ||LR: 0.01000000 Epoch:14 || epochiter: 0/50|| Totel iter 650 || L: 0.5716 C: 0.7683||Batch time: 1.8367 sec. ||LR: 0.01000000 Epoch:14 || epochiter: 10/50|| Totel iter 660 || L: 0.7804 C: 1.0285||Batch time: 1.0683 sec. ||LR: 0.01000000 Epoch:14 || epochiter: 20/50|| Totel iter 670 || L: 0.4620 C: 0.8179||Batch time: 1.3811 sec. ||LR: 0.01000000 Epoch:14 || epochiter: 30/50|| Totel iter 680 || L: 0.5459 C: 0.7611||Batch time: 1.4473 sec. ||LR: 0.01000000 Epoch:14 || epochiter: 40/50|| Totel iter 690 || L: 0.4946 C: 0.7604||Batch time: 1.2968 sec. ||LR: 0.01000000 Epoch:15 || epochiter: 0/50|| Totel iter 700 || L: 0.6467 C: 0.6637||Batch time: 1.4271 sec. ||LR: 0.01000000 Epoch:15 || epochiter: 10/50|| Totel iter 710 || L: 0.4383 C: 0.6140||Batch time: 1.1232 sec. ||LR: 0.01000000 Epoch:15 || epochiter: 20/50|| Totel iter 720 || L: 0.5551 C: 0.9027||Batch time: 1.2992 sec. ||LR: 0.01000000 Epoch:15 || epochiter: 30/50|| Totel iter 730 || L: 0.4488 C: 0.7574||Batch time: 0.9148 sec. ||LR: 0.01000000 Epoch:15 || epochiter: 40/50|| Totel iter 740 || L: 0.5179 C: 0.6202||Batch time: 1.5350 sec. ||LR: 0.01000000 Epoch:16 || epochiter: 0/50|| Totel iter 750 || L: 0.4956 C: 0.6740||Batch time: 1.6760 sec. ||LR: 0.01000000 Epoch:16 || epochiter: 10/50|| Totel iter 760 || L: 0.5780 C: 0.8834||Batch time: 1.3318 sec. ||LR: 0.01000000 Epoch:16 || epochiter: 20/50|| Totel iter 770 || L: 0.5829 C: 0.7340||Batch time: 1.0279 sec. ||LR: 0.01000000 Epoch:16 || epochiter: 30/50|| Totel iter 780 || L: 0.4798 C: 0.7019||Batch time: 1.4545 sec. 
||LR: 0.01000000 Epoch:16 || epochiter: 40/50|| Totel iter 790 || L: 0.6511 C: 0.7712||Batch time: 1.7330 sec. ||LR: 0.01000000 Epoch:17 || epochiter: 0/50|| Totel iter 800 || L: 0.4281 C: 0.6578||Batch time: 1.6699 sec. ||LR: 0.01000000 Epoch:17 || epochiter: 10/50|| Totel iter 810 || L: 0.5440 C: 0.7102||Batch time: 1.4820 sec. ||LR: 0.01000000 Epoch:17 || epochiter: 20/50|| Totel iter 820 || L: 0.4770 C: 0.7014||Batch time: 1.4020 sec. ||LR: 0.01000000 Epoch:17 || epochiter: 30/50|| Totel iter 830 || L: 0.3601 C: 0.5890||Batch time: 1.0758 sec. ||LR: 0.01000000 Epoch:17 || epochiter: 40/50|| Totel iter 840 || L: 0.4817 C: 0.7329||Batch time: 1.3797 sec. ||LR: 0.01000000 Epoch:18 || epochiter: 0/50|| Totel iter 850 || L: 0.4860 C: 0.7499||Batch time: 1.3214 sec. ||LR: 0.01000000 Epoch:18 || epochiter: 10/50|| Totel iter 860 || L: 0.6856 C: 0.7154||Batch time: 1.4014 sec. ||LR: 0.01000000 Epoch:18 || epochiter: 20/50|| Totel iter 870 || L: 0.6231 C: 0.7692||Batch time: 0.9905 sec. ||LR: 0.01000000 Epoch:18 || epochiter: 30/50|| Totel iter 880 || L: 0.6680 C: 0.8625||Batch time: 1.1373 sec. ||LR: 0.01000000 Epoch:18 || epochiter: 40/50|| Totel iter 890 || L: 0.5535 C: 0.7393||Batch time: 1.1122 sec. ||LR: 0.01000000 Epoch:19 || epochiter: 0/50|| Totel iter 900 || L: 0.4691 C: 0.7235||Batch time: 1.3488 sec. ||LR: 0.01000000 Epoch:19 || epochiter: 10/50|| Totel iter 910 || L: 0.6145 C: 0.7811||Batch time: 1.1163 sec. ||LR: 0.01000000 Epoch:19 || epochiter: 20/50|| Totel iter 920 || L: 0.4698 C: 0.7225||Batch time: 1.6120 sec. ||LR: 0.01000000 Epoch:19 || epochiter: 30/50|| Totel iter 930 || L: 0.5623 C: 0.7341||Batch time: 1.3949 sec. ||LR: 0.01000000 Epoch:19 || epochiter: 40/50|| Totel iter 940 || L: 0.4859 C: 0.5786||Batch time: 0.8949 sec. ||LR: 0.01000000 Epoch:20 || epochiter: 0/50|| Totel iter 950 || L: 0.4193 C: 0.6898||Batch time: 1.4702 sec. ||LR: 0.01000000 Epoch:20 || epochiter: 10/50|| Totel iter 960 || L: 0.4434 C: 0.6261||Batch time: 1.0974 sec. 
||LR: 0.01000000 Epoch:20 || epochiter: 20/50|| Totel iter 970 || L: 0.5948 C: 0.8787||Batch time: 1.1951 sec. ||LR: 0.01000000 Epoch:20 || epochiter: 30/50|| Totel iter 980 || L: 0.5842 C: 0.6120||Batch time: 0.9863 sec. ||LR: 0.01000000 Epoch:20 || epochiter: 40/50|| Totel iter 990 || L: 0.4010 C: 0.7356||Batch time: 1.5981 sec. ||LR: 0.01000000 Epoch:21 || epochiter: 0/50|| Totel iter 1000 || L: 0.4719 C: 0.6351||Batch time: 1.1228 sec. ||LR: 0.01000000 Epoch:21 || epochiter: 10/50|| Totel iter 1010 || L: 0.5856 C: 0.7444||Batch time: 1.3812 sec. ||LR: 0.01000000 Epoch:21 || epochiter: 20/50|| Totel iter 1020 || L: 0.5810 C: 0.8371||Batch time: 1.2560 sec. ||LR: 0.01000000 Epoch:21 || epochiter: 30/50|| Totel iter 1030 || L: 0.4583 C: 0.9570||Batch time: 1.1499 sec. ||LR: 0.01000000 Epoch:21 || epochiter: 40/50|| Totel iter 1040 || L: 0.5411 C: 0.5317||Batch time: 1.4007 sec. ||LR: 0.01000000 Epoch:22 || epochiter: 0/50|| Totel iter 1050 || L: 0.3508 C: 0.5599||Batch time: 1.1371 sec. ||LR: 0.01000000 Epoch:22 || epochiter: 10/50|| Totel iter 1060 || L: 0.4045 C: 0.6965||Batch time: 1.1030 sec. ||LR: 0.01000000 Epoch:22 || epochiter: 20/50|| Totel iter 1070 || L: 0.3949 C: 0.6019||Batch time: 1.4505 sec. ||LR: 0.01000000 Epoch:22 || epochiter: 30/50|| Totel iter 1080 || L: 0.3467 C: 0.5563||Batch time: 1.1956 sec. ||LR: 0.01000000 Epoch:22 || epochiter: 40/50|| Totel iter 1090 || L: 0.5757 C: 0.5643||Batch time: 0.8669 sec. ||LR: 0.01000000 Epoch:23 || epochiter: 0/50|| Totel iter 1100 || L: 0.3946 C: 0.6081||Batch time: 1.7117 sec. ||LR: 0.01000000 Epoch:23 || epochiter: 10/50|| Totel iter 1110 || L: 0.3655 C: 0.5579||Batch time: 0.9830 sec. ||LR: 0.01000000 Epoch:23 || epochiter: 20/50|| Totel iter 1120 || L: 0.3912 C: 0.6437||Batch time: 1.2725 sec. ||LR: 0.01000000 Epoch:23 || epochiter: 30/50|| Totel iter 1130 || L: 0.4237 C: 0.6337||Batch time: 1.3346 sec. 
||LR: 0.01000000 Epoch:23 || epochiter: 40/50|| Totel iter 1140 || L: 0.3474 C: 0.5517||Batch time: 1.1646 sec. ||LR: 0.01000000 Epoch:24 || epochiter: 0/50|| Totel iter 1150 || L: 0.5573 C: 0.7426||Batch time: 1.5291 sec. ||LR: 0.01000000 Epoch:24 || epochiter: 10/50|| Totel iter 1160 || L: 0.6122 C: 0.6805||Batch time: 1.1861 sec. ||LR: 0.01000000 Epoch:24 || epochiter: 20/50|| Totel iter 1170 || L: 0.3846 C: 0.6484||Batch time: 1.2575 sec. ||LR: 0.01000000 Epoch:24 || epochiter: 30/50|| Totel iter 1180 || L: 0.4183 C: 0.6982||Batch time: 1.1318 sec. ||LR: 0.01000000 Epoch:24 || epochiter: 40/50|| Totel iter 1190 || L: 0.5259 C: 0.7322||Batch time: 1.0091 sec. ||LR: 0.01000000 Epoch:25 || epochiter: 0/50|| Totel iter 1200 || L: 0.4047 C: 0.5544||Batch time: 1.4809 sec. ||LR: 0.01000000 Epoch:25 || epochiter: 10/50|| Totel iter 1210 || L: 0.4519 C: 0.5351||Batch time: 1.2974 sec. ||LR: 0.01000000 Epoch:25 || epochiter: 20/50|| Totel iter 1220 || L: 0.4390 C: 0.6232||Batch time: 1.0032 sec. ||LR: 0.01000000 Epoch:25 || epochiter: 30/50|| Totel iter 1230 || L: 0.4840 C: 0.7323||Batch time: 1.0048 sec. ||LR: 0.01000000 Epoch:25 || epochiter: 40/50|| Totel iter 1240 || L: 0.6699 C: 0.8887||Batch time: 1.7034 sec. 
||LR: 0.01000000 training cost 1572.48 s10.已完成训练,下面开始测试模型,首先需定义目标检测类cfg = VOC_Config img_dim = 416 rgb_means = (104, 117, 123) priorbox = PriorBox(cfg) with torch.no_grad(): priors = priorbox.forward() if torch.cuda.is_available(): priors = priors.cuda() class ObjectDetector: """ 定义目标检测类 """ def __init__(self, net, detection, transform, num_classes=num_classes, thresh=0.01, cuda=True): self.net = net self.detection = detection self.transform = transform self.num_classes = num_classes self.thresh = thresh self.cuda = torch.cuda.is_available() def predict(self, img): _t = {'im_detect': Timer(), 'misc': Timer()} scale = torch.Tensor([img.shape[1], img.shape[0], img.shape[1], img.shape[0]]) with torch.no_grad(): x = self.transform(img).unsqueeze(0) if self.cuda: x = x.cuda() scale = scale.cuda() _t['im_detect'].tic() out = net(x) # forward pass boxes, scores = self.detection.forward(out, priors) detect_time = _t['im_detect'].toc() boxes = boxes[0] scores = scores[0] # scale each detection back up to the image boxes *= scale boxes = boxes.cpu().numpy() scores = scores.cpu().numpy() _t['misc'].tic() all_boxes = [[] for _ in range(num_classes)] for j in range(1, num_classes): inds = np.where(scores[:, j] > self.thresh)[0] if len(inds) == 0: all_boxes[j] = np.zeros([0, 5], dtype=np.float32) continue c_bboxes = boxes[inds] c_scores = scores[inds, j] c_dets = np.hstack((c_bboxes, c_scores[:, np.newaxis])).astype( np.float32, copy=False) keep = nms(c_dets, 0.2, force_cpu=False) c_dets = c_dets[keep, :] all_boxes[j] = c_dets nms_time = _t['misc'].toc() total_time = detect_time + nms_time return all_boxes, total_time11.定义推理网络,并加载前面训练的loss最低的模型trained_models = os.listdir(os.path.join(ROOT_DIR, './rebar_count/model_snapshots')) # 模型文件所在目录 lowest_loss = 9999 best_model_name = '' for model_name in trained_models: if not model_name.endswith('pth'): continue loss = float(model_name.split('_loss_')[1].split('.pth')[0]) if loss < lowest_loss: lowest_loss = loss best_model_name = 
model_name best_model_path = os.path.join(ROOT_DIR, './rebar_count/model_snapshots', best_model_name) print('loading model from', best_model_path) net = build_net('test', img_dim, num_classes) # 加载模型 state_dict = torch.load(best_model_path) new_state_dict = OrderedDict() for k, v in state_dict.items(): head = k[:7] if head == 'module.': name = k[7:] else: name = k new_state_dict[name] = v net.load_state_dict(new_state_dict) net.eval() print('Finish load model!') if torch.cuda.is_available(): net = net.cuda() cudnn.benchmark = True else: net = net.cpu() detector = Detect(num_classes, 0, cfg) transform = BaseTransform(img_dim, rgb_means, (2, 0, 1)) object_detector = ObjectDetector(net, detector, transform)loading model from /home/ma-user/work/./rebar_count/model_snapshots/epoch_023_loss_1.0207.pth Finish load model!12.测试图片,输出每条钢筋的位置和图片中钢筋总条数test_img_dir = r'./rebar_count/datasets/test_dataset' # 待预测的图片目录 files = os.listdir(test_img_dir) files.sort() for i, file_name in enumerate(files[:2]): image_src = cv2.imread(os.path.join(test_img_dir, file_name)) detect_bboxes, tim = object_detector.predict(image_src) image_draw = image_src.copy() rebar_count = 0 for class_id, class_collection in enumerate(detect_bboxes): if len(class_collection) > 0: for i in range(class_collection.shape[0]): if class_collection[i, -1] > 0.6: pt = class_collection[i] cv2.circle(image_draw, (int((pt[0] + pt[2]) * 0.5), int((pt[1] + pt[3]) * 0.5)), int((pt[2] - pt[0]) * 0.5 * 0.6), (255, 0, 0), -1) rebar_count += 1 cv2.putText(image_draw, 'rebar_count: %d' % rebar_count, (25, 50), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 0), 3) plt.figure(i, figsize=(30, 20)) plt.imshow(image_draw) plt.show()至此,本案例结束。
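Step 11 above picks the checkpoint with the lowest loss by parsing file names such as epoch_023_loss_1.0207.pth. The same selection can be sketched as a standalone function. This is a minimal sketch that assumes the `epoch_XXX_loss_Y.pth` naming pattern seen in the log; the helper name `pick_best_checkpoint` and the sample file list are hypothetical and not part of the original notebook.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed naming pattern, inferred from the file name printed in the log above.
_CKPT_RE = re.compile(r"_loss_(\d+(?:\.\d+)?)\.pth$")


def pick_best_checkpoint(names: Iterable[str]) -> Optional[Tuple[str, float]]:
    """Return (file_name, loss) for the lowest-loss checkpoint, or None if there is none."""
    best = None
    for name in names:
        match = _CKPT_RE.search(name)
        if match is None:
            continue  # skip files that are not checkpoints
        loss = float(match.group(1))
        if best is None or loss < best[1]:
            best = (name, loss)
    return best


# Hypothetical directory listing, for illustration only.
sample = ["epoch_001_loss_3.5120.pth", "epoch_023_loss_1.0207.pth", "train.log"]
print(pick_best_checkpoint(sample))
```

Returning None when no checkpoint matches makes the empty-directory case explicit, so the caller can stop before building a model path from an empty name.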
  • [技术干货] 物体检测YOLOv3实践
    物体检测YOLOv3实践物体检测是计算机视觉中的一个重要的研究领域,在人流检测,行人跟踪,自动驾驶,医学影像等领域有着广泛的应用。不同于简单的图像分类,物体检测旨在对图像中的目标进行精确识别,包括物体的位置和分类,因此能够应用于更多高层视觉处理的场景。例如在自动驾驶领域,需要辨识摄像头拍摄的图像中的车辆、行人、交通指示牌及其位置,以便进一步根据这些数据决定驾驶策略。本期学习案例,我们将聚焦于YOLO算法,YOLO(You Only Look Once)是一种one-stage物体检测算法。注意事项:本案例使用框架: TensorFlow-1.13.1本案例使用硬件规格: GPU V100进入运行环境方法:点此链接进入AI Gallery,点击Run in ModelArts按钮进入ModelArts运行环境,如需使用GPU,您可以在ModelArts JupyterLab运行界面右边的工作区进行切换运行代码方法: 点击本页面顶部菜单栏的三角形运行按钮或按Ctrl+Enter键 运行每个方块中的代码JupyterLab的详细用法: 请参考《ModelAtrs JupyterLab使用指导》碰到问题的解决办法: 请参考《ModelAtrs JupyterLab常见问题解决办法》1.数据和代码下载运行下面代码,进行数据和代码的下载和解压本案例使用coco数据,共80个类别。import os from modelarts.session import Session sess = Session() if sess.region_name == 'cn-north-1': bucket_path="modelarts-labs/notebook/DL_object_detection_yolo/yolov3.tar.gz" elif sess.region_name == 'cn-north-4': bucket_path="modelarts-labs-bj4/notebook/DL_object_detection_yolo/yolov3.tar.gz" else: print("请更换地区到北京一或北京四") if not os.path.exists('./yolo3'): sess.download_data(bucket_path=bucket_path, path="./yolov3.tar.gz") if os.path.exists('./yolov3.tar.gz'): # 解压文件 os.system("tar -xf ./yolov3.tar.gz") # 清理压缩包 os.system("rm -r ./yolov3.tar.gz")2.准备数据2.1文件路径定义from train import get_classes, get_anchors # 数据文件路径 data_path = "./coco/coco_data" # coco类型定义文件存储位置 classes_path = './model_data/coco_classes.txt' # coco数据anchor值文件存储位置 anchors_path = './model_data/yolo_anchors.txt' # coco数据标注信息文件存储位置 annotation_path = './coco/coco_train.txt' # 预训练权重文件存储位置 weights_path = "./model_data/yolo.h5" # 模型文件存储位置 save_path = "./result/models/" classes = get_classes(classes_path) anchors = get_anchors(anchors_path) # 获取类型数量和anchor数量变量 num_classes = len(classes) num_anchors = len(anchors)Using TensorFlow backend. /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. 
_np_qint8 = np.dtype([("qint8", np.int8, 1)]) /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_quint8 = np.dtype([("quint8", np.uint8, 1)]) /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint16 = np.dtype([("qint16", np.int16, 1)]) /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_quint16 = np.dtype([("quint16", np.uint16, 1)]) /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint32 = np.dtype([("qint32", np.int32, 1)]) /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. 
np_resource = np.dtype([("resource", np.ubyte, 1)])2.2读取标注数据import numpy as np # 训练集与验证集划分比例 val_split = 0.1 with open(annotation_path) as f: lines = f.readlines() np.random.seed(10101) np.random.shuffle(lines) np.random.seed(None) num_val = int(len(lines)*val_split) num_train = len(lines) - num_val2.3数据读取函数,构建数据生成器。每次读取一个批次的数据至内存训练,并做数据增强。def data_generator(annotation_lines, batch_size, input_shape, data_path,anchors, num_classes): n = len(annotation_lines) i = 0 while True: image_data = [] box_data = [] for b in range(batch_size): if i==0: np.random.shuffle(annotation_lines) image, box = get_random_data(annotation_lines[i], input_shape, data_path,random=True) # 随机挑选一个批次的数据 image_data.append(image) box_data.append(box) i = (i+1) % n image_data = np.array(image_data) box_data = np.array(box_data) y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes) # 对标注框预处理,过滤异常标注框 yield [image_data, *y_true], np.zeros(batch_size) def data_generator_wrapper(annotation_lines, batch_size, input_shape, data_path,anchors, num_classes): n = len(annotation_lines) if n==0 or batch_size<=0: return None return data_generator(annotation_lines, batch_size, input_shape, data_path,anchors, num_classes)3.模型训练本案例使用Keras深度学习框架搭建YOLOv3神经网络。可以进入相应的文件夹路径查看源码实现。3.1构建神经网络可以在./yolo3/model.py文件中查看细节import keras.backend as K from yolo3.model import preprocess_true_boxes, yolo_body, yolo_loss from keras.layers import Input, Lambda from keras.models import Model # 初始化session K.clear_session() # 图像输入尺寸 input_shape = (416, 416) image_input = Input(shape=(None, None, 3)) h, w = input_shape # 设置多尺度检测的下采样尺寸 y_true = [Input(shape=(h//{0:32, 1:16, 2:8}[l], w//{0:32, 1:16, 2:8}[l], num_anchors//3, num_classes+5)) for l in range(3)] # 构建YOLO模型结构 model_body = yolo_body(image_input, num_anchors//3, num_classes) # 将YOLO权重文件加载进来,如果希望不加载预训练权重,从头开始训练的话,可以删除这句代码 model_body.load_weights(weights_path, by_name=True, skip_mismatch=True) # 定义YOLO损失函数 model_loss = Lambda(yolo_loss, output_shape=(1,), 
name='yolo_loss', arguments={'anchors': anchors, 'num_classes': num_classes, 'ignore_thresh': 0.5})([*model_body.output, *y_true]) # 构建Model,为训练做准备 model = Model([model_body.input, *y_true], model_loss)WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer.# 打印模型各层结构 model.summary()(此处代码执行的输出很长,省略)训练回调函数定义from keras.callbacks import ReduceLROnPlateau, EarlyStopping # 定义回调方法 reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3, verbose=1) # 学习率衰减策略 early_stopping = EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=1) # 早停策略3.2开始训练from keras.optimizers import Adam from yolo3.utils import get_random_data # 设置所有的层可训练 for i in range(len(model.layers)): model.layers[i].trainable = True # 选择Adam优化器,设置学习率 learning_rate = 1e-4 model.compile(optimizer=Adam(lr=learning_rate), loss={'yolo_loss': lambda y_true, y_pred: y_pred}) # 设置批大小和训练轮数 batch_size = 16 max_epochs = 2 print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size)) # 开始训练 model.fit_generator(data_generator_wrapper(lines[:num_train], batch_size, input_shape, data_path,anchors, num_classes), steps_per_epoch=max(1, num_train//batch_size), validation_data=data_generator_wrapper(lines[num_train:], batch_size, input_shape, data_path,anchors, num_classes), validation_steps=max(1, num_val//batch_size), epochs=max_epochs, initial_epoch=0, callbacks=[reduce_lr, early_stopping])Train on 179 samples, val on 19 samples, with batch size 16. 
Epoch 1/2 11/11 [==============================] - 25s 2s/step - loss: 46.6694 - val_loss: 39.1381 Epoch 2/2 11/11 [==============================] - 5s 452ms/step - loss: 45.5145 - val_loss: 43.6707 <keras.callbacks.History at 0x7fbff60659e8>3.3保存模型import os if not os.path.exists(save_path): os.makedirs(save_path) # 保存模型 model.save_weights(os.path.join(save_path, 'trained_weights_final.h5'))4.模型测试4.1打开一张测试图片from PIL import Image import numpy as np # 测试文件路径 test_file_path = './test.jpg' # 打开测试文件 image = Image.open(test_file_path) image_ori = np.array(image) image_ori.shape(640, 481, 3)4.2图片预处理from yolo3.utils import letterbox_image new_image_size = (image.width - (image.width % 32), image.height - (image.height % 32)) boxed_image = letterbox_image(image, new_image_size) image_data = np.array(boxed_image, dtype='float32') image_data /= 255. image_data = np.expand_dims(image_data, 0) image_data.shape(1, 640, 480, 3)import keras.backend as K sess = K.get_session()4.3构建模型from yolo3.model import yolo_body from keras.layers import Input # coco数据anchor值文件存储位置 anchor_path = "./model_data/yolo_anchors.txt" with open(anchor_path) as f: anchors = f.readline() anchors = [float(x) for x in anchors.split(',')] anchors = np.array(anchors).reshape(-1, 2) yolo_model = yolo_body(Input(shape=(None,None,3)), len(anchors)//3, num_classes)4.4加载模型权重,或将模型路径替换成上一步训练得出的模型路径# 模型权重存储路径 weights_path = "./model_data/yolo.h5" yolo_model.load_weights(weights_path)4.5定义IOU以及score:IOU: 将交并比大于IOU的边界框作为冗余框去除score:将预测分数大于score的边界框筛选出来iou = 0.45 score = 0.84.6构建输出[boxes, scores, classes]from yolo3.model import yolo_eval input_image_shape = K.placeholder(shape=(2, )) boxes, scores, classes = yolo_eval( yolo_model.output, anchors, num_classes, input_image_shape, score_threshold=score, iou_threshold=iou)4.7进行预测out_boxes, out_scores, out_classes = sess.run( [boxes, scores, classes], feed_dict={ yolo_model.input: image_data, input_image_shape: [image.size[1], image.size[0]], K.learning_phase(): 0 
})class_coco = get_classes(classes_path) out_coco = [] for i in out_classes: out_coco.append(class_coco[i])print(out_boxes) print(out_scores) print(out_coco)[[152.69937 166.2726 649.0503 459.9374 ] [ 68.62158 21.843088 465.66208 452.6878 ]] [0.9838943 0.999688 ] ['person', 'umbrella']4.8将预测结果绘制在图片上from PIL import Image, ImageFont, ImageDraw font = ImageFont.truetype(font='font/FiraMono-Medium.otf', size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32')) thickness = (image.size[0] + image.size[1]) // 300 for i, c in reversed(list(enumerate(out_coco))): predicted_class = c box = out_boxes[i] score = out_scores[i] label = '{} {:.2f}'.format(predicted_class, score) draw = ImageDraw.Draw(image) label_size = draw.textsize(label, font) top, left, bottom, right = box top = max(0, np.floor(top + 0.5).astype('int32')) left = max(0, np.floor(left + 0.5).astype('int32')) bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32')) right = min(image.size[0], np.floor(right + 0.5).astype('int32')) print(label, (left, top), (right, bottom)) if top - label_size[1] >= 0: text_origin = np.array([left, top - label_size[1]]) else: text_origin = np.array([left, top + 1]) for i in range(thickness): draw.rectangle( [left + i, top + i, right - i, bottom - i], outline=225) draw.rectangle( [tuple(text_origin), tuple(text_origin + label_size)], fill=225) draw.text(text_origin, label, fill=(0, 0, 0), font=font) del drawumbrella 1.00 (22, 69) (453, 466) person 0.98 (166, 153) (460, 640)image
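The two thresholds from section 4.5 (score = 0.8, iou = 0.45) can be illustrated without TensorFlow. The following is a minimal plain-Python sketch of score filtering followed by non-maximum suppression, not the implementation inside yolo_eval: it ignores classes (it treats all boxes as one class), and the function names and sample boxes are made up for illustration. Boxes use the (top, left, bottom, right) order that section 4.8 unpacks.

```python
from typing import List, Sequence

Box = Sequence[float]  # (top, left, bottom, right)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, bottom - top) * max(0.0, right - left)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(boxes: Sequence[Box], scores: Sequence[float],
                 score_threshold: float = 0.8,
                 iou_threshold: float = 0.45) -> List[int]:
    """Return indices of the boxes that survive both thresholds, best score first."""
    # 1) Drop low-confidence boxes, then visit the rest from highest score down.
    order = sorted((i for i, s in enumerate(scores) if s >= score_threshold),
                   key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        # 2) Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


# Illustrative data: box 1 overlaps box 0 heavily, box 3 has a low score.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 10, 10)]
scores = [0.95, 0.90, 0.85, 0.50]
print(filter_boxes(boxes, scores))
```

Raising the score threshold removes uncertain detections; lowering the IoU threshold removes more overlapping boxes, which can merge objects that really are close together.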
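  • A novel graph convolutional neural network (gCNN)
    Researchers at Northwestern University have developed a novel graph convolutional neural network (gCNN) that analyzes local anatomical shape and predicts fluid intelligence (Gf). Morphological information on the cortical ribbon and subcortical structures was extracted from T1-weighted MRI in two independent cohorts: the Adolescent Brain Cognitive Development study (ABCD; age 9.93 ± 0.62 years) and the Human Connectome Project (HCP; age 28.81 ± 3.70 years). Predictions that combined cortical and subcortical surfaces gave the highest Gf accuracy on both ABCD (R = 0.314) and HCP (R = 0.454), outperforming the most recent Gf predictions from any other brain measure in the literature. In both datasets, the morphology of the amygdala, hippocampus, and nucleus accumbens, together with that of the temporal, parietal, and cingulate cortex, consistently drove the prediction of Gf. This suggests a significant reframing of the relationship between brain morphology and Gf, one that includes systems involved in reward/aversion processing, judgment and decision-making, motivation, and emotion.
    The study, titled "A multicohort geometric deep learning study of age dependent cortical and subcortical morphologic interactions for fluid intelligence prediction", was published in Scientific Reports on 22 October 2022. Paper link: https://www.nature.com/articles/s41598-022-22313-x
    Understanding the neural basis of intelligence is a long-standing research area that has historically aimed to identify the brain regions involved in various human behaviors, cognitive tasks in particular. Early pioneering work, including Binet's, found that human performance consistently differs across tasks ranging from naming objects to defining words, drawing, and solving analogies. Spearman synthesized these observations into the hypothesis of a general intelligence factor g, linking human behavior to brain function. The factor reflects abstract thinking, including the ability to acquire knowledge, adapt to novelty, develop abstract models, and benefit from schooling and learning experience. Further work by Cattell split g into fluid intelligence (Gf), the ability to solve novel problems and reason abstractly, and crystallized intelligence (Gc), which relates to accumulated knowledge. Although Gc and Gf are correlated and both develop rapidly from childhood through adolescence, Gf reaches a plateau in the third decade of life before a delayed decline, whereas Gc keeps developing across the lifespan. Gf has been shown to correlate positively with a large number of cognitive activities and is an important predictor of educational and professional success. These high-stakes effects of Gf call for a better understanding of its neural substrate, starting with its neuroanatomical basis. How to find the relationship between brain morphology and Gf, however, remains unclear.
    Earlier attempts to understand the neural substrate of Gf focused on a broad range of neuroimaging modalities and lesion models, each with its own limitations. For example, functional imaging studies of cognitive tasks, and studies of synchrony between resting-state oscillations of the blood-oxygen-level-dependent (BOLD) signal, concentrated on the fronto-parietal network, which integrates sensory and executive functions as described by the parieto-frontal integration theory (P-FIT). Alternatively, work combining brain lesions with imaging analysis explored how the multiple-demand (MD) system contributes to Gf. In addition, structural imaging independent of brain lesions (that is, morphometry) has assessed the correlation between brain size and Gf, or the influence of specific cortical regions and white matter fiber tracts on Gf, but without a theoretical framework. Using these imaging methods, earlier studies identified associations between Gf and cortical morphology such as cortical thickness, cortical area, cortical volume, gyrification, and gray matter density. None of them, however, examined the relative influence of subcortical structures, or the relationship between subcortical structures and cortical regions outside the fronto-parietal network, such as the temporal cortex, which is involved in some adaptive processes of insight-based problem solving.
    How neural changes relate to Gf across the early lifespan matters because it provides valuable information about brain maturation and aging and offers insight into the physiological causes of cognitive impairment. The strong age-related decline in Gf has recently been attributed to white matter differences in the frontal cortex. Kievit and colleagues further argued that these age-related changes are mediated by gray matter volume and the forceps minor. Because of individual differences in the nervous system and complex age-related brain changes, however, no consensus has been reached on this question. Recently, shape analysis has shown promise for detecting structural differences across age groups and behavioral traits by analyzing surface geometry. Crucially, these differences often cannot be detected through volume changes or gray matter changes. Surface-based methods may therefore be more sensitive to the subtle brain changes associated with human behavior and cognitive function. Moreover, neocortical enlargement depends mainly on growth in surface area, which makes measurements of cortical and subcortical surfaces important when looking for similarities between cohorts that differ markedly in age. This study therefore develops a surface-based method to identify the consistent and the distinctive features of Gf-related brain morphometry across age groups.
    Source: https://www.frontiersin.org/articles/10.3389/fnagi.2022.895535/full?&utm_source=Email_to_authors_&utm_medium=Email&utm_content=T1_11.5e1_author&utm_campaign=Email_publication&field=&journalName=Frontiers_in_Aging_Neuroscience&id=895535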
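  • [API Usage] Does the MindSpore Callback interface expose the network outputs?
    Can the training outputs be obtained from run_context.original_args()? For example, can it return the pred_label array of a semantic segmentation network? I want to use these results to compute precision, recall, and similar metrics during training. Note that these are training-phase metrics, not validation-phase ones. cb_param.net_outputs only gives me the loss value. Or is there another way to compute training-phase metrics?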
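  • [Network Construction] MindSpore network construction
    Network construction
    A neural network model is made up of neural network layers and Tensor operations. mindspore.nn provides implementations of the common layers. In MindSpore, the Cell class is the base class for building every network and is also the basic unit of a network. A neural network model is represented as a Cell that is composed of sub-Cells. This nested structure makes it easy to build and manage network structures in an object-oriented way.

    Building a neural network for MNIST classification

    import mindspore
    from mindspore import nn, ops

    My own understanding: at the code level this simply means importing the modules and using them to get the effect we want.

    Defining the model class
    When defining a neural network, inherit from nn.Cell, instantiate the sub-Cells and manage state in __init__, and implement the Tensor operations in construct.

    class Network(nn.Cell):
        def __init__(self):
            super().__init__()
            self.flatten = nn.Flatten()
            self.dense_relu_sequential = nn.SequentialCell(
                nn.Dense(28*28, 512),
                nn.ReLU(),
                nn.Dense(512, 512),
                nn.ReLU(),
                nn.Dense(512, 10)
            )

        def construct(self, x):
            x = self.flatten(x)
            logits = self.dense_relu_sequential(x)
            return logits

    # After the definition, instantiate a Network object and inspect its structure.
    model = Network()
    print(model)

    Network<
      (flatten): Flatten<>
      (dense_relu_sequential): SequentialCell<
        (0): Dense
        (1): ReLU<>
        (2): Dense
        (3): ReLU<>
        (4): Dense
        >
      >

    # Build an input and call the model directly to get a 10-dimensional Tensor
    # that holds the raw predicted value for each class.
    X = ops.ones((1, 28, 28), mindspore.float32)
    logits = model(X)
    print(logits)

    pred_probab = nn.Softmax(axis=1)(logits)
    y_pred = pred_probab.argmax(1)
    print(f"Predicted class: {y_pred}")

    Model layers
    This part breaks down each layer of the network built in the previous section.

    input_image = ops.ones((5, 15, 18), mindspore.float32)
    print(input_image.shape)
    # Output: (5, 15, 18)

    # Instantiate the nn.Flatten layer
    flatten = nn.Flatten()
    flat_image = flatten(input_image)
    print(flat_image.shape)

    # nn.Dense fully connected layer: applies a linear transformation to the input with weights and a bias.
    # in_channels must equal the flattened feature count (15*18 = 270).
    layer1 = nn.Dense(in_channels=15*18, out_channels=20)
    hidden1 = layer1(flat_image)
    print(hidden1.shape)

    # nn.ReLU layer: adds a nonlinear activation function to the network
    print(f"Before ReLU: {hidden1}\n\n")
    hidden1 = nn.ReLU()(hidden1)
    print(f"After ReLU: {hidden1}")

    # nn.SequentialCell container; the last Dense takes the 20 features produced by layer1
    seq_modules = nn.SequentialCell(
        flatten,
        layer1,
        nn.ReLU(),
        nn.Dense(20, 10)
    )
    logits = seq_modules(input_image)
    print(logits.shape)

    # nn.Softmax: turns the values returned by the fully connected layer into prediction probabilities
    softmax = nn.Softmax(axis=1)
    pred_probab = softmax(logits)

    Model parameters
    The layers inside the network have weight parameters and bias parameters.

    print(f"Model structure: {model}\n\n")
    for name, param in model.parameters_and_names():
        print(f"Layer: {name}\nSize: {param.shape}\nValues : {param[:2]} \n")

    Built-in neural network layers (mindspore.nn)

    1. Basic building blocks
    Interface name: Description
    mindspore.nn.Cell: The basic building block of neural networks in MindSpore.
    mindspore.nn.GraphCell: Runs a computational graph loaded from MindIR.
    mindspore.nn.LossBase: Base class for loss functions.
    mindspore.nn.Optimizer: Base class for optimizers used to update parameters.

    2. Recurrent neural network layers
    Interface name: Description
    mindspore.nn.RNN: Recurrent neural network (RNN) layer that uses tanh or relu as its activation function.
    mindspore.nn.RNNCell: Recurrent neural network cell whose activation function is tanh or relu.
    mindspore.nn.GRU: GRU (Gate Recurrent Unit), the gated recurrent unit network, a kind of recurrent neural network (RNN).
    mindspore.nn.GRUCell: GRU (Gate Recurrent Unit), the gated recurrent unit cell.
    mindspore.nn.LSTM: Long short-term memory (LSTM) network; computes the output sequence and the final state from the input sequence and a given initial state.
    mindspore.nn.LSTMCell: Long short-term memory cell (LSTMCell).

    3. Embedding layers
    Interface name: Description
    mindspore.nn.Embedding: Embedding layer.
    mindspore.nn.EmbeddingLookup: Embedding lookup layer.
    mindspore.nn.MultiFieldEmbeddingLookup: Returns slices of the input Tensor according to the given indices and field IDs.

    4. Pooling layers
    Interface name: Description
    mindspore.nn.AdaptiveAvgPool1d: 1D adaptive average pooling over multi-dimensional input.
    mindspore.nn.AdaptiveAvgPool2d: 2D adaptive average pooling.
    mindspore.nn.AdaptiveAvgPool3d: 3D adaptive average pooling.
    mindspore.nn.AdaptiveMaxPool1d: 1D adaptive max pooling over multi-dimensional input.
    mindspore.nn.AdaptiveMaxPool2d: 2D adaptive max pooling.
    mindspore.nn.AvgPool1d: 1D average pooling over multi-dimensional input.
    mindspore.nn.AvgPool2d: 2D average pooling over multi-dimensional input.
    mindspore.nn.MaxPool1d: Max pooling over temporal data.
    mindspore.nn.MaxPool2d: 2D max pooling over multi-dimensional input.

    5. Image processing layers
    Interface name: Description
    mindspore.nn.CentralCrop: Crops the central region of an image by the given ratio.
    mindspore.nn.ImageGradients: Computes the image gradient of each color channel and returns two Tensors, the rates of change along the height and width directions.
    mindspore.nn.MSSSIM: Computes the multi-scale structural similarity (SSIM) between two images.
    mindspore.nn.PSNR: Computes the peak signal-to-noise ratio (PSNR) of two images in a batch.
    mindspore.nn.ResizeBilinear: Resizes the input Tensor to the given size with bilinear interpolation.
    mindspore.nn.SSIM: Computes the structural similarity (SSIM) between two images.

    For reasons of space, not every interface is covered here; more will be added in later updates.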
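In the layer walkthrough of this post, the feature counts have to chain: Flatten turns a (5, 15, 18) input into (5, 270), so the first Dense layer needs 270 input features, and each later Dense layer needs as many inputs as the previous one outputs. A minimal shape check in plain Python, assuming Flatten keeps the batch axis and merges all others; the helper names are hypothetical and this does not call MindSpore:

```python
from math import prod
from typing import Sequence, Tuple


def flatten_shape(shape: Sequence[int]) -> Tuple[int, int]:
    """Shape after merging every axis except the leading batch axis."""
    return (shape[0], prod(shape[1:]))


def check_dense_chain(input_shape: Sequence[int],
                      dense_layers: Sequence[Tuple[int, int]]) -> Tuple[int, int]:
    """Propagate a shape through Flatten and a list of (in_features, out_features) layers."""
    batch, features = flatten_shape(input_shape)
    for index, (n_in, n_out) in enumerate(dense_layers):
        if n_in != features:
            raise ValueError(
                f"dense layer {index} expects {n_in} features but receives {features}")
        features = n_out
    return (batch, features)


print(flatten_shape((5, 15, 18)))                                  # (5, 270)
print(check_dense_chain((5, 15, 18), [(15 * 18, 20), (20, 10)]))   # (5, 10)
```

Running such a check before building the real layers surfaces a mismatched in_channels value as a readable error instead of a failure inside the framework.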
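  • [Execution Issue] Why doesn't mindspore.numpy support conversion to Tensor?
    As shown in the screenshot, I want to convert a variable created with numpy into a Tensor, but I get the error shown in the screenshot.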
  • [经验分享] 基于MindStudio的Resnet50深度学习模型开发
    基于MindStudio的MindX SDK应用开发全流程目录一、MindStudio介绍与安装 21 MindStudio介绍 22 MindSpore 安装 4二、MindX SDK介绍与安装 51 MindX SDK介绍 52.MindX SDK安装 6三、可视化流程编排介绍 81 SDK 基础概念 82.可视化流程编排 8四、SE-Resnet介绍 10五、开发过程 101 创建工程 102 代码开发 123 数据准备及模型准备 144 模型转换功能介绍 155 运行测试 16六、遇见的问题 21MindStudio介绍与安装相关课程:昇腾全流程开发工具链(MindStudio)本课程主要介绍MindStudio在昇腾AI开发中的使用,作为昇腾AI全栈中的全流程开发工具链,提供覆盖训练模型、推理应用和自定义算子开发三个场景下端到端工具,极大提高开发效率。建议开发前,学习该课程的第1章和第3章,可以少走很多弯路!!!MindStudio介绍MindStudio提供您在AI开发所需的一站式开发环境,支持模型开发、算子开发以及应用开发三个主流程中的开发任务。依靠模型可视化、算力测试、IDE本地仿真调试等功能,MindStudio能够帮助您在一个工具上就能高效便捷地完成AI应用开发。MindStudio采用了插件化扩展机制,开发者可以通过开发插件来扩展已有功能。功能简介针对安装与部署,MindStudio提供多种部署方式,支持多种主流操作系统,为开发者提供最大便利。针对网络模型的开发,MindStudio支持TensorFlow、Pytorch、MindSpore框架的模型训练,支持多种主流框架的模型转换。集成了训练可视化、脚本转换、模型转换、精度比对等工具,提升了网络模型移植、分析和优化的效率。针对算子开发,MindStudio提供包含UT测试、ST测试、TIK算子调试等的全套算子开发流程。支持TensorFlow、PyTorch、MindSpore等多种主流框架的TBE和AI CPU自定义算子开发。针对应用开发,MindStudio集成了Profiling性能调优、编译器、MindX SDK的应用开发、可视化pipeline业务流编排等工具,为开发者提供了图形化的集成开发环境,通过MindStudio能够进行工程管理、编译、调试、性能分析等全流程开发,能够很大程度提高开发效率。功能框架MindStudio功能框架如图1-1所示,目前含有的工具链包括:模型转换工具、模型训练工具、自定义算子开发工具、应用开发工具、工程管理工具、编译工具、流程编排工具、精度比对工具、日志管理工具、性能分析工具、设备管理工具等多种工具。图1-1 工具链功能架构工具功能MindStudio工具中的主要几个功能特性如下:工程管理:为开发人员提供创建工程、打开工程、关闭工程、删除工程、新增工程文件目录和属性设置等功能。SSH管理:为开发人员提供新增SSH连接、删除SSH连接、修改SSH连接、加密SSH密码和修改SSH密码保存方式等功能。应用开发:针对业务流程开发人员,MindStudio工具提供基于AscendCL(Ascend Computing Language)和集成MindX SDK的应用开发编程方式,编程后的编译、运行、结果显示等一站式服务让流程开发更加智能化,可以让开发者快速上手。自定义算子开发:提供了基于TBE和AI 
CPU的算子编程开发的集成开发环境,让不同平台下的算子移植更加便捷,适配昇腾AI处理器的速度更快。离线模型转换:训练好的第三方网络模型可以直接通过离线模型工具导入并转换成离线模型,并可一键式自动生成模型接口,方便开发者基于模型接口进行编程,同时也提供了离线模型的可视化功能。日志管理:MindStudio为昇腾AI处理器提供了覆盖全系统的日志收集与日志分析解决方案,提升运行时算法问题的定位效率。提供了统一形式的跨平台日志可视化分析能力及运行时诊断能力,提升日志分析系统的易用性。性能分析:MindStudio以图形界面呈现方式,实现针对主机和设备上多节点、多模块异构体系的高效、易用、可灵活扩展的系统化性能分析,以及针对昇腾AI处理器的性能和功耗的同步分析,满足算法优化对系统性能分析的需求。设备管理:MindStudio提供设备管理工具,实现对连接到主机上的设备的管理功能。精度比对:可以用来比对自有模型算子的运算结果与Caffe、TensorFlow、ONNX标准算子的运算结果,以便用来确认神经网络运算误差发生的原因。开发工具包的安装与管理:为开发者提供基于昇腾AI处理器的相关算法开发套件包Ascend-cann-toolkit,旨在帮助开发者进行快速、高效的人工智能算法开发。开发者可以将开发套件包安装到MindStudio上,使用MindStudio进行快速开发。Ascend-cann-toolkit包含了基于昇腾AI处理器开发依赖的头文件和库文件、编译工具链、调优工具等。MindStudio安装具体安装操作请参考:MindStudio安装指南 MindStudio环境搭建指导视频场景介绍纯开发场景(分部署形态):在非昇腾AI设备上安装MindStudio和Ascend-cann-toolkit开发套件包。可作为开发环境仅能用于代码开发、编译等不依赖于昇腾设备的开发活动(例如ATC模型转换、算子和推理应用程序的纯代码开发)。如果想运行应用程序或进行模型训练等,需要通过MindStudio远程连接功能连接已部署好运行环境所需软件包的昇腾AI设备。开发运行场景(共部署形态):在昇腾AI设备上安装MindStudio、Ascend-cann-toolkit开发套件包、npu-firmware安装包、npu-driver安装包和AI框架(进行模型训练时需要安装)。作为开发环境,开发人员可以进行普通的工程管理、代码编写、编译、模型转换等功能。同时可以作为运行环境,运行应用程序或进行模型训练。软件包介绍MindStudio:提供图形化开发界面,支持应用开发、调试和模型转换功能,同时还支持网络移植、优化和分析等功能。Ascend-cann-toolkit:开发套件包。为开发者提供基于昇腾AI处理器的相关算法开发工具包,旨在帮助开发者进行快速、高效的模型、算子和应用的开发。开发套件包只能安装在Linux服务器上,开发者可以在安装开发套件包后,使用MindStudio开发工具进行快速开发。MindX SDK介绍与安装MindX SDK介绍MindX SDK提供昇腾AI处理器加速的各类AI软件开发套件(SDK),提供极简易用的API,加速AI应用的开发。应用开发旨在使用华为提供的SDK和应用案例快速开发并部署人工智能应用,是基于现有模型、使用pyACL提供的Python语言API库开发深度神经网络应用,用于实现目标识别、图像分类等功能。图2-1 MindX SDK总体结构通过MindStudio实现SDK应用开发分为基础开发与深入开发,通常情况下用户关注基础开发即可,基础开发主要包含如何通过现有的插件构建业务流并实现业务数据对接,采用模块化的设计理念,将业务流程中的各个功能单元封装成独立的插件,通过插件的串接快速构建推理业务。mxManufacture & mxVision关键特性:配置文件快速构建AI推理业务。插件化开发模式,将整个推理流程“插件化”,每个插件提供一种功能,通过组装不同的插件,灵活适配推理业务流程。提供丰富的插件库,用户可根据业务需求组合Jpeg解码、抠图、缩放、模型推理、数据序列化等插件。基于Ascend Computing Language(ACL),提供常用功能的高级API,如模型推理、解码、预处理等,简化Ascend芯片应用开发。支持自定义插件开发,用户可快速地将自己的业务逻辑封装成插件,打造自己的应用插件。MindX SDK安装步骤1 Windows场景下基于MindStuido的SDK应用开发,请先确保远端环境上MindX SDK软件包已安装完成,安装方式请参见《mxManufacture 用户指南》 和《mxVision 用户指南》 的“使用命令行方式开发”>“安装MindX SDK开发套件”章节。步骤2 
在Windows本地进入工程创建页面,工具栏点击File > Settings > Appearance & Behavior > System Settings > MindX SDK进入MindX SDK管理界面。界面中MindX SDK Location为软件包的默认安装路径,默认安装路径为“C:\Users\用户名\Ascend\mindx_sdk”。单击Install SDK进入Installation settings界面,如图2-2。图2-2 MindX SDK管理界面如图2-3所示,为MindX SDK的安装界面,各参数选择如下:Remote Connection:远程连接的用户及IP。Remote CANN Location:远端环境上CANN开发套件包的路径,请配置到版本号一级。Remote SDK Location:远端环境上SDK的路径,请配置到版本号一级。IDE将同步该层级下的include、opensource、python、samples文件夹到本地Windows环境,层级选择错误将导致安装失败。Local SDK Location:同步远端环境上SDK文件夹到本地的路径。默认安装路径为“C:\Users\用户名\Ascend\mindx_sdk”。图2-3 MindX SDK安装界面图2-4 安装完成后的MindX SDK管理界面步骤3 单击OK结束,返回SDK管理界面,可查看安装后的SDK的信息,如图2-4所示,可单击OK结束安装流程。可视化流程编排介绍SDK基础概念通过stream(业务流)配置文件,Stream manager(业务流管理模块)可识别需要构建的element(功能元件)以及element之间的连接关系,并启动业务流程。Stream manager对外提供接口,用于向stream发送数据和获取结果,帮助用户实现业务对接。Plugin(功能插件)表示业务流程中的基础模块,通过element的串接构建成一个stream。Buffer(插件缓存)用于内部挂载解码前后的视频、图像数据,是element之间传递的数据结构,同时也允许用户挂载Metadata(插件元数据),用于存放结构化数据(如目标检测结果)或过程数据(如缩放后的图像)。图3-1 SDK业务流程相关基础单元可视化流程编排MindX SDK实现功能的最小粒度是插件,每一个插件实现特定的功能,如图片解码、图片缩放等。流程编排是将这些插件按照合理的顺序编排,实现负责的功能。可视化流程编排是以可视化的方式,开发数据流图,生成pipeline文件供应用框架使用。图 4-2 为推理业务流Stream配置文件pipeline样例。配置文件以json格式编写,用户必须指定业务流名称、元件名称和插件名称,并根据需要,补充元件属性和下游元件名称信息。步骤1 进入工程创建页面,用户可通过以下方式开始流程编排。在顶部菜单栏中选择Ascend>MindX SDK Pipeline,打开空白的pipeline绘制界面绘制,也可打开用户自行绘制好的pipeline文件,如图3-3。绘制界面分为左侧插件库、中间编辑区、右侧插件属性展示区,具体参考pipeline绘制 。步骤2 在左侧编辑框选择插件,拖动至中间编辑框,按照用户的业务流程进行连接。如果拖动错误插件或者错误连线,选中错误插件或者错误连线单击键盘Del键删除。用户自定义的流水线绘制完成后,选中流水线中的所有插件,右键选择Set Stream Name设置Stream名称,如果有多条流水线则需要对每一条流水线设置Stream名称。绘制完成单击Save保存。图3-2 Detection and Classification配置pipeline样例图3-3 pipeline绘制界面SE-Resnet50介绍残差神经网络是何凯明提出的网络.在深度学习中,网络越深往往取得的效果越好,但是设计的网络过深后若干不为零的梯度相乘导致了梯度消失的现象影响了训练,在残差神经网络中借助其残差结构可以有效的避免梯度消失的问题,在imagenet数据集上取得了优异的结果.SE-Resnet50网络结构,如图4-1所示:图4-1 SE-Resnet50网络结构开发过程创建工程步骤一:安装完成后,点击 ”New Project” 创建新的项目,进入创建工程界面。选择Ascend App项目类别,然后就是常规的命名和路径修改,在CANN Version处点击change配置远程连接和远程CANN地址。图5-1-1 创建项目步骤二:点击CANN Version的Change后进入下界面进行相关配置,点击远程连接配置的最右侧添加按钮。图5-1-2 远程连接配置步骤三:在定义SSH 
配置、保存远程服务器的连接配置后返回Remote CANN Setting界面,继续配置CANN location。加载完后再点击Finish即可完成远程环境配置。图5-1-3 配置CANN location 步骤四:完成远程连接设置后,点击next会进到模板选择界面,由于我们是推理任务,此时我们选择MindX SDK Project(Python),再点击Finish。MindX SDK(昇腾行业SDK),提供面向不同行业使能的开发套件,简化了使用昇腾芯片推理业务开发的过程。SDK提供了以昇腾硬件功能为基础的功能插件,用户可以通过拼接功能插件,快速构建推理业务,也可以开发自定义功能插件。图5-1-4 MindX SDK Project(Python)代码开发代码地址:cid:link_2SDK相关工程目录结构:代码介绍:acc.py:求精度代码,在得到sdk推理结果之后会运行acc.py来求精度,具体使用会在本章节运行测试部分详细展示.data_to_bin.py:数据预处理代码,会将数据集中图片转换为二进制形式保存在路径中,具体使用会在本章节数据准备部分详细展示infer.py:里面包含了Sdk_Api这个类,其中主要使用到的函数为图5-2-1 将cv形式输入转换为sdk输入图5-2-2 得到输出结果config.py:已经写好的配置文件,运行时不需要改动Se_resnet50_ms_test.pipeline:pipeline配置文件,运行时需要修改其中的om文件路径,具体会在运行测试部分详细说明.main.py:推理时所运行的文件,会将所有经过预处理之后的二进制文件通过图5-2-1、5-2-2所示函数,得到推理结果.数据准备及模型准备数据集使用的是imagenet,在infer/sdk/目录下先创建一个文件夹“./dataset”,将910上经过数据预处理的图片保存为二进制放入,具体操作如下:在910服务器上执行文件data_to_bin.py图5-3-1 data_to_bin.py配置参数在文件中将数据集路径如图5-3-1所示改为实际路径之后,运行python data_to_bin.py.运行之后./dataset中会生成images与target两个文件夹,里面分别为图片经过预处理之后保存的二进制文件以及每个图片对应的lebel.图5-3-2 生成的images与target的二进制文件在准备好二进制文件后在910上导出onnx模型文件,保存到infer/sdk目录下。具体操作如下:按照图5-3-3所示位置修改pth路径信息,之后运行python pthtar2onnx.py图5-3-3修改pth路径信息图5-3-4 生成onnx文件运行之后会生成图5-3-4所示onnx文件.模型转换功能介绍用户使用torch框架训练好的第三方模型,在导出onnx模型后,可通过ATC工具将其转换为昇腾AI处理器支持的离线模型(*.om文件),模型转换过程中可以实现算子调度的优化、权重数据重排、内存使用优化等,可以脱离设备完成模型的预处理,详细架构如图5-4-1所示。图5-4-1 ATC工具功能架构在本项目中,要将pytorch框架下训练好的模型(*.onnx文件),转换为昇腾AI处理器支持的离线模型(*.om文件),具体步骤如下:步骤1 点击Ascend > Model Converter,进入模型转换界面,参数配置如图5-4-2所示,若没有CANN Machine,请参见第六章第一节CANN安装。图5-4-2 模型转换界面1各参数解释如下表所示:CANN MachineCANN的远程服务器Model File*.onnx文件的路径(可以在本地,也可以在服务器上)Model Name生成的om模型名字Output Path生成的om模型保存在本地的路径步骤2 点击Next进入图5-4-3界面,该项目数据不需要预处理,直接点击Next,进入图5-4-4界面,再点击Finish开始模型转换。图5-4-3 模型转换界面2图5-4-4 模型转换界面3步骤3 等待出现如图5-4-5所示的提示,模型转换成功图5-4-5模型转换成功运行测试步骤1 修改“sdk/config/SE-resnet50.pipeline”中的参数,具体操作如图5-5-1所示;图5-5-1 修改pipeline中*.om文件路径步骤2 在MindStudio工程界面,依次选择“Run > Edit Configurations...”,进入运行配置页面。选择“Ascend App > 工程名”配置应用工程运行参数,图5-5-2为配置示例。配置完成后,单击“Apply”保存运行配置,单击“OK”,关闭运行配置窗口。图5-5-2 
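Run parameter configuration of the inference project
In this project, the file run for inference is main.py and the run parameters are --img_path [LR_path] --dataset_name images --pipeline_path [pipeline_path]
python3 main.py --img_path "/home/data/xd_mindx/csl/val/" --dataset_name images --pipeline_path "/home/data/xd_mindx/csl/infer/sdk/config/SE-resnet50_test.pipeline"
The parameters are explained below, together with the values I used:
img_path: path of the images to run inference on (my setting: ./val/images/)
pipeline_path: path of the pipeline file (my setting: ./config/Se_resnet50_ms_test.pipeline)
infer_result_dir: path where the inference results are saved (my setting: ./infer_result/images/images/)
Step 3: Click Run. The message shown in Figure 5-5-3 indicates that the run succeeded. The infer_result folder then contains the inference results, saved in binary form.
Figure 5-5-3 Inference run
Step 4: Configure the post-processing program. In the MindStudio project window, choose "Run > Edit Configurations..." to open the run configuration page shown in Figure 5-5-4. Click "+" and select Python (post-processing can be run locally), as shown in Figure 5-5-5.
Figure 5-5-4 Run configuration page
Figure 5-5-5 Run configuration for post-processing
Script path: the path of the file to run.
Parameters: the parameters passed at run time.
As shown in Figure 5-5-5, the file to run is acc.py.
Step 5: Click Run. The message shown in Figure 5-5-6 indicates that the run succeeded.
Figure 5-5-6 Running the post-processing program
Step 6: Result analysis. As Figure 5-5-6 shows, the standard accuracy has been reached.
Problems encountered
If you run into problems while using MindStudio, you can log in to the MindStudio Ascend forum and post your question, where experts will answer it.
Problem: no CANN Machine is available during model conversion
Figure 6-1 CANN management page
Solution: reinstall the CANN Machine with the following steps.
Step 1: Choose File > Settings > Appearance & Behavior > System Settings > CANN to open the CANN management page, as shown in Figure 6-1.
Step 2: Click Change CANN to open the Remote CANN Setting page shown in Figure 6-2, configure CANN again, and click Finish to install CANN.
Figure 6-2 Remote CANN Setting page
Figure 6-3 CANN installation complete
The parameters are explained below:
Remote Connection: the IP address of the remote server.
Remote CANN location: the CANN path on the remote server.
Step 3: When the CANN installation is complete, click OK and restart MindStudio, as shown in Figure 6-3.
Problem: after an application project is imported, MindStudio reports "No Python interpreter configured for the module"
Solution:
Step 1: In the top menu bar, choose File > Project Structure. In the Project Structure window, click Platform Settings > SDKs, click "+" at the top to add a Python SDK, and import Python from the local environment, as shown in Figure 6-4.
Figure 6-4 Importing the Python SDK
Step 2: Click Project Settings > Project and select the Python SDK added in the previous step, as shown in Figure 6-5.
Figure 6-5 Setting the Project SDK
Step 3: Click Project Settings > Modules, select "MyApp", click "+", and choose Python. For Python Interpreter, select the Python SDK added above. Click OK to finish configuring the Python SDK of the application project, as shown in Figure 6-6.
Figure 6-6 Selecting the Python SDK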
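The post describes the pipeline file as JSON that names a stream, its elements, the plugin behind each element, optional properties, and the downstream element. The sketch below builds such a structure in Python and checks that every downstream reference points at a defined element. It is only an illustration under assumptions: the stream name se_resnet50_infer, the element names, and the om path are hypothetical, and the keys (stream_config, factory, props, next, modelPath, dataSource) follow the layout commonly seen in MindX SDK sample pipelines rather than the actual Se_resnet50_ms_test.pipeline from this project, which is not shown in the post.

```python
import json


def build_pipeline(om_path, device_id="0"):
    """Return a minimal pipeline dict: appsrc -> tensor inference -> appsink.

    All names below are illustrative assumptions, not the project's real file.
    """
    return {
        "se_resnet50_infer": {                      # stream name (hypothetical)
            "stream_config": {"deviceId": device_id},
            "appsrc0": {                            # element name
                "factory": "appsrc",                # plugin name
                "next": "mxpi_tensorinfer0",        # downstream element
            },
            "mxpi_tensorinfer0": {
                "props": {                          # element properties
                    "dataSource": "appsrc0",
                    "modelPath": om_path,           # the *.om path edited in step 1
                },
                "factory": "mxpi_tensorinfer",
                "next": "appsink0",
            },
            "appsink0": {"factory": "appsink"},
        }
    }


def check_links(pipeline):
    """List downstream element names that are referenced but never defined."""
    missing = []
    for stream in pipeline.values():
        for element in stream.values():
            downstream = element.get("next")
            if downstream is not None and downstream not in stream:
                missing.append(downstream)
    return missing


pipeline = build_pipeline("./model/se_resnet50.om")  # hypothetical om path
pipeline_text = json.dumps(pipeline, indent=4)
print(pipeline_text)
print("dangling links:", check_links(pipeline))
```

A check like check_links can catch a renamed or deleted element before the file is handed to the Stream manager, which is the kind of mistake that is easy to make when editing the JSON by hand instead of in the visual editor.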
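  • [Other] Notes on "Introduction to Deep Learning" - 20
    A dependent variable usually has one of three data types: quantitative data, binary categorical data, or multi-class categorical data. The choice of output-layer activation function depends mainly on the data type of the dependent variable. The MNIST dataset is commonly used in the machine learning literature. Its dependent variable (0 to 9) is represented as a one-hot code. For example, the one-hot code of the digit 8 is (0 0 0 0 0 0 0 0 1 0) and the one-hot code of the digit 2 is (0 0 1 0 0 0 0 0 0 0). Once the activation function is chosen, the loss function is selected according to the modeling goal.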
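The one-hot codes above can be reproduced with a few lines of Python. This is a minimal sketch; the function name one_hot is chosen here for illustration and does not come from the book.

```python
def one_hot(label, num_classes=10):
    """Return a list with a 1 at position `label` and 0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label must be in [0, {num_classes}), got {label}")
    return [1 if index == label else 0 for index in range(num_classes)]


print(one_hot(8))
print(one_hot(2))
```

Because the digits start at 0, the 1 for digit 8 sits in the ninth position and the 1 for digit 2 in the third, matching the vectors in the note.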
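  • [Other] Notes on "Introduction to Deep Learning" - 18
    Backpropagation (BP, backward propagation) is the method for computing parameter gradients layer by layer in a neural network. I stopped being able to follow a while ago, and this diagram is not even finished. So what are the forward propagation and backpropagation algorithms for? My understanding is that they are used to train a neural network model. Many hidden layers are added in the middle, and the parameters of those hidden layers also have to be adjusted to minimize the loss, so these two algorithms are needed. The purpose of a neural network is to establish a relationship between the input layer and the output layer, and then use that relationship to produce predictions. By adding hidden layers, a neural network can find more complex relationships between the input layer and the output layer. Deep learning uses neural networks with multiple hidden layers. In a neural network, forward propagation produces the prediction, backpropagation produces the parameter gradients, and gradient descent then updates the parameters so that the model error decreases, until we end up with a trained neural network model. As long as the structure of the network is known, the parameter gradients can be computed automatically and the network can be trained. So no matter how complex the structure of a neural network model is, the same established procedure can be used to train it.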
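The three stages in the note (forward propagation for the prediction, backpropagation for the gradients, gradient descent for the update) can be seen in a toy example. The sketch below is not from the book. It assumes the smallest possible network, with one input, one tanh hidden unit, and one linear output, a squared-error loss, and made-up data and starting parameters.

```python
import math


def forward(x, params):
    """Forward propagation: input -> hidden (tanh) -> output (linear)."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    y_hat = w2 * h + b2
    return h, y_hat


def loss(x, y, params):
    """Squared-error loss for one sample: 0.5 * (y_hat - y)^2."""
    _, y_hat = forward(x, params)
    return 0.5 * (y_hat - y) ** 2


def backward(x, y, params):
    """Backpropagation: apply the chain rule from the output back to the input."""
    w1, b1, w2, b2 = params
    h, y_hat = forward(x, params)
    d_y = y_hat - y                 # dL/dy_hat
    g_w2, g_b2 = d_y * h, d_y       # output-layer gradients
    d_z = d_y * w2 * (1.0 - h * h)  # through tanh: d tanh(z)/dz = 1 - tanh(z)^2
    g_w1, g_b1 = d_z * x, d_z       # hidden-layer gradients
    return [g_w1, g_b1, g_w2, g_b2]


def total_loss(data, params):
    return sum(loss(x, y, params) for x, y in data)


def train(data, params, lr=0.1, epochs=200):
    """Gradient descent: repeatedly step each parameter against its gradient."""
    params = list(params)
    for _ in range(epochs):
        for x, y in data:
            grads = backward(x, y, params)
            params = [p - lr * g for p, g in zip(params, grads)]
    return params


data = [(-1.0, -0.5), (0.0, 0.0), (1.0, 0.5)]  # made-up samples
init = [0.5, 0.0, 0.5, 0.0]                    # made-up starting parameters
trained = train(data, init)
print("loss before:", total_loss(data, init))
print("loss after: ", total_loss(data, trained))
```

The backward function only needs the structure of the network, which is the point made in the note: with more layers the chain rule has more factors, but the procedure stays the same.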
  • [Help Request] Model migration pb2om
    The run result is shown above, and this is the log.
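  • [Other] A brief look at the history of artificial neural networks
    The artificial neural network (ANN) has been a research hotspot in artificial intelligence since the 1980s. It abstracts the neuronal network of the human brain from an information-processing perspective, builds a simple model of it, and forms different networks through different connection schemes. In engineering and academia it is often simply called a neural network or neural-like network. A neural network is a computational model made up of a large number of interconnected nodes (also called neurons). Each node represents a specific output function, called the activation function. Each connection between two nodes carries a weighting value for the signal passing through it, called the weight, which acts as the memory of the artificial neural network. The output of the network depends on its connection scheme, its weights, and its activation functions. The network itself is usually an approximation of some algorithm or function found in nature, or an expression of a logical strategy. Over the past decade or more, research on artificial neural networks has deepened and made great progress. In pattern recognition, intelligent robotics, automatic control, prediction and estimation, biology, medicine, economics, and other fields, it has solved many practical problems that are hard for conventional computing to solve and has shown good intelligent behavior.
    Development history
    In 1943, the psychologist W. S. McCulloch and the mathematical logician W. Pitts established a neural network and its mathematical model, known as the MP model. With the MP model they proposed a formal mathematical description of the neuron and a method for structuring networks, and proved that a single neuron can perform logical functions, which opened the era of artificial neural network research. In 1949, the psychologist D. O. Hebb proposed that the strength of synaptic connections is variable. In the 1960s, artificial neural networks developed further and more complete models were proposed, including the perceptron and the adaptive linear element. After carefully analyzing the capabilities and limitations of neural network systems typified by the perceptron, M. Minsky and colleagues published the book "Perceptrons" in 1969, pointing out that the perceptron cannot solve higher-order predicate problems. Their arguments strongly influenced neural network research. In addition, the achievements of serial computers and artificial intelligence at the time obscured the need for, and urgency of, new kinds of computers and new approaches to artificial intelligence, so research on artificial neural networks entered a low period. During this period, some researchers kept working on the subject and proposed adaptive resonance theory (ART networks), self-organizing maps, and the cognitron network, and also studied the mathematical theory of neural networks. This work laid the foundation for the later research and development of neural networks.
    In 1982, J. J. Hopfield, a physicist at the California Institute of Technology, proposed the Hopfield neural network model, introduced the concept of "computational energy", and gave a criterion for network stability. In 1984, he proposed the continuous-time Hopfield neural network model, which was pioneering work for neural computers, opened a new route to using neural networks for associative memory and optimization, and strongly advanced neural network research. In 1985, other researchers proposed the Boltzmann model, which uses simulated annealing from statistical thermodynamics during learning so that the whole system tends toward a globally stable point. In 1986, research on the microstructure of cognition led to the theory of parallel distributed processing. In the same year, Rumelhart, Hinton, and Williams developed the BP algorithm, and Rumelhart and McClelland published "Parallel Distributed Processing: Explorations in the Microstructure of Cognition". The BP algorithm has since been used to solve a large number of practical problems. In 1988, Linsker proposed a new self-organization theory for perceptron networks and, building on Shannon's information theory, formed the maximum mutual information theory, which sparked information-theoretic applications based on neural networks. Also in 1988, Broomhead and Lowe proposed a design method for layered networks using radial basis functions (RBF), linking neural network design with numerical analysis and linear adaptive filtering. In the early 1990s, Vapnik and colleagues proposed support vector machines (SVM), building on the concept of the VC (Vapnik-Chervonenkis) dimension.
    Research on artificial neural networks has received attention from developed countries. The United States Congress passed a resolution designating the decade beginning January 1, 1990 as the "Decade of the Brain", and international research organizations called on their member states to make the "Decade of the Brain" a global effort. In Japan's "Real World Computing (RWC)" project, artificial intelligence research became an important component.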
  • [Execution Issue] Reproducing MindSpore-based yolov5 on ModelArts: input_shape problem
    Traceback (most recent call last):
      File "train.py", line 253, in <module>
        run_train()
      File "/home/ma-user/work/yolov5/model_utils/moxing_adapter.py", line 105, in wrapped_func
        run_func(*args, **kwargs)
      File "train.py", line 225, in run_train
        batch_gt_box2, input_shape)
      File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/mindspore/nn/cell.py", line 404, in __call__
        out = self.compile_and_run(*inputs)
      File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/mindspore/nn/cell.py", line 682, in compile_and_run
        self.compile(*inputs)
      File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/mindspore/nn/cell.py", line 669, in compile
        _cell_graph_executor.compile(self, *inputs, phase=self.phase, auto_parallel_mode=self._auto_parallel_mode)
      File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/mindspore/common/api.py", line 548, in compile
        result = self._graph_executor.compile(obj, args_list, phase, use_vm, self.queue_name)
      File "/home/ma-user/anaconda3/envs/MindSpore/lib/python3.7/site-packages/mindspore/ops/operations/array_ops.py", line 534, in __infer__
        raise ValueError(f"For '{self.name}', the shape of 'input_x' is {x_shp}, "
    ValueError: For 'Reshape', the shape of 'input_x' is [32, 505, 20, 20], the value of 'input_shape' value is [32, 3, 505, 20, 20]. The product of the shape of 'input_x' should be equal to product of 'input_shape', but product of the shape of 'input_x' is 6464000, product of 'input_shape' is 19392000.
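    The two products in the error message can be checked by hand, and doing so hints at where to look. The sketch below is only a reading of the message, not a confirmed diagnosis: it assumes the usual YOLO head layout in which each of 3 anchors predicts 5 box values plus one score per class, so that 505 would correspond to 500 classes. The helper names product and head_channels are introduced here for illustration.

```python
def product(shape):
    """Number of elements in a tensor with the given shape."""
    result = 1
    for dim in shape:
        result *= dim
    return result


def head_channels(num_anchors, num_classes):
    """Channels a YOLO-style head is assumed to output per grid cell:
    (x, y, w, h, objectness) plus class scores, for every anchor."""
    return num_anchors * (num_classes + 5)


input_x = [32, 505, 20, 20]          # tensor that reaches Reshape
target_shape = [32, 3, 505, 20, 20]  # shape Reshape is asked to produce

print(product(input_x), product(target_shape))
print("ratio:", product(target_shape) // product(input_x))
print("channels expected for 3 anchors and 500 classes:", head_channels(3, 500))
```

    The target shape has exactly 3 times as many elements as the input, so under this assumption the head would need 1515 output channels where it currently produces 505. A mismatch of that kind typically appears when the class count is changed in the loss or dataset settings but not in the layer that sets the detection head's output channels, so that is a reasonable first place to check.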