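• [Technical Tips] Custom Transformer Implementation in TensorFlow (IMDB Sentiment Classification)
    The Transformer is one of today's most advanced deep learning architectures and is widely used in both natural language processing and computer vision. It has largely replaced the earlier recurrent neural networks (RNN and LSTM) and is the basis for well-known architectures such as BERT and GPT-3. This post shows how to implement Transformer multi-head self-attention from scratch with native TensorFlow APIs and validates the network on the IMDB dataset. The model's training results are as follows: A minimal TensorFlow implementation is available in the Notebook: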
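    The multi-head self-attention computation mentioned above can be sketched as follows. This is a minimal illustration in plain Python rather than the post's TensorFlow code (which is not shown in this listing); the function names `matmul`, `softmax`, and `multi_head_self_attention`, and the weight arguments `wq`, `wk`, `wv`, `wo`, are assumptions made for the example.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_self_attention(x, wq, wk, wv, wo, num_heads):
    """x has shape (seq_len, d_model); each weight has shape (d_model, d_model)."""
    d_model = len(x[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    depth = d_model // num_heads

    # Linear projections to queries, keys, and values.
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)

    head_outputs = []
    for h in range(num_heads):
        cols = slice(h * depth, (h + 1) * depth)
        qh = [row[cols] for row in q]
        kh = [row[cols] for row in k]
        vh = [row[cols] for row in v]
        # Scaled dot-product attention: softmax(Q K^T / sqrt(depth)) V
        kh_t = [list(c) for c in zip(*kh)]
        scores = matmul(qh, kh_t)
        weights = [softmax([s / math.sqrt(depth) for s in row]) for row in scores]
        head_outputs.append(matmul(weights, vh))

    # Concatenate the heads along the feature axis, then apply the output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(x))]
    return matmul(concat, wo)
```

    In a TensorFlow version, the same steps would typically be expressed with batched tensor operations instead of Python loops; the arithmetic per head is unchanged.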
  • [Course Notes] Tensor Study Summary
    Using a generator to read image data directly:
        import tensorflow as tf
        from tensorflow.keras.preprocessing.image import ImageDataGenerator

        # Scale pixel values from [0, 255] to [0, 1]
        train_generator = ImageDataGenerator(rescale=1./255)
        test_generator = ImageDataGenerator(rescale=1./255)
        train_data = train_generator.flow_from_directory(
            "./cats_and_dogs_filtered/train", batch_size=20, target_size=(64, 64),
            shuffle=True, class_mode='binary')
        test_data = test_generator.flow_from_directory(
            "./cats_and_dogs_filtered/validation", batch_size=1000, target_size=(64, 64),
            shuffle=False, class_mode='binary')
    Creating the model:
        model = tf.keras.models.Sequential([
            tf.keras.layers.Conv2D(64, 3, activation='relu', input_shape=(64, 64, 3)),
            tf.keras.layers.MaxPooling2D(3, 3),
            tf.keras.layers.Conv2D(128, 3, activation='relu'),
            tf.keras.layers.MaxPooling2D(3, 3),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(512, activation='relu', kernel_initializer='random_normal',
                                  kernel_regularizer=tf.keras.regularizers.l2(0.01)),
            tf.keras.layers.Dense(256, activation='relu', kernel_initializer='random_normal',
                                  kernel_regularizer=tf.keras.regularizers.l2(0.01)),
            tf.keras.layers.Dense(1, activation='sigmoid', kernel_initializer='random_normal',
                                  kernel_regularizer=tf.keras.regularizers.l2(0.01)),
        ])
        model.summary()
        # Lowercase "accuracy" lets Keras pick binary accuracy for the sigmoid output;
        # the capitalized "Accuracy" metric compares predictions to labels for exact equality.
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
                      loss=tf.keras.losses.BinaryCrossentropy(),
                      metrics=["accuracy"])
        # fit_generator is deprecated in newer TensorFlow releases in favor of model.fit.
        model.fit_generator(train_data, steps_per_epoch=100, epochs=200,
                            validation_data=test_data, validation_steps=1)
    Reading MNIST data:
        from mlxtend.data import loadlocal_mnist

        x_test, y_test = loadlocal_mnist(
            images_path='D:\\Python\\MNIST_data\\t10k-images.idx3-ubyte',
            labels_path='D:\\Python\\MNIST_data\\t10k-labels.idx1-ubyte')
        x_train, y_train = loadlocal_mnist(
            images_path='D:\\Python\\MNIST_data\\train-images.idx3-ubyte',
            labels_path='D:\\Python\\MNIST_data\\train-labels.idx1-ubyte')
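    To see what `model.summary()` would report for the convolutional stack above, the spatial sizes can be traced by hand. The sketch below assumes the Keras defaults (Conv2D with stride 1 and 'valid' padding, and MaxPooling2D(3, 3) meaning pool size 3 with stride 3); the helper names are made up for this example.

```python
def conv2d_valid(size, kernel):
    """Output size of a stride-1 'valid' convolution along one spatial axis."""
    return size - kernel + 1


def max_pool_valid(size, pool, stride):
    """Output size of 'valid' max pooling along one spatial axis."""
    return (size - pool) // stride + 1


def trace_shapes(input_size=64):
    """Follow a 64x64 input through Conv(3) -> Pool(3,3) -> Conv(3) -> Pool(3,3)."""
    s = conv2d_valid(input_size, 3)   # Conv2D(64, 3)
    s = max_pool_valid(s, 3, 3)       # MaxPooling2D(3, 3)
    s = conv2d_valid(s, 3)            # Conv2D(128, 3)
    s = max_pool_valid(s, 3, 3)       # MaxPooling2D(3, 3)
    flat = s * s * 128                # Flatten over the 128 output channels
    dense_params = flat * 512 + 512   # weights plus biases of Dense(512)
    return s, flat, dense_params


final_size, flat_features, dense_params = trace_shapes()
print(final_size, flat_features, dense_params)
```

    Under these assumptions most of the model's parameters sit in the first Dense layer, which is where the L2 regularizer in the post has the largest effect.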
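  • [Dev Environment] The preset TF image in Notebook fails to use the Ascend 910
    Image: tensorflow1.15-cann5.1.0-py3.7-euler2.8.3
    Flavor: Ascend: 1*Ascend910 | ARM: 24 cores, 96GB
    Following "Step 1: Copy the model package in Notebook" (ModelArts AI development platform, Image management, Creating an AI application from a custom image (inference deployment), Debugging directly in the development environment and saving the image for inference without building; huaweicloud.com), I debugged my own TensorFlow model in SavedModel format. After running run.sh, the model is loaded into memory, but npu-smi info shows no usage at all: HBM is at 0% and the 910 is not being used. What is going wrong here, and in which direction should I debug?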
  • [Help Wanted] Loop OP is not supported
    pb-to-rknn script: https://developer.huaweicloud.com/develop/aigallery/notebook/detail?id=a0127cce-28ac-49e5-911a-d2f82173da95
        inference_model/frozen_inference_graph.pb
        W load_tensorflow: Catch exception when loading tensorflow model: inference_model/frozen_inference_graph.pb!
        W load_tensorflow: Make sure that the tensorflow version of 'inference_model/frozen_inference_graph.pb' is consistent with the installed tensorflow version '2.6.2'!
        E load_tensorflow: Traceback (most recent call last):
        E load_tensorflow: File "rknn/api/rknn_base.py", line 1042, in rknn.api.rknn_base.RKNNBase.load_tensorflow
        E load_tensorflow: File "rknn/api/rknn_base.py", line 575, in rknn.api.rknn_base.RKNNBase._create_ir_and_inputs_meta
        E load_tensorflow: File "rknn/api/ir_graph.py", line 44, in rknn.api.ir_graph.IRGraph.__init__
        E load_tensorflow: File "rknn/api/ir_graph.py", line 343, in rknn.api.ir_graph.IRGraph.rebuild
        E load_tensorflow: File "rknn/api/ir_graph.py", line 185, in rknn.api.ir_graph.IRGraph._clean_model
        E load_tensorflow: File "rknn/api/ir_graph.py", line 93, in rknn.api.ir_graph.IRGraph.infer_shapes
        E load_tensorflow: File "/home/ma-user/anaconda3/envs/py36/lib/python3.6/site-packages/onnx/checker.py", line 104, in check_model
        E load_tensorflow: C.check_model(protobuf_string)
        E load_tensorflow: onnx.onnx_cpp2py_export.checker.ValidationError: Nodes in a graph must be topologically sorted, however input 'Preprocessor/map/while/ResizeImage/stack_1:0' of node:
        E load_tensorflow: name: sub_graph_ending_node_Identity__20 OpType: Identity
        E load_tensorflow: is not output of any previous nodes.
        E load_tensorflow: ==> Context: Bad node spec for node. Name: generic_loop_Loop__33 OpType: Loop
    onnx2tflite error:
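    The ValidationError in the log says that a node consumes a tensor that no earlier node produces. A minimal sketch of that kind of check is shown below; it is not the ONNX checker's actual code, and the dictionary layout (`name`, `inputs`, `outputs`) and the function name `find_unsorted_inputs` are assumptions for illustration.

```python
def find_unsorted_inputs(nodes, graph_inputs=()):
    """Return (node_name, input_name) pairs whose input is not available yet.

    `nodes` is a list of dicts with "name", "inputs", and "outputs" keys,
    listed in the order they appear in the graph (hypothetical layout).
    """
    available = set(graph_inputs)
    problems = []
    for node in nodes:
        for inp in node["inputs"]:
            if inp not in available:
                problems.append((node["name"], inp))
        # Outputs only become visible to nodes that come later in the list.
        available.update(node["outputs"])
    return problems


# A loop body that reads a tensor defined outside its own subgraph.
loop_body = [
    {"name": "sub_graph_ending_node_Identity__20",
     "inputs": ["Preprocessor/map/while/ResizeImage/stack_1:0"],
     "outputs": ["identity_out:0"]},
]
print(find_unsorted_inputs(loop_body))
```

    Running a check like this on an exported graph can help locate which tensor the loop subgraph expects from its outer scope.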
  • [Other Questions] Cannot convert a frozen .pb model file to .tflite
    Since converting the .pb directly to rknn failed, I tried converting it to .tflite first and ran into the following problem:
        F tensorflow/lite/toco/tooling_util.cc:2277] Check failed: array.data_type == array.final_data_type Array "image_tensor" has mis-matching actual and final data types (data_type=uint8, final_data_type=float).
        Fatal Python error: Aborted
    Conversion script: https://developer.huaweicloud.com/develop/aigallery/notebook/detail?id=a6ab86f1-8767-41ee-b1a2-28e58f62c43d
    Runtime environment: tensorflow 1.15
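  • atc conversion error
    I installed the runtime environment following the documented procedure; the error is shown in the screenshot.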
  • [Technical Tips] ACGAN: Automatic Anime Avatar Generation
    ACGAN paper: Conditional Image Synthesis with Auxiliary Classifier GANs
    Applying a labeled dataset to generative adversarial networks can strengthen existing generative models, and this has led to two lines of optimization. cGAN augments the original GAN with auxiliary label information, training both the generator and the discriminator with label data so that the model can produce data for a specified condition. SGAN exploits auxiliary label information (a small number of labels) by reconstructing the label information at the tail end of the discriminator or classifier. ACGAN combines both ideas to optimize the GAN.
    ACGAN objective: the generator takes two inputs, the class label c and random noise z, and produces the generated data $X_{fake} = G(c, z)$. The discriminator has to output both a probability distribution over the data source (real or generated), $P(S \mid X)$, and a probability distribution over the class labels, $P(C \mid X)$. The ACGAN objective has two parts. The first part, $L_S = E[\log P(S = real \mid X_{real})] + E[\log P(S = fake \mid X_{fake})]$, is the cost for whether the data is real. The second part, $L_C = E[\log P(C = c \mid X_{real})] + E[\log P(C = c \mid X_{fake})]$, is the cost for classification accuracy. During optimization, the discriminator D is trained to maximize $L_S + L_C$, while the generator G is trained to maximize $L_C - L_S$. In short, the discriminator should separate real data from generated data as well as possible while classifying the data correctly, and the generator should produce data that is taken to be real as often as possible and that can still be classified correctly.
    1. Framework used in this case: TensorFlow 1.13.1
    2. Hardware used in this case: GPU: 1*NVIDIA-V100NV32(32GB) | CPU: 8 cores, 64GB
    3. How to run the code: click the triangular run button in the menu bar at the top of this page, or press Ctrl+Enter, to run the code in each cell
    4. Detailed JupyterLab usage: see the "ModelArts JupyterLab User Guide"
    5. If you run into problems: see "ModelArts JupyterLab FAQ and Solutions"
    1. Download the model and code
        import os
        !wget https://obs-aigallery-zc.obs.cn-north-4.myhuaweicloud.com/algorithm/ACGAN.zip
        # Unzip
        os.system('unzip ACGAN.zip -d ./')
    2. Model training
    2.1 Load the dependencies
        root_path = './ACGAN/'
        os.chdir(root_path)
        import os
        from main import main
        from ACGAN import ACGAN
        from tools import checkFolder
        import tensorflow as tf
        import argparse
        import numpy as np
    2.2 Set the parameters
        def parse_args():
            note = "ACGAN Frame Constructed With Tensorflow"
            parser = argparse.ArgumentParser(description=note)
            parser.add_argument("--epoch", type=int, default=251, help="number of training epochs")
            parser.add_argument("--batchSize", type=int, default=64, help="batch size")
            parser.add_argument("--codeSize", type=int, default=62, help="dimension of the input code vector")
            parser.add_argument("--checkpointDir", type=str, default="./checkpoint", help="checkpoint directory")
            parser.add_argument("--resultDir", type=str, default="./result", help="directory for intermediate results generated during training")
            parser.add_argument("--logDir", type=str, default="./log", help="training log directory")
            parser.add_argument("--mode", type=str, default="train", help="mode: train / infer")
            parser.add_argument("--hairStyle", type=str, default="orange hair", help="hair color of the anime avatar to generate")
            parser.add_argument("--eyeStyle", type=str, default="gray eyes", help="eye color of the anime avatar to generate")
            parser.add_argument("--dataSource", type=str, default='./extra_data/images/', help="training set path")
            # parse_known_args ignores the extra arguments that Jupyter passes in
            args, unknown = parser.parse_known_args()
            checkFolder(args.checkpointDir)
            checkFolder(args.resultDir)
            checkFolder(args.logDir)
            assert args.epoch >= 1
            assert args.batchSize >= 1
            assert args.codeSize >= 1
            return args
        args = parse_args()
    2.3 Start training
        with tf.Session() as sess:
            myGAN = ACGAN(sess, args.epoch, args.batchSize, args.codeSize,
                          args.dataSource, args.checkpointDir, args.resultDir, args.logDir, args.mode,
                          64, 64, 3)
            if myGAN is None:
                print("Failed to create the GAN network")
                exit(0)
            if args.mode == 'train':
                myGAN.buildNet()
                print("Entering training mode")
                myGAN.train()
                print("Done")
    Output:
        Loading the dataset!
        images.shape: (3000, 64, 64, 3)
        labels.shape: (3000, 23)
        Loading images to numpy array...
        Random shuffling images and labels...
        [Tip 1] Normalize the images between -1 and 1.
        Dataset loaded successfully!
        numOfBatches : 46
        Instantiating the network:
        WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/ops/tensor_array_ops.py:162: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
        Instructions for updating:
        Colocations handled automatically by placer.
        WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py:1624: flatten (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
        Instructions for updating:
        Use keras.layers.flatten instead.
        WARNING:tensorflow:From /home/ma-user/work/ACGAN/ACGAN.py:167: batch_normalization (from tensorflow.python.layers.normalization) is deprecated and will be removed in a future version.
        Instructions for updating:
        Use keras.layers.batch_normalization instead.
        Built Loss for Discriminator
        Built Loss for Generator
        # size of dVars : 55
        # size of gVars : 148
        WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
        Instructions for updating:
        Use tf.cast instead.
        Optimizer built
        Predictor built
        Network instantiated successfully!
        Entering training mode
        Configuring the training environment!
        Model will be loaded from : ./checkpoint/ACGAN
        WARNING:tensorflow:From /home/ma-user/anaconda3/envs/TensorFlow-1.13.1/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
        Instructions for updating:
        Use standard file APIs to check for files with this prefix.
        INFO:tensorflow:Restoring parameters from ./checkpoint/ACGAN/ACGAN.model-251
        MODEL NAME : ACGAN.model-251
        Model loaded successfully : ACGAN.model-251
        Load succeeded
        Preview of the generated results
        Training started!
        ~~~~~~~~~~~~~~~~~~~~~~~~
        251 251
        Done
    3. Test the model
    Switch the mode parameter from training to inference:
        args.mode = 'infer'
    Choose the hair and eyes of the avatar you want to generate from the labels; only values from these two lists are allowed:
        hair_dict = ['orange hair', 'white hair', 'aqua hair', 'gray hair', 'green hair', 'red hair',
                     'purple hair', 'pink hair', 'blue hair', 'black hair', 'brown hair', 'blonde hair']
        eye_dict = ['gray eyes', 'black eyes', 'orange eyes', 'pink eyes', 'yellow eyes', 'aqua eyes',
                    'purple eyes', 'green eyes', 'brown eyes', 'red eyes', 'blue eyes']
        # Orange hair and gray eyes are selected
        args.hairStyle = 'orange hair'
        args.eyeStyle = 'gray eyes'
    Build the predictor:
        tf.reset_default_graph()
        with tf.Session() as sess:
            myGAN1 = ACGAN(sess, args.epoch, args.batchSize, args.codeSize,
                           args.dataSource, args.checkpointDir, args.resultDir, args.logDir, args.mode,
                           64, 64, 3)
            if myGAN1 is None:
                print("Failed to create the GAN network")
                exit(0)
            if args.mode == 'infer':
                myGAN1.buildForInfer()
                tag_dict = ['orange hair', 'white hair', 'aqua hair', 'gray hair', 'green hair', 'red hair',
                            'purple hair', 'pink hair', 'blue hair', 'black hair', 'brown hair', 'blonde hair',
                            'gray eyes', 'black eyes', 'orange eyes', 'pink eyes', 'yellow eyes',
                            'aqua eyes', 'purple eyes', 'green eyes', 'brown eyes', 'red eyes', 'blue eyes']
                # One row per sample in the batch, one column per tag
                tag = np.zeros((64, 23))
                feature = args.hairStyle + " AND " + args.eyeStyle
                # Set the selected hair and eye tags for the first 25 samples
                for j in range(25):
                    for i in range(len(tag_dict)):
                        if tag_dict[i] in feature:
                            tag[j][i] = 1
                myGAN1.infer(tag, feature)
                print("Generate : " + feature)
    Output:
        Model will be loaded from : ./checkpoint/ACGAN
        INFO:tensorflow:Restoring parameters from ./checkpoint/ACGAN/ACGAN.model-251
        MODEL NAME : ACGAN.model-251
        Model loaded successfully : ACGAN.model-251
        Predictor built
        Generate : orange hair AND gray eyes
    Now generate anime avatars with orange hair and gray eyes. Note that the model sometimes fails to generate a correct avatar.
        import matplotlib.pyplot as plt
        from PIL import Image
        feature = args.hairStyle + " AND " + args.eyeStyle
        resultPath = './samples/' + feature + '.png'  # path where the result was saved
        img = Image.open(resultPath).convert('RGB')
        plt.figure(1)
        plt.imshow(img)
        plt.show()
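    The condition vector that the inference step feeds to the generator is a 23-dimensional multi-hot vector: one position for the chosen hair tag and one for the chosen eye tag. The sketch below builds a single such row in plain Python. It uses exact tag matching instead of the substring test in the post, and the names `HAIR_TAGS`, `EYE_TAGS`, and `build_tag_vector` are assumptions made for this example.

```python
HAIR_TAGS = ['orange hair', 'white hair', 'aqua hair', 'gray hair', 'green hair', 'red hair',
             'purple hair', 'pink hair', 'blue hair', 'black hair', 'brown hair', 'blonde hair']
EYE_TAGS = ['gray eyes', 'black eyes', 'orange eyes', 'pink eyes', 'yellow eyes', 'aqua eyes',
            'purple eyes', 'green eyes', 'brown eyes', 'red eyes', 'blue eyes']
ALL_TAGS = HAIR_TAGS + EYE_TAGS  # 12 hair tags followed by 11 eye tags


def build_tag_vector(hair, eye):
    """Return a multi-hot list with 1.0 at the positions of the chosen hair and eye tags."""
    if hair not in HAIR_TAGS:
        raise ValueError("unknown hair tag: " + hair)
    if eye not in EYE_TAGS:
        raise ValueError("unknown eye tag: " + eye)
    return [1.0 if tag in (hair, eye) else 0.0 for tag in ALL_TAGS]


vector = build_tag_vector('orange hair', 'gray eyes')
print(len(vector), sum(vector))
```

    Validating the tags up front gives an explicit error for a misspelled label instead of silently producing an all-zero condition.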
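  • [Other Questions] How do I install TFplugin?
    I downloaded the community edition x86_64 installation package but don't know how to install it. I'm on Windows and have little experience with the command line. Thanks, everyone!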
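  • [Help Wanted] Custom operator development (TIK): getting the concrete value inside a tensor
    For example, given a = Tensor([1]), I want b = 1, that is, to take the single value out of a one-element Tensor variable and output it as a plain number. How can this be done with a function when developing a custom operator in TIK? In PyTorch it can be obtained with a.item(), but TIK does not seem to have an item() function.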
  • [Help Wanted] Running the UT for the add operator fails with "no modules named _ctypes"
    MindStudio version: MindStudio 5.0.RC2
    Installation environment: Win10 / Linux
    CANN version: CANN 5.1.RC2
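  • [Network Migration] Migrating TensorFlow to Ascend processors offers an auto-tune option; can it improve accuracy?
    The "AutoTune" tutorial is at https://www.hiascend.com/document/detail/zh/canncommercial/51RC2/modeldev/tfmigr1/tfmigr_mprtg_0031.html
    The documentation says it improves performance by optimizing operator scheduling. In practice, I found the performance gain limited but the accuracy gain considerable.
        Disabled: batch 1100 | examples/s: 25.52 | loss: 1.13622 | time elapsed: 0.15h | time left: 24.25h
        Enabled:  batch 1100 | examples/s: 30.52 | loss: 1.13622 | time elapsed: 0.15h | time left: 24.25h
    Questions:
    1. Is Huawei Ascend's AutoTune also based on TVM's Auto Tune?
    2. Does Auto Tune optimization improve accuracy to some extent?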
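    The two log lines quoted in the post can be compared directly. The sketch below parses the throughput and loss fields and computes the relative change; the function name `parse_log` and the constant names are assumptions for this example, and the numbers come only from the two lines above.

```python
import re

LOG_DISABLED = "batch 1100 | examples/s: 25.52 | loss: 1.13622 | time elapsed: 0.15h | time left: 24.25h"
LOG_ENABLED = "batch 1100 | examples/s: 30.52 | loss: 1.13622 | time elapsed: 0.15h | time left: 24.25h"


def parse_log(line):
    """Extract (examples per second, loss) from one training log line."""
    throughput = float(re.search(r"examples/s:\s*([\d.]+)", line).group(1))
    loss = float(re.search(r"loss:\s*([\d.]+)", line).group(1))
    return throughput, loss


off_speed, off_loss = parse_log(LOG_DISABLED)
on_speed, on_loss = parse_log(LOG_ENABLED)

speedup = on_speed / off_speed - 1.0   # relative throughput gain
loss_delta = on_loss - off_loss        # difference in reported loss

print("throughput gain: {:.1%}".format(speedup))
print("loss difference: {:.5f}".format(loss_delta))
```

    On these two lines the throughput differs while the reported loss is identical, so any accuracy comparison would need additional evidence, such as evaluation metrics from both runs.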
  • [Help Wanted] Does the tf_adapter_2.x plugin in Ascend's official Gitee repository only support version 2.6.5?
    How was this problem triggered?
    I tried to build a Docker image with TensorFlow 2.7.4, using the plugin code from tf_adapter_2.x.
    Steps to reproduce
    Build the Docker image with this Dockerfile:
        FROM ubuntu:18.04
        ARG HOST_ASCEND_BASE=/usr/local/Ascend
        ARG NNAE_PATH=/usr/local/Ascend/nnae/latest
        ARG INSTALL_ASCEND_PKGS_SH=install_ascend_pkgs.sh
        WORKDIR /tmp
        # Update the package sources
        RUN apt update && \
            apt install -y --no-install-recommends ca-certificates wget && \
            cp -a /etc/apt/sources.list /etc/apt/sources.list.bak && \
            wget --no-check-certificate -O /etc/apt/sources.list https://repo.huaweicloud.com/repository/conf/Ubuntu-Ports-bionic.list && \
            apt update && \
            apt upgrade -y
        # Install system packages
        RUN apt install -y --no-install-recommends autoconf automake dos2unix g++ libbz2-dev libssl-dev libtool libxml2 make pciutils unzip vim wget xz-utils zip \
            bzip2 libblas3 libffi-dev libfreetype6-dev libgl1-mesa-glx liblapack3 liblzma-dev libopenblas-dev libpng-dev numactl pkg-config zlib1g zlib1g-dev \
            ca-certificates curl cython3 gcc gfortran htop less libblas-dev libgmpxx4ldbl libhdf5-dev libicu60 libxml2-dev libxslt-dev openssl python3-h5py sudo swig \
            gcc git htop inetutils-ping openssh-server ssh tmux \
            build-essential openjdk-11-jdk zip unzip && \
            apt clean && \
            rm -rf /var/lib/apt/lists/*
        ENV LD_LIBRARY_PATH=/usr/local/gcc7.3.0/lib64:${LD_LIBRARY_PATH}
        # Install cmake
        COPY cmake-3.15.7.tar.gz ./
        RUN tar -zxf cmake-3.15.7.tar.gz && \
            cd cmake-3.15.7 && \
            ./bootstrap && \
            make -j 96 && \
            make install && \
            ln -s /usr/local/cmake/bin/cmake /usr/bin/cmake
        # Install Python and pip
        COPY Python-3.7.5.tar.xz ./
        RUN tar -xf Python-3.7.5.tar.xz && \
            cd Python-3.7.5 && \
            ./configure --prefix=/usr/local/python3.7.5 --enable-shared && \
            make -j 96 && \
            make install && \
            ln -sf /usr/local/python3.7.5/bin/python3 /usr/bin/python && \
            ln -sf /usr/local/python3.7.5/bin/python3 /usr/bin/python3 && \
            ln -sf /usr/local/python3.7.5/bin/python3 /usr/local/bin/python && \
            ln -sf /usr/local/python3.7.5/bin/python3 /usr/local/bin/python3 && \
            ln -sf /usr/local/python3.7.5/bin/pip3 /usr/bin/pip && \
            ln -sf /usr/local/python3.7.5/bin/pip3 /usr/bin/pip3 && \
            cd .. && \
            rm -rf Python*
        ENV LD_LIBRARY_PATH=/usr/local/python3.7.5/lib:$LD_LIBRARY_PATH
        ENV PATH=/usr/local/python3.7.5/bin:$PATH
        # Configure the pip index
        RUN mkdir -p ~/.pip && \
            echo '[global]\n\
        index-url = https://repo.huaweicloud.com/repository/pypi/simple/\n\
        trusted-host = repo.huaweicloud.com\n\
        timeout = 120' >> ~/.pip/pip.conf
        # Install Bazel
        RUN git config --global http.proxy socks5://172.168.3.3:7891 && \
            git config --global https.proxy socks5://172.168.3.3:7891
        ENV http_proxy=http://172.168.3.3:7890
        ENV https_proxy=https://172.168.3.3:7890
        COPY bazel-3.7.2-dist.zip ./
        RUN mkdir bazel-3.7.2-dist && \
            unzip -q bazel-3.7.2-dist.zip -d bazel-3.7.2-dist && \
            cd bazel-3.7.2-dist && \
            env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh && \
            cp output/bazel /usr/local/bin
        # Install Python packages
        RUN pip3 install -U pip && \
            pip3 install wheel && \
            pip3 install setuptools && \
            pip3 install matplotlib && \
            pip3 install opencv-python && \
            pip3 install sklearn && \
            pip3 install pandas && \
            pip3 install pycocotools && \
            pip3 install tables && \
            pip3 install mmcv && \
            pip3 install lxml && \
            pip3 install easydict && \
            pip3 install jupyter && \
            pip3 install jupyterlab && \
            pip3 install backports.lzma && \
            pip3 install keras-preprocessing && \
            pip3 install six && \
            pip3 install Cython && \
            pip3 install h5py==2.8.0 && \
            rm -rf /root/.cache/pip
        # Copy the related files
        COPY . ./
        # Ascend packages
        RUN bash $INSTALL_ASCEND_PKGS_SH
        # Environment variables
        ENV GLOG_v=2
        ENV TBE_IMPL_PATH=$NNAE_PATH/opp/op_impl/built-in/ai_core/tbe
        ENV FWK_PYTHON_PATH=$NNAE_PATH/fwkacllib/python/site-packages
        ENV PATH=$NNAE_PATH/fwkacllib/ccec_compiler/bin/:$PATH
        ENV ASCEND_OPP_PATH=$NNAE_PATH/opp
        ENV PYTHONPATH=$HOST_ASCEND_BASE/tfplugin/latest/tfplugin/python/site-packages:\
        $FWK_PYTHON_PATH:\
        $FWK_PYTHON_PATH/auto_tune.egg:\
        $FWK_PYTHON_PATH/schedule_search.egg:\
        $TBE_IMPL_PATH:\
        $PYTHONPATH
        ENV LD_LIBRARY_PATH=$NNAE_PATH/fwkacllib/lib64:\
        /usr/local/Ascend/driver/lib64/common/:\
        /usr/local/Ascend/driver/lib64/driver/:\
        /usr/local/Ascend/add-ons/:\
        /usr/local/Ascend/driver/tools/hccn_tool/:\
        $LD_LIBRARY_PATH
        # Install TensorFlow
        ENV PYTHON_BIN_PATH=/usr/local/python3.7.5/bin/python3
        ENV PYTHON_LIB_PATH=/usr/local/python3.7.5/lib/python3.7/site-packages
        RUN cd tensorflow-2.6.5 && \
            ./configure && \
            cp /tmp/.tf_configure.bazelrc . && \
            bazel build --jobs=128 --local_ram_resources=102400 --copt=-D_GLIBCXX_USE_CXX11_ABI=0 //tensorflow/tools/pip_package:build_pip_package && \
            ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
        ENV http_proxy=
        ENV https_proxy=
        RUN pip3 install /tmp/tensorflow_pkg/tensorflow*.whl
        # Install NPU_Device
        RUN pip3 install --upgrade numpy
        RUN pip3 list
        ENV ADAPTER_TARGET_PYTHON_PATH=/usr/local/python3.7.5/bin/python3
        ENV ASCEND_INSTALLED_PATH=/usr/local/Ascend/ascend-toolkit/latest
        RUN cd tensorflow/tf_adapter_2.x && \
            ./configure && \
            mkdir build && \
            cd build && \
            cmake .. && \
            make -j 96 && \
            pip3 install --upgrade ./dist/python/dist/npu_device-0.1-py3-none-any.whl
        # Finishing steps
        RUN pip3 install tensorflow-io
        # HwHiAiUser, hwMindX
        RUN useradd -d /home/hwMindX -u 9000 -m -s /bin/bash hwMindX && \
            useradd -d /home/HwHiAiUser -u 1000 -m -s /bin/bash HwHiAiUser && \
            usermod -a -G HwHiAiUser hwMindX
        # Cleanup
        RUN rm -f /etc/ascend_install.info && \
            rm -rf /tmp/*
    Error message
    The following message appears during the build:
        Please specify the location of python with valid tensorflow 2.6 site-packages installed. [Default is /usr/local/python3.8.15/bin/python3]
        (You can make this quiet by set env [ADAPTER_TARGET_PYTHON_PATH]):
        Invalid python path: /usr/local/python3.8.15/bin/python3 compat tensorflow version is 2.6 got 2.7.4.
    An issue has been filed in the Gitee repository: Does tf_adapter_2.x only support version 2.6.5? · Issue #I5YODI · Ascend/tensorflow - Gitee.com
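    The message "compat tensorflow version is 2.6 got 2.7.4" suggests that the configure step compares the major.minor part of the installed TensorFlow version against a fixed requirement. The sketch below shows one way such a check could look; it is not the adapter's actual configure script, and the function names and the default `required="2.6"` are assumptions taken from the error text.

```python
def is_compatible(installed, required="2.6"):
    """True when `installed` has the same major.minor version as `required`."""
    return installed.split(".")[:2] == required.split(".")[:2]


def describe(installed, required="2.6"):
    """Return a status line that mirrors the wording of the build error."""
    if is_compatible(installed, required):
        return "tensorflow {} is compatible with {}".format(installed, required)
    return "compat tensorflow version is {} got {}".format(required, installed)


for version in ("2.6.5", "2.7.4"):
    print(describe(version))
```

    Under this reading, any 2.6.x release would pass the check, while 2.7.4 is rejected regardless of its patch level.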
  • [Help Wanted] Importing npu_device fails with undefined symbol: _ZNK10tensorflow8OpKernel11TraceStringERKNS_15OpKernelContextEb
    I tried to build a Docker image that supports tensorflow 2.6.5. The build succeeds and importing tensorflow works, but importing npu_device fails with undefined symbol: _ZNK10tensorflow8OpKernel11TraceStringERKNS_15OpKernelContextEb
    It looks like a version mismatch, but the versions check out:
    toolkit version: 5.1.RC2
    nnae version: 5.1.RC2
    Dockerfile:
        FROM ubuntu:18.04
        ARG HOST_ASCEND_BASE=/usr/local/Ascend
        ARG NNAE_PATH=/usr/local/Ascend/nnae/latest
        ARG INSTALL_ASCEND_PKGS_SH=install_ascend_pkgs.sh
        WORKDIR /tmp
        # Update the package sources
        RUN apt update && \
            apt install -y --no-install-recommends ca-certificates wget && \
            cp -a /etc/apt/sources.list /etc/apt/sources.list.bak && \
            wget --no-check-certificate -O /etc/apt/sources.list https://repo.huaweicloud.com/repository/conf/Ubuntu-Ports-bionic.list && \
            apt update && \
            apt upgrade -y
        # Install system packages
        RUN apt install -y --no-install-recommends autoconf automake dos2unix g++ libbz2-dev libssl-dev libtool libxml2 make pciutils unzip vim wget xz-utils zip \
            bzip2 libblas3 libffi-dev libfreetype6-dev libgl1-mesa-glx liblapack3 liblzma-dev libopenblas-dev libpng-dev numactl pkg-config zlib1g zlib1g-dev \
            ca-certificates curl cython3 gcc gfortran htop less libblas-dev libgmpxx4ldbl libhdf5-dev libicu60 libxml2-dev libxslt-dev openssl python3-h5py sudo swig \
            gcc git htop inetutils-ping openssh-server ssh tmux \
            build-essential openjdk-11-jdk zip unzip && \
            apt clean && \
            rm -rf /var/lib/apt/lists/*
        ENV LD_LIBRARY_PATH=/usr/local/gcc7.3.0/lib64:${LD_LIBRARY_PATH}
        # Install cmake
        COPY cmake-3.15.7.tar.gz ./
        RUN tar -zxf cmake-3.15.7.tar.gz && \
            cd cmake-3.15.7 && \
            ./bootstrap && \
            make -j 96 && \
            make install && \
            ln -s /usr/local/cmake/bin/cmake /usr/bin/cmake
        # Install Python and pip
        COPY Python-3.7.5.tar.xz ./
        RUN tar -xf Python-3.7.5.tar.xz && \
            cd Python-3.7.5 && \
            ./configure --prefix=/usr/local/python3.7.5 --enable-shared && \
            make -j 96 && \
            make install && \
            ln -sf /usr/local/python3.7.5/bin/python3 /usr/bin/python && \
            ln -sf /usr/local/python3.7.5/bin/python3 /usr/bin/python3 && \
            ln -sf /usr/local/python3.7.5/bin/python3 /usr/local/bin/python && \
            ln -sf /usr/local/python3.7.5/bin/python3 /usr/local/bin/python3 && \
            ln -sf /usr/local/python3.7.5/bin/pip3 /usr/bin/pip && \
            ln -sf /usr/local/python3.7.5/bin/pip3 /usr/bin/pip3 && \
            cd .. && \
            rm -rf Python*
        ENV LD_LIBRARY_PATH=/usr/local/python3.7.5/lib:$LD_LIBRARY_PATH
        ENV PATH=/usr/local/python3.7.5/bin:$PATH
        # Configure the pip index
        RUN mkdir -p ~/.pip && \
            echo '[global]\n\
        index-url = https://repo.huaweicloud.com/repository/pypi/simple/\n\
        trusted-host = repo.huaweicloud.com\n\
        timeout = 120' >> ~/.pip/pip.conf
        # HwHiAiUser, hwMindX
        RUN useradd -d /home/hwMindX -u 9000 -m -s /bin/bash hwMindX && \
            useradd -d /home/HwHiAiUser -u 1000 -m -s /bin/bash HwHiAiUser && \
            usermod -a -G HwHiAiUser hwMindX
        # Install Python packages
        RUN pip3 install -U pip && \
            pip3 install wheel && \
            pip3 install setuptools && \
            pip3 install matplotlib && \
            pip3 install opencv-python && \
            pip3 install sklearn && \
            pip3 install pandas && \
            pip3 install pycocotools && \
            pip3 install tables && \
            pip3 install mmcv && \
            pip3 install lxml && \
            pip3 install easydict && \
            pip3 install jupyter && \
            pip3 install jupyterlab && \
            pip3 install backports.lzma && \
            pip3 install keras-preprocessing && \
            pip3 install six && \
            rm -rf /root/.cache/pip
        # Copy the related files
        COPY . ./
        # Ascend packages
        RUN bash $INSTALL_ASCEND_PKGS_SH
        # Environment variables
        ENV GLOG_v=2
        ENV TBE_IMPL_PATH=$NNAE_PATH/opp/op_impl/built-in/ai_core/tbe
        ENV FWK_PYTHON_PATH=$NNAE_PATH/fwkacllib/python/site-packages
        ENV PATH=$NNAE_PATH/fwkacllib/ccec_compiler/bin/:$PATH
        ENV ASCEND_OPP_PATH=$NNAE_PATH/opp
        ENV PYTHONPATH=$HOST_ASCEND_BASE/tfplugin/latest/tfplugin/python/site-packages:\
        $FWK_PYTHON_PATH:\
        $FWK_PYTHON_PATH/auto_tune.egg:\
        $FWK_PYTHON_PATH/schedule_search.egg:\
        $TBE_IMPL_PATH:\
        $PYTHONPATH
        ENV LD_LIBRARY_PATH=$NNAE_PATH/fwkacllib/lib64:\
        /usr/local/Ascend/driver/lib64/common/:\
        /usr/local/Ascend/driver/lib64/driver/:\
        /usr/local/Ascend/add-ons/:\
        /usr/local/Ascend/driver/tools/hccn_tool/:\
        $LD_LIBRARY_PATH
        # Install Bazel
        RUN git config --global http.proxy socks5://172.168.3.3:7891 && \
            git config --global https.proxy socks5://172.168.3.3:7891
        ENV http_proxy=http://172.168.3.3:7890
        ENV https_proxy=https://172.168.3.3:7890
        RUN mkdir bazel-3.7.2-dist && \
            unzip bazel-3.7.2-dist.zip -d bazel-3.7.2-dist && \
            cd bazel-3.7.2-dist && \
            env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh && \
            cp output/bazel /usr/local/bin
        # Install TensorFlow
        ENV PYTHON_BIN_PATH=/usr/local/python3.7.5/bin/python3
        ENV PYTHON_LIB_PATH=/usr/local/python3.7.5/lib/python3.7/site-packages
        RUN cd tensorflow-2.6.5 && \
            ./configure && \
            cp /tmp/.tf_configure.bazelrc . && \
            bazel build --jobs=190 --local_ram_resources=204800 //tensorflow/tools/pip_package:build_pip_package && \
            ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
        ENV http_proxy=
        ENV https_proxy=
        RUN pip3 install /tmp/tensorflow_pkg/tensorflow*.whl
        # Install NPU_Device
        RUN pip3 install --upgrade numpy
        RUN pip3 list
        ENV ADAPTER_TARGET_PYTHON_PATH=/usr/local/python3.7.5/bin/python3
        ENV ASCEND_INSTALLED_PATH=/usr/local/Ascend/ascend-toolkit/latest
        RUN cd tensorflow/tf_adapter_2.x && \
            ./configure && \
            mkdir build && \
            cd build && \
            cmake .. && \
            make -j 96 && \
            pip3 install --upgrade ./dist/python/dist/npu_device-0.1-py3-none-any.whl
        # Finishing steps
        RUN pip3 install tensorflow-io
        # Cleanup
        RUN rm -f /etc/ascend_install.info && \
            rm -rf /tmp/*
    Error screenshot:
    An issue has been filed on Gitee but has not received a reply yet: Importing npu_device fails with undefined symbol: _ZNK10tensorflow8OpKernel11TraceStringERKNS_15OpKernelContextEb · Issue #I5X6OH · Ascend/tensorflow - Gitee.com
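    When debugging an undefined-symbol error it helps to know which C++ entity the mangled name refers to. The sketch below extracts only the qualified name from an Itanium-ABI mangled symbol of the simple nested form `_ZN[qualifiers]<length><identifier>...E`; it is a minimal illustration rather than a full demangler (tools such as `c++filt` handle the general case), and the function name `qualified_name` is an assumption for this example.

```python
def qualified_name(mangled):
    """Extract the nested qualified name from a simple Itanium-ABI mangled symbol."""
    if not mangled.startswith("_ZN"):
        raise ValueError("not a nested Itanium-mangled name: " + mangled)
    pos = 3
    # Skip member-function qualifiers: r = restrict, V = volatile, K = const.
    while pos < len(mangled) and mangled[pos] in "rVK":
        pos += 1
    parts = []
    # Each component is a decimal length followed by that many characters.
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])
        parts.append(mangled[end:end + length])
        pos = end + length
    return "::".join(parts)


symbol = "_ZNK10tensorflow8OpKernel11TraceStringERKNS_15OpKernelContextEb"
print(qualified_name(symbol))
```

    Knowing that the missing symbol is a const member function of TensorFlow's OpKernel class can help narrow down which shared library is expected to export it.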
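  • [Help Wanted] Which software package contains the file acl/acl_tdt.h?
    For the past few days I have been building an image that supports tensorflow 2.6.5, which requires compiling tf_adapter_2.x. The build fails with fatal error: acl/acl_tdt.h: No such file or directory. According to CMakeLists.txt, this file should be under /usr/local/Ascend/runtime, but it does not exist in my environment (nnae and tfplugin are already installed). Which software package ships this file? tf_adapter_2.x · Ascend/tensorflow - Gitee - OSChina (gitee.com)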
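  • [Help Wanted] Problem migrating TensorFlow to Ascend910, help needed!
    After migrating my TensorFlow program with the automatic migration tool, only HBM is in use while the model trains. Why is the AI Core not being used? My model is an LSTM prediction model, and I am puzzled that it never reaches the AI Core.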