Notebook
Dreambooth
Stable Diffusion model fine-tuning
HouYanSong
10 months ago
  • License type: CC0: Public Domain
  • Tags
    Computer vision, image generation, GPU training, GPU inference
  • Asset ID c341a5b5-5d41-4ed2-8b21-823c5fc5fcee

Description

Dreambooth

Stable Diffusion model fine-tuning

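Dreambooth is a technique published by Google for fine-tuning a diffusion model by injecting a custom subject into it, so that the model can then generate images of that subject in different scenes.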

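🔹 This example requires Pytorch-2.0.1 with GPU-V100 or a higher specification.

🔹 Click Run in ModelArts to open ModelArts CodeLab. You need to log in with a Huawei Cloud account at this point. If you do not have one, register an account and complete real-name verification; the guide 《ModelArts准备工作_简易版》 (ModelArts Preparation, Simplified Edition) covers both steps. After logging in, wait a moment for the CodeLab runtime environment to load.

🔹 If you hit Out Of Memory, check whether your parameter settings are too high. Lower them and restart the kernel, or switch to a higher-specification resource.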

Download the code and models

import os
import moxing as mox

# Copy the training tools and model weights from OBS,
# skipping anything that already exists locally.
if not os.path.exists('tools'):
    mox.file.copy_parallel('obs://modelbox-course/dreambooth/tools', 'tools')

if not os.path.exists('sd2.1'):
    mox.file.copy_parallel('obs://modelbox-course/dreambooth/sd2.1', 'sd2.1')

if not os.path.exists('sd1.5'):
    mox.file.copy_parallel('obs://modelbox-course/dreambooth/sd1.5', 'sd1.5')

if not os.path.exists('vae-ft-mse'):
    mox.file.copy_parallel('obs://modelbox-course/dreambooth/vae-ft-mse', 'vae-ft-mse')

# frpc binary used later by Gradio to create a public share link.
# Check the same absolute path that the copy writes to.
if not os.path.exists('/home/ma-user/work/frpc_linux_amd64'):
    mox.file.copy_parallel('obs://modelarts-labs-bj4-v2/course/ModelBox/frpc_linux_amd64', '/home/ma-user/work/frpc_linux_amd64')
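
The download cell repeats the same "copy only if missing" pattern for each asset. A minimal sketch of that pattern as a loop is shown here. The helper name `sync_assets` and the `ASSETS` table are illustrative and not part of the notebook, and the copy function is passed in as a parameter so the logic can be exercised without `moxing` installed.

```python
import os
from typing import Callable, Dict, List


def sync_assets(assets: Dict[str, str],
                copy_fn: Callable[[str, str], None],
                exists_fn: Callable[[str], bool] = os.path.exists) -> List[str]:
    """Copy each remote asset to its local path unless that path already exists.

    Returns the local paths that were actually copied.
    """
    copied = []
    for local_path, remote_url in assets.items():
        if exists_fn(local_path):
            continue  # already downloaded, nothing to do
        copy_fn(remote_url, local_path)
        copied.append(local_path)
    return copied


# Same sources and destinations as the directory copies in the download cell.
ASSETS = {
    "tools": "obs://modelbox-course/dreambooth/tools",
    "sd2.1": "obs://modelbox-course/dreambooth/sd2.1",
    "sd1.5": "obs://modelbox-course/dreambooth/sd1.5",
    "vae-ft-mse": "obs://modelbox-course/dreambooth/vae-ft-mse",
}
```

Inside the notebook this would be called as `sync_assets(ASSETS, mox.file.copy_parallel)`. Re-running the cell is then cheap, because existing directories are skipped.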

Configure the runtime environment

This example requires Python 3.10.10 or later, so we first create a virtual environment:

!/home/ma-user/anaconda3/bin/conda clean -i
!/home/ma-user/anaconda3/bin/conda create -n python-3.10.10 python=3.10.10 -y --override-channels --channel https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
!/home/ma-user/anaconda3/envs/python-3.10.10/bin/pip install ipykernel
import json
import os

data = {
   "display_name": "python-3.10.10",
   "env": {
      "PATH": "/home/ma-user/anaconda3/envs/python-3.10.10/bin:/home/ma-user/anaconda3/envs/python-3.7.10/bin:/modelarts/authoring/notebook-conda/bin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/ma-user/modelarts/ma-cli/bin:/home/ma-user/modelarts/ma-cli/bin:/home/ma-user/anaconda3/envs/PyTorch-1.8/bin"
   },
   "language": "python",
   "argv": [
      "/home/ma-user/anaconda3/envs/python-3.10.10/bin/python",
      "-m",
      "ipykernel",
      "-f",
      "{connection_file}"
   ]
}

# Create the kernel directory if needed, then write the kernel spec
os.makedirs("/home/ma-user/anaconda3/share/jupyter/kernels/python-3.10.10/", exist_ok=True)

with open('/home/ma-user/anaconda3/share/jupyter/kernels/python-3.10.10/kernel.json', 'w') as f:
    json.dump(data, f, indent=4)
!/home/ma-user/anaconda3/bin/conda env list
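
Registering the new environment as a Jupyter kernel comes down to writing a `kernel.json` file into a directory named after the kernel. The sketch below shows the same idea in a parameterized form, writing to a temporary directory so it can be run anywhere. The helper names `build_kernel_spec` and `write_kernel_spec` are hypothetical, and the `env` / `PATH` entry from the notebook's spec is left out for brevity.

```python
import json
import os
import tempfile


def build_kernel_spec(env_name: str, env_root: str) -> dict:
    """Build a minimal Jupyter kernel spec that launches ipykernel from a conda env."""
    python_bin = f"{env_root}/{env_name}/bin/python"
    return {
        "display_name": env_name,
        "language": "python",
        "argv": [python_bin, "-m", "ipykernel", "-f", "{connection_file}"],
    }


def write_kernel_spec(spec: dict, kernels_dir: str) -> str:
    """Write kernel.json under <kernels_dir>/<display_name>/ and return its path."""
    target_dir = os.path.join(kernels_dir, spec["display_name"])
    os.makedirs(target_dir, exist_ok=True)
    path = os.path.join(target_dir, "kernel.json")
    with open(path, "w") as f:
        json.dump(spec, f, indent=4)
    return path


# Demonstration against a temporary directory instead of the real kernels folder.
with tempfile.TemporaryDirectory() as tmp:
    spec = build_kernel_spec("python-3.10.10", "/home/ma-user/anaconda3/envs")
    spec_path = write_kernel_spec(spec, tmp)
    with open(spec_path) as f:
        loaded = json.load(f)
```

Reading the file back, as the last lines do, is a quick way to confirm that the spec on disk is valid JSON before switching kernels.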

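Once the environment is created, wait a moment or refresh the page, then click the kernel selector in the top-right corner and choose python-3.10.10.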

!python -V
Python 3.10.10
!nvidia-smi

Install dependencies

!pip install --upgrade pip
!pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
!pip install matplotlib pillow diffusers==0.21.2 accelerate transformers ftfy bitsandbytes==0.35.0 natsort safetensors xformers==0.0.22 gradio==4.0.2 -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn
!cp /home/ma-user/work/frpc_linux_amd64 /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages/gradio/frpc_linux_amd64_v0.2
!chmod +x /home/ma-user/anaconda3/envs/python-3.10.10/lib/python3.10/site-packages/gradio/frpc_linux_amd64_v0.2
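
Several packages are pinned to exact versions in the install commands, so it can help to confirm what actually ended up in the environment before training. The following is a minimal sketch using the standard library's `importlib.metadata`. The `check_pins` helper and the `PINS` table are illustrative names, and ignoring a local build suffix such as `+cu117` is a choice made for this sketch.

```python
from importlib import metadata
from typing import Callable, Dict

# Pins copied from the pip install commands in this section.
PINS = {
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "torchaudio": "2.0.2",
    "diffusers": "0.21.2",
    "bitsandbytes": "0.35.0",
    "xformers": "0.0.22",
    "gradio": "4.0.2",
}


def check_pins(pins: Dict[str, str],
               get_version: Callable[[str], str] = metadata.version) -> Dict[str, str]:
    """Return 'ok', 'missing', or a mismatch message for every pinned package."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Compare only the public version; a local suffix like "+cu117" is ignored.
        if found.split("+")[0] == wanted:
            report[name] = "ok"
        else:
            report[name] = f"mismatch (found {found})"
    return report
```

Calling `check_pins(PINS)` in the python-3.10.10 kernel gives a one-line status per package.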

Model training

import os
import json
import random

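Download the dataset and upload the wh11e.zip archive. It contains 6 portrait images of one person (庄达菲, Zhuang Dafei). You can also substitute your own photos (3 to 10 images).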

from matplotlib import pyplot as plt

# Unpack the uploaded archive into training_data/
if not os.path.exists('training_data'):
    os.mkdir('training_data')
    os.system('unzip -q wh11e.zip -d training_data')

def show_imgs(imgs, imgs_num):
    # Set the figure size
    plt.figure(figsize=(16, 16))
    # Four images per row; round the number of rows up
    rows = (imgs_num + 3) // 4
    for i in range(imgs_num):
        plt.subplot(rows, 4, i + 1)
        plt.title(f"image {i+1}")
        plt.imshow(imgs[i])
        plt.axis('off')
    plt.show()

View the dataset

from PIL import Image
from matplotlib import pyplot as plt

imgs_path = [os.path.join('training_data/wh11e', img) for img in os.listdir('training_data/wh11e')]
imgs_num = len(imgs_path)
imgs = [Image.open(img) for img in imgs_path]
show_imgs(imgs, imgs_num)
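
`show_imgs` lays the images out four per row, so the number of rows it needs is the image count divided by four, rounded up. A small sketch of that calculation on its own, with `grid_shape` as a hypothetical helper name:

```python
def grid_shape(num_images: int, cols: int = 4) -> tuple:
    """Return (rows, cols) for a grid that holds num_images with cols per row."""
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    rows = -(-num_images // cols)  # ceiling division without importing math
    return rows, cols
```

For the 6 training images this gives 2 rows, and for the 12 prompts used later in the notebook it gives 3.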

Training configuration

# Unique identifier for the subject
unique_id = "wh11e"
# Class name: person
class_name = "person"

# Base model: pick one. This example uses the v1.5-based weights.
# model_sd = "sd2.1"  # v2.1-based model
model_sd = "sd1.5"  # v1.5-based model

# Where the fine-tuned model is saved
output_dir = "dreambooth_wh11e"
# Concept configuration
concepts_list = [
    {
        "instance_prompt": "a photo of wh11e person",
        "class_prompt": "photo of a person",
        "instance_data_dir": "./training_data/wh11e",  # put the training images here
        "class_data_dir": "./training_data/person"  # auxiliary class images are generated here
    }
]
# Create the output and data directories
os.makedirs(output_dir, exist_ok=True)
for c in concepts_list:
    os.makedirs(c["instance_data_dir"], exist_ok=True)
    os.makedirs(c["class_data_dir"], exist_ok=True)
# Write the configuration to JSON
with open("./training_data/concepts_list.json", "w") as f:
    json.dump(concepts_list, f, indent=4)
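
The concept configuration ties together the unique identifier, the class name, and two data directories, and a typo in any of them only shows up once training has started. The sketch below is a small sanity check that could be run first. `validate_concepts` is a hypothetical helper, and the rules it enforces (identifier in the instance prompt only, class name in both prompts) reflect how this notebook's prompts are written rather than a requirement documented by the training script.

```python
REQUIRED_KEYS = ("instance_prompt", "class_prompt", "instance_data_dir", "class_data_dir")


def validate_concepts(concepts, unique_id, class_name):
    """Return a list of human-readable problems; an empty list means the config looks fine."""
    problems = []
    for i, concept in enumerate(concepts):
        missing = [key for key in REQUIRED_KEYS if key not in concept]
        if missing:
            problems.append(f"concept {i}: missing keys {missing}")
            continue
        instance_words = concept["instance_prompt"].split()
        class_words = concept["class_prompt"].split()
        if unique_id not in instance_words:
            problems.append(f"concept {i}: instance_prompt lacks identifier '{unique_id}'")
        if class_name not in instance_words:
            problems.append(f"concept {i}: instance_prompt lacks class name '{class_name}'")
        if class_name not in class_words:
            problems.append(f"concept {i}: class_prompt lacks class name '{class_name}'")
        if unique_id in class_words:
            problems.append(f"concept {i}: class_prompt must not contain identifier '{unique_id}'")
    return problems


# The configuration used in this notebook passes all checks.
example = [{
    "instance_prompt": "a photo of wh11e person",
    "class_prompt": "photo of a person",
    "instance_data_dir": "./training_data/wh11e",
    "class_data_dir": "./training_data/person",
}]
example_problems = validate_concepts(example, "wh11e", "person")
```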

Training parameters

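# Number of training images
num_imgs = imgs_num
# Number of class images to generate (12 per training image)
num_class_images = num_imgs * 12
# Maximum number of training steps (100 per training image)
max_num_steps = num_imgs * 100
# Learning rate
learning_rate = 1e-6
# Learning-rate warmup steps (one tenth of the training steps).
# Note: the training command below passes a fixed --lr_warmup_steps=80 instead of this value.
lr_warmup_steps = int(max_num_steps / 10)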
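
All of these values scale with the number of training images. The sketch below repeats the same arithmetic as a function, so the effect of adding or removing photos is easy to see. `derive_training_params` is an illustrative name, not something defined in the notebook.

```python
def derive_training_params(num_imgs: int) -> dict:
    """Derive the image-count-dependent settings using the notebook's formulas."""
    max_num_steps = num_imgs * 100
    return {
        "num_class_images": num_imgs * 12,
        "max_num_steps": max_num_steps,
        "lr_warmup_steps": int(max_num_steps / 10),
    }


# The 6-image dataset used in this example.
params_for_six = derive_training_params(6)
```

With 6 images this yields 72 class images, 600 training steps, and 60 warmup steps.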

Start training

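# --pretrained_model_name_or_path: model path; here, the offline weights downloaded earlier (SD1.5)
# --pretrained_vae_name_or_path: VAE path; here, the offline weights downloaded earlier
# --output_dir: output path
# --resolution: image resolution
# --save_sample_prompt: prompt used to generate the saved sample images
# --concepts_list: path to the concepts JSON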

!python3 ./tools/train_dreambooth.py \
  --pretrained_model_name_or_path=$model_sd \
  --pretrained_vae_name_or_path="vae-ft-mse" \
  --output_dir=$output_dir \
  --revision="fp16" \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --seed=777 \
  --resolution=512 \
  --train_batch_size=1 \
  --train_text_encoder \
  --mixed_precision="fp16" \
  --use_8bit_adam \
  --gradient_accumulation_steps=1 \
  --learning_rate=$learning_rate \
  --lr_scheduler="constant" \
  --lr_warmup_steps=80 \
  --num_class_images=$num_class_images \
  --sample_batch_size=4 \
  --max_train_steps=$max_num_steps \
  --save_interval=10000 \
  --save_sample_prompt="a photo of wh11e person" \
  --concepts_list="./training_data/concepts_list.json"
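
The flags `--with_prior_preservation` and `--prior_loss_weight=1.0` refer to the prior-preservation idea from the DreamBooth paper: the loss on the subject's own images is combined with a weighted loss on generated class images, so that the model does not forget what a generic "person" looks like. The source of `tools/train_dreambooth.py` is not shown here, so the following is only a sketch of that formulation on plain Python lists, and the function names are assumptions rather than the script's actual code.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally sized sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(instance_pred: Sequence[float],
                    instance_target: Sequence[float],
                    prior_pred: Sequence[float],
                    prior_target: Sequence[float],
                    prior_loss_weight: float = 1.0) -> float:
    """Instance loss plus the weighted prior-preservation loss."""
    instance_loss = mse(instance_pred, instance_target)
    prior_loss = mse(prior_pred, prior_target)
    return instance_loss + prior_loss_weight * prior_loss
```

With a weight of 1.0, as in the command above, both terms count equally. A smaller weight lets the model fit the subject more closely at the cost of drifting further from the original class.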
from natsort import natsorted
from glob import glob

# Find the most recent checkpoint directory
saved_weights_dir = natsorted(glob(os.path.join(output_dir, '*')))[-1]

saved_weights_dir
'dreambooth_wh11e/600'
# Plot the sample images saved with the checkpoint
plt.figure(figsize=(16, 6))
for i in range(4):
    img = plt.imread(os.path.join(saved_weights_dir, 'samples', f'{i}.png'))
    plt.subplot(1, 4, i + 1)
    plt.imshow(img)
    plt.axis('off')
plt.show()
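
The checkpoint directories are named after step counts, which is why the notebook sorts them with `natsorted` instead of the built-in `sorted`: plain string ordering compares character by character and would rank `90` after `1000`. The sketch below shows the difference with a small hand-written natural-sort key. The checkpoint names are hypothetical (this run only produces `dreambooth_wh11e/600`), and `natural_key` is a stand-in for illustration, not the `natsort` implementation.

```python
import re


def natural_key(path: str):
    """Split a string into text and integer chunks so that numbers compare numerically."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", path)]


# Hypothetical checkpoint directories from a longer run.
checkpoints = ["dreambooth_wh11e/1000", "dreambooth_wh11e/600", "dreambooth_wh11e/90"]

latest_lexicographic = sorted(checkpoints)[-1]               # string order
latest_natural = sorted(checkpoints, key=natural_key)[-1]    # numeric-aware order
```

Here `latest_lexicographic` is `dreambooth_wh11e/90`, while `latest_natural` is `dreambooth_wh11e/1000`, the checkpoint that was actually written last.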

Model inference

# Import the required libraries
import torch  # PyTorch
from diffusers import StableDiffusionPipeline
# Load the model
pipe = StableDiffusionPipeline.from_pretrained(saved_weights_dir, torch_dtype=torch.float16)
# Move the pipeline to the GPU
pipe = pipe.to('cuda')
pipe.enable_attention_slicing()  # attention slicing, to save GPU memory
pipe.enable_xformers_memory_efficient_attention()  # xFormers memory-efficient attention, to save GPU memory
Loading pipeline components...: 100%|██████████| 6/6 [00:06<00:00,  1.10s/it]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
# Switch the scheduler
from diffusers import DDIMScheduler
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
# Generate several images from a single prompt
prompt = "face portrait of wh11e person in the snow, realistic, hd, vivid, sunset"
negative_prompt = "bad anatomy, ugly, deformed, disfigured, distorted face, poorly drawn hands, poorly drawn face, poorly drawn feet, blurry, low quality, low definition, lowres, out of frame, out of image, cropped, cut off, signature, watermark"
num_samples = 4
guidance_scale = 7.5
num_inference_steps = 30
height = 512
width = 512

imgs = pipe(
    prompt,
    negative_prompt=negative_prompt,
    height=height, width=width,
    num_images_per_prompt=num_samples,
    num_inference_steps=num_inference_steps,
    guidance_scale=guidance_scale,
    ).images

show_imgs(imgs, num_samples)
100%|██████████| 30/30 [00:05<00:00,  5.05it/s]
# Use multiple prompts at once

prompt = ["photo of wh11e person, closeup, mountain fuji in the background, natural lighting",
          "photo of wh11e person in the desert, closeup, pyramids in the background, natural lighting, frontal face",
          "photo of wh11e person in the forest, natural lighting, frontal face",
          "photo of wh11e person as an astronaut, natural lighting, frontal face, closeup, starry sky in the background",
          "face portrait of wh11e in the snow, realistic, hd, vivid, sunset",
          "digital painting of wh11e in the snow, realistic, hd, vivid, sunset",
          "watercolor painting of wh11e person, realistic, blue and orange tones",
          "digital painting of wh11e person, hyperrealistic, fantasy, Surrealist, painted by Alphonse Mucha",
          "painting of wh11e person in star wars, realistic, 4k ultra hd, blue and red tones",
          "photo of wh11e person, in an armor, realistic, visible face, colored, detailed face, ultra detailed, natural lighting",
          "photo of wh11e person, cyberpunk, vivid, realistic, 4k ultra hd",
          "a painting of wh11e person, realistic, by Van Gogh",
          ]


negative_prompt = ["bad anatomy, ugly, deformed, disfigured, distorted face, poorly drawn hands, poorly drawn face, poorly drawn feet, blurry, low quality, low definition, lowres, out of frame, out of image, cropped, cut off, signature, watermark"] * len(prompt)
num_samples = 1
guidance_scale = 7.5
num_inference_steps = 30
height = 512
width = 512


imgs = pipe(
    prompt,
    negative_prompt=negative_prompt,
    height=height, width=width,
    num_images_per_prompt=num_samples,
    num_inference_steps=num_inference_steps,
    guidance_scale=guidance_scale,
).images

# Display all generated images
show_imgs(imgs, len(prompt))
100%|██████████| 30/30 [00:14<00:00,  2.01it/s]
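
Passing all twelve prompts in a single call makes the pipeline process them as one batch, which is the kind of setting the Out Of Memory note at the top asks you to lower if the GPU runs out of memory. One way to do that is to split the prompt list into smaller groups and call the pipeline once per group. The sketch below shows only the splitting step. `chunked` is a hypothetical helper, and the commented usage is an untested outline that reuses the variable names from the cell above.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence, size: int) -> Iterator[List]:
    """Yield consecutive groups of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical usage with the pipeline above (not executed here):
# imgs = []
# for batch in chunked(prompt, 4):
#     imgs.extend(pipe(batch,
#                      negative_prompt=negative_prompt[:len(batch)],
#                      height=height, width=width,
#                      num_inference_steps=num_inference_steps,
#                      guidance_scale=guidance_scale).images)
```

Smaller groups lower peak memory use at the cost of more pipeline calls.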

Gradio App

import torch 
import numpy as np
import gradio as gr
from diffusers import StableDiffusionPipeline

# Load the model
pipe = StableDiffusionPipeline.from_pretrained(saved_weights_dir, torch_dtype=torch.float16)
# Move the pipeline to the GPU
pipe = pipe.to('cuda')
pipe.enable_attention_slicing()  # attention slicing, to save GPU memory
pipe.enable_xformers_memory_efficient_attention()  # xFormers memory-efficient attention, to save GPU memory
# Switch the scheduler
from diffusers import DDIMScheduler
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

negative_prompt = "bad anatomy, ugly, deformed, disfigured, distorted face, poorly drawn hands, poorly drawn face, poorly drawn feet, blurry, low quality, low definition, lowres, out of frame, out of image, cropped, cut off, signature, watermark"
num_samples = 1
guidance_scale = 7.5
num_inference_steps = 30
height = 512
width = 512

def generate_image(prompt, steps):
    image = pipe(prompt,
                 output_type='numpy',
                 negative_prompt=negative_prompt,
                 height=height, width=width,
                 num_images_per_prompt=num_samples,
                 num_inference_steps=steps,
                 guidance_scale=guidance_scale
                 ).images
    image = np.uint8(image[0] * 255)
    return image

with gr.Blocks() as demo:
    gr.HTML("""<h1 align="center">Dreambooth</h1>""")
    with gr.Tab("Generate Image"):
        with gr.Row():
            with gr.Column():
                text_input = gr.Textbox(value="a photo of wh11e person", label="prompts", lines=4)
                steps = gr.Slider(30, 50, step=1, label="steps")
                gr.Examples(
                examples=[
                    ["face portrait of wh11e in the snow, realistic, hd, vivid, sunset"],
                    ["photo of wh11e person, closeup, mountain fuji in the background, natural lighting"],
                    ["photo of wh11e person in the desert, closeup, pyramids in the background, natural lighting, frontal face"]
                ],
                inputs=[text_input]
            )
            image_output = gr.Image(height=400, width=400)
        
    image_button = gr.Button("submit")
    image_button.click(generate_image, [text_input, steps], [image_output])
    
demo.launch(share=True)
Loading pipeline components...: 100%|██████████| 6/6 [00:01<00:00,  4.09it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
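
In the app, the value of the `steps` slider is passed straight to `num_inference_steps`. The notebook does not check whether that value arrives as an int or a float, or whether it stays inside the slider's 30 to 50 range when the function is called some other way. A defensive sketch follows, with `normalize_steps` as a hypothetical helper that `generate_image` could call on its `steps` argument.

```python
def normalize_steps(steps, lo: int = 30, hi: int = 50) -> int:
    """Round a slider value to an int and clamp it to the range [lo, hi]."""
    value = int(round(float(steps)))
    return max(lo, min(hi, value))
```

For example, a value of `45.7` becomes `46`, and anything below 30 or above 50 is pulled back to the nearest bound.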

Delivery

Huawei Cloud ModelArts

CN North-Beijing4

Restrictions

Public

Versions

Version   Version ID   Released           Status      Notes
8.0.0     E9o4Rw       2024-05-10 19:21   Completed   --
