Dreambooth on Windows: fast training (fine-tuning) of Stable Diffusion on a 3060 with 12 GB of VRAM

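Dreambooth can fine-tune a Stable Diffusion model from as few as a handful of images up to an arbitrarily large set, and its results are better than embedding or hypernetwork training. The original training method needs 24 GB of VRAM; the optimized script can train on a 12 GB 3060. This post records how to do it on Windows, with no WSL required.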

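This post uses the CCRcmcpe/diffusers fork, which supports ARB (aspect ratio bucket) variable-resolution training. Images do not need to be cropped to 512*512; originals of any aspect ratio can be used for training directly.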

It can train any model of this family: Stable Diffusion 1.5, naifu, anything3.0, and so on.

First, the environment. Install Python 3.10; a miniconda environment is recommended. It has to be Python 3.10. Pick one of the two below:

Miniconda: https://docs.conda.io/en/latest/miniconda.html
Official Python: https://www.python.org/downloads/
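
Since the setup depends on Python 3.10 specifically, a quick check before installing anything can save time. The following is a minimal sketch; the function name is_supported is only illustrative and is not part of any tool used in this post.

```python
import sys

REQUIRED = (3, 10)  # the post insists on Python 3.10


def is_supported(version_info=None, required=REQUIRED):
    """Return True when the major.minor version matches `required`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    found = "%d.%d" % tuple(sys.version_info[:2])
    if is_supported():
        print("Python", found, "OK")
    else:
        print("Python", found, "found, but 3.10 is required")
```

Run it with the same python.exe that will later create the virtual environment.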

Open a cmd prompt where Python is available and create a virtual environment.

python.exe -m pip install --upgrade pip
pip install virtualenv
virtualenv vwin
vwin\Scripts\activate

Download the training script:

https://github.com/CCRcmcpe/diffusers/archive/748f64e47cd6fe3ebe5e6fe7011ee90c5a672fd3.zip

Unzip it into a directory named diffusers, then enter that directory and install the dependencies:

cd diffusers
pip install -e .
cd examples\dreambooth
pip install -U -r requirements.txt
pip install OmegaConf
pip install pytorch_lightning
pip install einops
pip install bitsandbytes==0.34
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
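
After the installs finish, it can help to confirm that every package is importable from inside the activated vwin environment. This is a rough sketch that only checks whether the modules can be found, not their versions; the list of import names is my assumption based on the pip commands above (for example, the pip package OmegaConf is imported as omegaconf).

```python
import importlib.util

# Top-level import names assumed for the packages installed above.
EXPECTED = [
    "diffusers",
    "omegaconf",
    "pytorch_lightning",
    "einops",
    "bitsandbytes",
    "torch",
    "torchvision",
]


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All expected modules were found")
```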

Patch the bitsandbytes library so that it works on Windows.

Download: https://github.com/DeXtmL/bitsandbytes-win-prebuilt/archive/refs/heads/main.zip
Manually copy libbitsandbytes_cuda116.dll into vwin\Lib\site-packages\bitsandbytes under the working directory, next to libbitsandbytes_cuda116.so.
Overwrite cextension.py in vwin\Lib\site-packages\bitsandbytes with this version: https://pastebin.com/jjgxuh8V
Overwrite main.py in vwin\Lib\site-packages\bitsandbytes\cuda_setup with this version: https://pastebin.com/BsEzpdpw
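
The DLL copy step can also be scripted. Below is a small sketch under a few assumptions: download_dir is whatever folder the prebuilt archive was unpacked into (hypothetical here), and the virtual environment is the vwin folder created earlier. It refuses to run if bitsandbytes is not installed yet, so a typo in the path does not silently create a stray folder.

```python
import shutil
from pathlib import Path

DLL_NAME = "libbitsandbytes_cuda116.dll"


def install_dll(download_dir, venv_dir="vwin"):
    """Copy the prebuilt DLL into the venv's bitsandbytes package folder."""
    src = Path(download_dir) / DLL_NAME
    dest_dir = Path(venv_dir) / "Lib" / "site-packages" / "bitsandbytes"
    if not src.is_file():
        raise FileNotFoundError("DLL not found: %s" % src)
    if not dest_dir.is_dir():
        raise FileNotFoundError("bitsandbytes is not installed in %s" % dest_dir)
    # copy2 returns the path of the newly created file
    return Path(shutil.copy2(src, dest_dir))


# Example call (the folder name is an assumption about where you unzipped it):
# install_dll("bitsandbytes-win-prebuilt-main")
```

The two pastebin files still have to be saved by hand, since their contents are not reproduced here.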

Install xformers.

Download: https://github.com/C43H66N12O12S2/stable-diffusion-webui/releases/download/f/xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl
pip install xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl

Configure accelerate:

accelerate config

In which compute environment are you running? ([0] This machine, [1] AWS (Amazon SageMaker)): 0
Which type of machine are you using? ([0] No distributed training, [1] multi-CPU, [2] multi-GPU, [3] TPU [4] MPS): 0
Do you want to run your training on CPU only (even if a GPU is available)? [yes/NO]:NO
Do you want to use DeepSpeed? [yes/NO]:NO
Do you wish to use FP16 or BF16 (mixed precision)? [NO/fp16/bf16]: fp16

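That completes the setup. Prepare your dataset and you can start training. Adjust the paths in the scripts below to match your own setup, and run them from the working directory that holds vwin.

Before the first run you may need to unpack the ckpt file into the diffusers format.

Download the model config: https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-inference.yaml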
python diffusers\scripts\convert_original_stable_diffusion_to_diffusers.py --checkpoint_path model.ckpt --original_config_file v1-inference.yaml --scheduler_type ddim --dump_path models/diffusers_model

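From then on, run the commands below each time you train. The LR (learning rate) is usually set between 1e-6 and 5e-6.

From experience, with LR=2e-6 and 100 images, the model is at its best after about 40 epochs, or roughly 4000 steps. Training for longer overfits.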

set MODEL_NAME="models/diffusers_model"
set INSTANCE_DIR="your_dataset_image_folder"
set OUTPUT_DIR="where_do_you_want_to_generate_your_model"
set INSTANCE_PROMPT="prompt of your dataset"
set SAVE_SAMPLE_PROMPT="What prompt to generate samples"
set WANDB_PROJECT="DB-SD1.5"
set LR=2e-6
set EPOCH=100
set SAVE_INTERVAL=10
set BATCH=2

accelerate launch diffusers\examples\dreambooth\train_dreambooth.py --pretrained_model_name_or_path=%MODEL_NAME% --pretrained_vae_name_or_path=%MODEL_NAME%\vae --instance_data_dir=%INSTANCE_DIR% --output_dir=%OUTPUT_DIR% --instance_prompt=%INSTANCE_PROMPT% --resolution=512 --train_batch_size=%BATCH% --gradient_accumulation_steps=1 --learning_rate=%LR% --lr_scheduler="constant" --lr_warmup_steps=0 --num_train_epochs=%EPOCH% --save_interval_epochs=%SAVE_INTERVAL% --mixed_precision="fp16" --optimizer="adamw_8bit" --wandb --wandb_project=%WANDB_PROJECT% --use_aspect_ratio_bucket --not_cache_latents --save_unet_half --seed=23333 --save_sample_prompt=%SAVE_SAMPLE_PROMPT% --n_save_sample=1
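
To relate the "40 epochs or about 4000 steps for 100 images" rule of thumb to the batch size and epoch settings, the arithmetic can be sketched as follows. This assumes that one step is one optimizer update and that every image is seen once per epoch; aspect ratio bucketing may change the number of batches per epoch slightly, so treat the result as an estimate.

```python
import math


def total_steps(num_images, epochs, batch_size=1, grad_accum=1):
    """Estimate optimizer steps for a run (ignores bucketing effects)."""
    batches_per_epoch = math.ceil(num_images / batch_size)
    updates_per_epoch = math.ceil(batches_per_epoch / grad_accum)
    return updates_per_epoch * epochs


if __name__ == "__main__":
    print(total_steps(100, 40))                # batch size 1
    print(total_steps(100, 40, batch_size=2))  # BATCH=2 as in the command above
```

Under these assumptions, 100 images for 40 epochs is 4000 steps at batch size 1 and 2000 steps at batch size 2, so the 4000-step figure corresponds to a batch size of 1.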

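If training runs but VRAM overflows right after a save, the cause may be insufficient VRAM while generating the preview images. You can skip preview generation. Edit:

diffusers\examples\dreambooth\train_dreambooth.py
Line 563: if save_sample:
Change to: if save_sample and args.seed!=23333:

Because the training command passes --seed=23333, the new condition is always false and no preview is generated.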

After training, pack the result into a ckpt:

python diffusers\scripts\convert_diffusers_to_original_stable_diffusion.py --model_path models/resultModel --checkpoint_path result.ckpt --half

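I usually convert checkpoints in batches, using a .bat file like this (the %%i syntax only works inside a batch file):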

set MODEL_PATH=JFTBO60
set OUTPUT_PATH=E:\model
set MODEL_NAME=trained_
set START=541
set END=5400

set F1=%START%
set /a F2=%START%-1
set /a F3=%END%

for /l %%i in (%F1%,%F2%,%F3%) do python diffusers\scripts\convert_diffusers_to_original_stable_diffusion.py --model_path %MODEL_PATH%\%%i --checkpoint_path %OUTPUT_PATH%\%MODEL_NAME%%%i.ckpt --unet_half
python diffusers\scripts\convert_diffusers_to_original_stable_diffusion.py --model_path %MODEL_PATH%\%END% --checkpoint_path %OUTPUT_PATH%\%MODEL_NAME%%END%.ckpt --unet_half
pause
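
The batch file walks the saved checkpoints with a stride of START-1 and then converts the final one separately. The sketch below reproduces that sequence in Python so the list of commands can be inspected before anything runs. It assumes the final checkpoint sits in the same model folder as the others; the function names are illustrative only, and the script path and flags are the ones used in the batch file.

```python
import ntpath  # Windows-style path joining, works on any OS


def checkpoint_steps(start, end):
    """Steps visited by `for /l %%i in (start, start-1, end)`, plus `end`."""
    stride = start - 1
    if stride <= 0:
        raise ValueError("start must be greater than 1")
    steps = list(range(start, end + 1, stride))
    if end not in steps:
        steps.append(end)  # the batch file converts END in a separate line
    return steps


def convert_commands(model_path, output_path, prefix, start, end):
    """Build one conversion command (as an argument list) per checkpoint."""
    script = r"diffusers\scripts\convert_diffusers_to_original_stable_diffusion.py"
    commands = []
    for step in checkpoint_steps(start, end):
        commands.append([
            "python", script,
            "--model_path", ntpath.join(model_path, str(step)),
            "--checkpoint_path",
            ntpath.join(output_path, "%s%d.ckpt" % (prefix, step)),
            "--unet_half",
        ])
    return commands


if __name__ == "__main__":
    for cmd in convert_commands("JFTBO60", r"E:\model", "trained_", 541, 5400):
        print(" ".join(cmd))
```

Each argument list can be handed to subprocess.run(cmd, check=True) once the printed commands look right.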

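The resulting ckpt is ready to use. If the images generated after loading it in AUTOMATIC1111/stable-diffusion-webui are all garbled, the VAE may not have been loaded. Pick one of these fixes:

1. Load the original model first, then load the trained model.
2. Copy (or hard link) the original model's VAE and rename it to match the new model.
3. Specify the VAE to load in the webui settings.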
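
Option 2 can be scripted as below. This is a sketch with one assumption stated up front: the webui is taken to pick up a VAE file named after the checkpoint with a .vae.pt suffix, placed next to it; adjust the suffix if your webui version expects something else. The function name attach_vae is illustrative. It tries a hard link first and falls back to a plain copy when linking is not possible (for example, across drives).

```python
import os
import shutil
from pathlib import Path


def attach_vae(vae_file, model_ckpt, suffix=".vae.pt"):
    """Place a VAE next to `model_ckpt`, named after it (suffix is assumed)."""
    vae_file = Path(vae_file)
    model_ckpt = Path(model_ckpt)
    target = model_ckpt.with_name(model_ckpt.stem + suffix)
    if target.exists():
        return target  # nothing to do
    try:
        os.link(vae_file, target)  # hard link: no extra disk space
    except OSError:
        shutil.copy2(vae_file, target)  # fallback: ordinary copy
    return target


# Example call (both paths are hypothetical):
# attach_vae(r"models\Stable-diffusion\original.vae.pt",
#            r"models\Stable-diffusion\result.ckpt")
```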

References:

Dreambooth on Windows: https://gist.github.com/Summersoff/70861d757a40c153c5802dc8c4ed68c0

bitsandbytes issue: https://github.com/TimDettmers/bitsandbytes/issues/30#issuecomment-1257676341

Tags: 3060, dreambooth

熊酱杂记
