Commit

Update ADVANCED_TWEAKING.md
fix doc about zero terminal snr
victorchall authored Sep 25, 2024
1 parent 15eb3d8 commit 684849c
Showing 1 changed file with 1 addition and 5 deletions.
6 changes: 1 addition & 5 deletions doc/ADVANCED_TWEAKING.md
@@ -201,10 +201,6 @@ Test results: https://huggingface.co/panopstor/ff7r-stable-diffusion/blob/main/z

Very tentatively, I suggest closer to 0.10 for short term training, and lower values of around 0.02 to 0.03 for longer runs (50k+ steps). Early indications suggest that values like 0.10 can cause divergence over time.

## Zero terminal SNR

Set `zero_frequency_noise_ratio` to -1.

## Keeping images together (custom batching)

If you have a subset of your dataset that expresses the same style or concept, training quality may be improved by putting all of these images through the trainer together in the same batch or batches, instead of the default behaviour (which is to shuffle them randomly throughout the entire dataset).
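
The idea of keeping a subset together can be sketched in plain Python. This is only an illustration of the batching concept, not the trainer's actual implementation: the function name `build_batches`, the `(path, group_id)` tuple layout, and the `seed` parameter are all hypothetical. Images that share a group id are only ever batched with each other, while ungrouped images are shuffled across the rest of the dataset as usual.

```python
import random
from collections import defaultdict


def build_batches(items, batch_size, seed=0):
    """Hypothetical sketch: items is a list of (image_path, group_id or None)."""
    rng = random.Random(seed)
    grouped = defaultdict(list)  # group_id -> image paths that must stay together
    loose = []                   # images with no group, shuffled freely

    for path, group_id in items:
        if group_id is None:
            loose.append(path)
        else:
            grouped[group_id].append(path)

    batches = []

    # Grouped images are shuffled only within their own group, then chunked,
    # so a batch never mixes a group with anything else.
    for group_id in sorted(grouped):
        members = grouped[group_id]
        rng.shuffle(members)
        for i in range(0, len(members), batch_size):
            batches.append(members[i:i + batch_size])

    # Ungrouped images follow the default behaviour: shuffle, then chunk.
    rng.shuffle(loose)
    for i in range(0, len(loose), batch_size):
        batches.append(loose[i:i + batch_size])

    # Shuffle the order of whole batches so groups are not always seen first.
    rng.shuffle(batches)
    return batches
```

Note that in this sketch a group whose size is not a multiple of `batch_size` leaves one short batch; a real trainer may pad or fill such batches differently.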
@@ -338,4 +334,4 @@ Pyramid noise can be used to improve dynamic range in short finetunes of < 2000

The `attn_type` arg allows you to select `xformers`, `sdp`, or `slice`. Xformers uses the [xformers package](https://github.com/facebookresearch/xformers). SDP uses the scaled dot product attention function [built into PyTorch](https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html) as of recent PyTorch updates. Slice uses head splitting. `sdp` is the default and suggested value, as it seems to save a small amount of VRAM while also being approximately 5% faster than xformers. There is likely little reason to use `slice` or `xformers`, but they are kept for the time being for experimentation and for consistency with prior experiments.

[Experimental results](https://discord.com/channels/1026983422431862825/1178007113151287306) (Discord link)
