feat: force full gpu load; feat: expose interrupt_current_processing from comfyui #311
Merged
Conversation
As with all cases of comments/docs, the details of the diatribe targeted at would-be developers have become somewhat inaccurate. However, the overall message is still roughly correct, as is the fact that problems would arise if its warnings are not heeded.
Member functions are not `FunctionType`s (which are 'bound' to modules) from a static typing perspective. This updates the types to use `Callable` and briefly explains their purposes, so a would-be dev doesn't need to dig into comfy to get the implementation details.
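As a rough illustration of the kind of change described here (the attribute names and signatures below are assumptions, not the PR's actual code), annotating the stashed comfyui functions as `Callable` might look like:

```python
# Hypothetical sketch: annotate stashed comfyui functions as Callable
# (with a short purpose note) instead of types.FunctionType.
from typing import Callable


class ComfyInternals:
    # Asks comfyui to abort the in-flight generation at its next check point.
    interrupt_current_processing: Callable[[bool], None]
    # Releases cached GPU memory when comfyui considers it safe to do so.
    soft_empty_cache: Callable[..., None]
```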
When I introduced `logging` redirection to loguru, I also introduced the many hundreds of irrelevant messages per minute to do with the comfyui internals of loading models in low VRAM conditions. I was originally hesitant to remove these messages, thinking that they may have contained some potentially useful information, but as we are going to be forcing models to always fully load, these messages are now less important. There will still be a message which indicates if the 'full' load did not occur, so that will do for now.
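A minimal sketch of the idea, using the standard loguru intercept recipe; the list of message fragments to drop is hypothetical (the real ones come from comfyui's low-VRAM load path):

```python
import logging
import sys

from loguru import logger


class InterceptHandler(logging.Handler):
    """Forward stdlib logging records (e.g. from comfyui) into loguru."""

    def emit(self, record: logging.LogRecord) -> None:
        logger.opt(depth=6, exception=record.exc_info).log(
            record.levelname, record.getMessage()
        )


# Assumed fragments of the noisy partial/low-VRAM load messages.
NOISY_FRAGMENTS = ("lowvram", "loaded partially")


def drop_comfy_noise(record) -> bool:
    return not any(fragment in record["message"] for fragment in NOISY_FRAGMENTS)


logging.basicConfig(handlers=[InterceptHandler()], level=0, force=True)
logger.remove()
logger.add(sys.stderr, filter=drop_comfy_noise)
```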
Prior to this change, comfyui was too likely to only partially load models to the GPU. This had **major** performance implications. The hard-coded comfyui numbers which underpin the logic don't consider the worker use case and are overly cautious when speed is the chief consideration. I'm sure this change has the potential to increase worker instability, but I have seen a 10%-15% increase in throughput on my 2080 when used via the worker, so I am going to at least try to get this to work. There are a number of techniques the worker can use (manually clearing memory or even killing a process) to manage this, so I believe it should be possible to make this viable.
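As a sketch of the general technique (not necessarily this PR's exact diff), one way to bias comfyui toward full loads is to wrap its loader. Note that the `force_full_load` keyword is an assumption about the comfyui revision in use:

```python
import comfy.model_management as mm

_original_load_models_gpu = mm.load_models_gpu


def _load_models_gpu_forced(*args, **kwargs):
    # Assumption: this comfyui revision accepts force_full_load; if it
    # does not, the partial-load thresholds would need patching instead.
    kwargs.setdefault("force_full_load", True)
    return _original_load_models_gpu(*args, **kwargs)


mm.load_models_gpu = _load_models_gpu_forced
```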
I am hoping to one day implement a graceful way to interrupt generations in comfyui, such as from a user-initiated abort. I suspect that exposing this function will get us some percent of the way to accomplishing that goal. See also Haidra-Org/horde-worker-reGen#262.
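A minimal sketch of what exposing the function could look like (the wrapper method on `HordeLib` is illustrative; `interrupt_current_processing` itself is comfyui's):

```python
import comfy.model_management


class HordeLib:
    def interrupt_current_processing(self) -> None:
        """Ask comfyui to abort the in-flight generation at its next check."""
        comfy.model_management.interrupt_current_processing(True)
```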
After what feels like an eternity of troubleshooting, I have determined the following: ComfyUI's default mode of running (`main.py`) includes calls to `cleanup_models(...)` and `soft_empty_cache(...)` on a timer. I suspect issues that have arisen lately are rooted in the fact that horde-engine does not currently do something similar. I am adding this (default on) option to the `HordeLib` class to call these after every pipeline run.
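A sketch of the (default on) option, mirroring the timer-driven calls in comfyui's `main.py`; the flag name below is hypothetical:

```python
import comfy.model_management


class HordeLib:
    def __init__(self, aggressive_unloading: bool = True):
        # Hypothetical name for the default-on cleanup option.
        self.aggressive_unloading = aggressive_unloading

    def _post_run_cleanup(self) -> None:
        """Run after every pipeline run, like main.py's periodic cleanup."""
        if self.aggressive_unloading:
            comfy.model_management.cleanup_models()
            comfy.model_management.soft_empty_cache()
```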
Also note that the dynamics discussed above do not apply to horde-engine versions 2.13.x. Recent changes to ComfyUI have introduced these (and numerous other) problems.
This is a patently "temporary" fix to prevent extremely large models, such as cascade models, from being forced to fully load. I will track the resolution of this in an issue: #312
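Illustratively, the exemption could be a size cutoff like the sketch below; the threshold value and the `model_size()` lookup are assumptions, not this PR's exact code:

```python
# Assumed cutoff: models larger than this are not forced to fully load.
FORCE_FULL_LOAD_MAX_BYTES = 9 * 1024**3


def should_force_full_load(model_patcher) -> bool:
    # Treat anything above the cutoff (e.g. a cascade model) as too
    # large to force onto the GPU in one piece.
    return model_patcher.model_size() <= FORCE_FULL_LOAD_MAX_BYTES
```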
Nice one. Can't wait to see how it works.
This is an attempt to get the tests to run on high-VRAM cards.