feat: force full gpu load; feat: expose interrupt_current_processing from comfyui #311
Merged
Conversation
As with all cases of comments/docs, the details of the diatribe targeted at would-be developers have become somewhat inaccurate. However, the overall message is still roughly correct, as is the fact that problems would arise if its warnings are not heeded.
Member functions are not `FunctionType`s (which are 'bound' to modules) from a static typing perspective. This updates the types to use `Callable` and briefly explains their purposes, so a would-be dev doesn't need to dig into comfy to get the implementation details.
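As a rough illustration of the kind of change described here (the attribute names and signatures below are assumptions, not the PR's actual code), annotating the stashed comfyui functions as `Callable` might look like:

```python
# Hypothetical sketch: annotate stashed comfyui functions as Callable
# (with a short purpose note) instead of types.FunctionType.
from typing import Callable


class ComfyInternals:
    # Asks comfyui to abort the in-flight generation at its next check point.
    interrupt_current_processing: Callable[[bool], None]
    # Releases cached GPU memory when comfyui considers it safe to do so.
    soft_empty_cache: Callable[..., None]
```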
When I introduced `logging` redirection to loguru, I also introduced the many hundreds of irrelevant messages per minute to do with the comfyui internals of loading models in low VRAM conditions. I was originally hesitant to remove these messages, thinking that they may have contained some potentially useful information, but as we are going to be forcing models to always fully load, these messages are now less important. There will still be a message which indicates if the 'full' load did not occur, so that will do for now.
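A minimal sketch of the idea, using the standard loguru intercept recipe; the list of message fragments to drop is hypothetical (the real ones come from comfyui's low-VRAM load path):

```python
import logging
import sys

from loguru import logger


class InterceptHandler(logging.Handler):
    """Forward stdlib logging records (e.g. from comfyui) into loguru."""

    def emit(self, record: logging.LogRecord) -> None:
        logger.opt(depth=6, exception=record.exc_info).log(
            record.levelname, record.getMessage()
        )


# Assumed fragments of the noisy partial/low-VRAM load messages.
NOISY_FRAGMENTS = ("lowvram", "loaded partially")


def drop_comfy_noise(record) -> bool:
    return not any(fragment in record["message"] for fragment in NOISY_FRAGMENTS)


logging.basicConfig(handlers=[InterceptHandler()], level=0, force=True)
logger.remove()
logger.add(sys.stderr, filter=drop_comfy_noise)
```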
Prior to this change, comfyui was too likely to only partially load models to the GPU. This had **major** performance implications. The hard-coded comfyui numbers which underpin the logic don't consider the worker use case and are overly cautious when speed is the chief consideration. I'm sure this change has the potential to increase worker instability, but I have seen a 10%-15% increase in throughput on my 2080 when used via the worker, so I am going to at least try to get this to work. There are a number of techniques the worker can use (manually clearing memory or even killing a process) to manage this, so I believe it should be possible to make this viable.
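As a sketch of the general technique (not necessarily this PR's exact diff), one way to bias comfyui toward full loads is to wrap its loader. Note that the `force_full_load` keyword is an assumption about the comfyui revision in use:

```python
import comfy.model_management as mm

_original_load_models_gpu = mm.load_models_gpu


def _load_models_gpu_forced(*args, **kwargs):
    # Assumption: this comfyui revision accepts force_full_load; if it
    # does not, the partial-load thresholds would need patching instead.
    kwargs.setdefault("force_full_load", True)
    return _original_load_models_gpu(*args, **kwargs)


mm.load_models_gpu = _load_models_gpu_forced
```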
I am hoping to one day implement a graceful way to interrupt generations in comfyui, such as from a user-initiated abort. I suspect that exposing this function will get us some percent of the way to accomplishing that goal. See also Haidra-Org/horde-worker-reGen#262.
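A minimal sketch of what exposing the function could look like (the wrapper method on `HordeLib` is illustrative; `interrupt_current_processing` itself is comfyui's):

```python
import comfy.model_management


class HordeLib:
    def interrupt_current_processing(self) -> None:
        """Ask comfyui to abort the in-flight generation at its next check."""
        comfy.model_management.interrupt_current_processing(True)
```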
After what feels like an eternity of troubleshooting, I have determined the following: ComfyUI's default mode of running (`main.py`) includes calls to `cleanup_models(...)` and `soft_empty_cache(...)` on a timer. I suspect issues that have arisen lately are rooted in the fact that horde-engine does not currently do something similar. I am adding this (default on) option to the `HordeLib` class to call these after every pipeline run.
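A sketch of the (default on) option, mirroring the timer-driven calls in comfyui's `main.py`; the flag name below is hypothetical:

```python
import comfy.model_management


class HordeLib:
    def __init__(self, aggressive_unloading: bool = True):
        # Hypothetical name for the default-on cleanup option.
        self.aggressive_unloading = aggressive_unloading

    def _post_run_cleanup(self) -> None:
        """Run after every pipeline run, like main.py's periodic cleanup."""
        if self.aggressive_unloading:
            comfy.model_management.cleanup_models()
            comfy.model_management.soft_empty_cache()
```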
Also note that the dynamics discussed above do not apply to horde-engine versions 2.13.x. Recent changes to ComfyUI have introduced these (and numerous other) problems.
This is a patently "temporary" fix to prevent extremely large models, such as cascade models, from being forced to fully load. I will track the resolution of this in an issue: #312
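Illustratively, the exemption could be a size cutoff like the sketch below; the threshold value and the `model_size()` lookup are assumptions, not this PR's exact code:

```python
# Assumed cutoff: models larger than this are not forced to fully load.
FORCE_FULL_LOAD_MAX_BYTES = 9 * 1024**3


def should_force_full_load(model_patcher) -> bool:
    # Treat anything above the cutoff (e.g. a cascade model) as too
    # large to force onto the GPU in one piece.
    return model_patcher.model_size() <= FORCE_FULL_LOAD_MAX_BYTES
```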
Nice one. Can't wait to see how it works.
This is an attempt to get the tests to run on high-VRAM cards.