-
Yup, TensorRT support is on the radar for sure.
-
First, let me say how much I love these coordinated updates and the clear setting of what should be prioritized next. This is the hallmark of a professional developer who has experience and knows how to communicate. When I first installed this fork it was like magic: everything just worked, no bugs, and even an increase in it/s after optimizing the settings. Legendary.
I understand that the (probably tedious) TensorRT integration requires models to be re-compiled on the local GPU. I've read about the problems 8GB GPUs have with compiling a model for themselves: there isn't enough VRAM for the compilation process, so that could be a blocker. I don't know whether part of that work could be delegated to system RAM / the CPU, as I'm not a professional developer.
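From what I've read, the main knob here is the builder's workspace memory pool: capping it lowers peak VRAM use during engine compilation, at the cost of ruling out some optimization tactics. A minimal sketch of what building an engine with a capped workspace could look like, assuming an ONNX export of the model at `unet.onnx` (the path and the 3 GB limit are placeholders, not the fork's actual integration):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path: str, engine_path: str, workspace_gb: int = 3) -> None:
    """Compile an ONNX model into a serialized TensorRT engine with a capped workspace."""
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, TRT_LOGGER)
    if not parser.parse_from_file(onnx_path):
        raise RuntimeError(f"Failed to parse {onnx_path}")

    config = builder.create_builder_config()
    # Limit build-time scratch memory; lower values reduce peak VRAM use
    # during compilation but may exclude some faster kernels.
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace_gb << 30)
    config.set_flag(trt.BuilderFlag.FP16)  # half precision, typical for diffusion UNets

    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("Engine build failed (possibly out of memory)")
    with open(engine_path, "wb") as f:
        f.write(bytes(serialized))

# Hypothetical usage:
# build_engine("unet.onnx", "unet.trt", workspace_gb=3)
```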
Another solution would be a TensorRT model database, where people with the same chipset can upload compiled models for others to use with TensorRT. The same chipset does not necessarily mean the same amount of VRAM: a 24GB card could pre-compile models for cards with 16GB or less. I haven't researched the specifics of chip revisions, CUDA versions, etc., but I think this would be a possible solution.
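As far as I understand, a serialized TensorRT engine is only guaranteed to load on GPUs with the same compute capability, built with the same TensorRT version, so a shared database would need to tag uploads accordingly. A rough sketch of what such a compatibility key might look like (the key format is just an assumption, not an existing scheme):

```python
import tensorrt as trt
import torch

def engine_compatibility_key() -> str:
    """Build a tag describing which pre-built engines this GPU can load.

    Serialized engines are tied to the GPU's compute capability (SM version)
    and to the exact TensorRT version, so both go into the key.
    """
    major, minor = torch.cuda.get_device_capability()  # e.g. (8, 6) for RTX 30xx
    return f"sm{major}{minor}-trt{trt.__version__}"

# Hypothetical usage: only offer downloads whose key matches the local one.
# print(engine_compatibility_key())  # e.g. "sm86-trt8.6.1"
```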
The gain in generation speed is too large to ignore, in my opinion. Turning a 2070 into a 3090 by applying these optimizations is too big a change to brush off. If this is somehow possible, I would readily donate regularly to this project. I already wanted to donate, but I'm saving up for a 4090 right now, so that's kind of a conflict of interest lol.