
Drop GPTQ support #889

Merged 2 commits into main on Jan 18, 2024

Conversation

carmocca
Contributor

@carmocca carmocca commented Jan 18, 2024

Closes #582
Closes #583

GPTQ is inference only, requires a conversion step, and the implementation we use is much slower than bitsandbytes. The only upside is that it uses less memory at inference time.

There's a lot of research happening around inference quantization and having this implementation in the repo is not worth it anymore.
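For context on the trade-off mentioned above: both GPTQ and bitsandbytes save inference memory by storing weights in low precision plus a scale factor. A minimal absmax int8 round-trip illustrates the core idea (a generic sketch, not this repo's or either library's actual code):

```python
# Illustrative absmax int8 weight quantization: store weights as int8
# plus one float scale, recovering approximate floats on dequantize.
# This is the generic technique, not lit-gpt's or bitsandbytes' code.

def quantize_absmax(weights):
    """Map floats to the int8 range [-127, 127] using the absolute maximum."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard all-zero input
    q = [round(w / scale) for w in weights]            # int8-range integers
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

w = [0.4, -1.2, 0.05, 0.9]
q, s = quantize_absmax(w)
w_hat = dequantize(q, s)
# Round-trip error is bounded by half the quantization step (scale / 2).
print(q, max(abs(a - b) for a, b in zip(w, w_hat)))
```

The memory saving comes from keeping only the int8 values (1 byte each) plus the scale; the cost is the rounding error, which 4-bit schemes trade off even more aggressively.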

@carmocca carmocca self-assigned this Jan 18, 2024
@Andrei-Aksionov
Collaborator

Sorry for procrastinating with my AutoGPTQ integration, but I've started working on it (again) and there should be something to look at sometime next week.
So the question is whether you are dropping GPTQ (and its variants) or just the current implementation?
If just the current implementation, then this PR doesn't close #583.

@carmocca
Contributor Author

carmocca commented Jan 18, 2024

Just the current implementation. However, as long as any new additions are not as useful and usable as the existing implementations (for now that's only bnb), we wouldn't be interested in adding them.

This PR should close #583. I suggest opening new issues proposing the addition of new techniques.

For instance https://github.com/IST-DASLab/marlin was released yesterday and includes its own GPTQ implementation. Perhaps AutoGPTQ is no longer the best alternative.

edit: Marlin support is being added to AutoGPTQ in AutoGPTQ/AutoGPTQ#514

@carmocca carmocca merged commit 49c7e07 into main Jan 18, 2024
9 checks passed
@carmocca carmocca deleted the carmocca/drop-gptq-support branch January 18, 2024 23:00
@Andrei-Aksionov
Collaborator

Great!
Then I'll continue with AutoGPTQ.

rasbt pushed a commit that referenced this pull request Mar 18, 2024
Successfully merging this pull request may close these issues:

- Replace our GPTQ implementation with something better
- gptq quantization fails with torch 2.2