v0.1.6
What's Changed
- Pseudo dequantize function by @casper-hansen in #127 (a sketch of the technique follows this list)
- CUDA 11.8.0 and 12.1.1 build by @casper-hansen in #128
- AwqConfig class by @casper-hansen in #132 (usage sketch below)
- Fix init quant by @casper-hansen in #136
- Update readme by @casper-hansen in #137
- Benchmark info by @casper-hansen in #138
- Bump to v0.1.6 by @casper-hansen in #139
- CUDA 12 release by @casper-hansen in #140
- Revert to previous version by @casper-hansen in #141
- Fix performance regression by @casper-hansen in #148
- [core/attention] Fix fused attention generation with newest transformers version by @younesbelkada in #146
- Fix condition when rolling cache by @casper-hansen in #150
- Default to safetensors for quantized models by @casper-hansen in #151 (example below)
- Create fused LlamaLikeModel by @casper-hansen in #152
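For context on #127: pseudo-dequantization reconstructs approximate fp16 weights from quantized values so that quantization error can be inspected without running the fused CUDA kernels. The sketch below is illustrative only and is not AutoAWQ's actual function; the name `pseudo_dequantize` and the tensor layout are assumptions.

```python
import torch

def pseudo_dequantize(qweight: torch.Tensor,
                      scales: torch.Tensor,
                      zeros: torch.Tensor,
                      group_size: int = 128) -> torch.Tensor:
    """Hypothetical sketch: recover fp16 weights as (q - z) * s.

    qweight: (out_features, in_features) integers in [0, 2**w_bit)
    scales/zeros: (out_features, in_features // group_size) per-group params
    """
    # Broadcast each group's scale and zero point across its group_size columns.
    s = scales.repeat_interleave(group_size, dim=1)
    z = zeros.repeat_interleave(group_size, dim=1)
    return ((qweight.float() - z.float()) * s.float()).half()
```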
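#132 moves quantization settings into an AwqConfig class. A minimal quantization flow using the documented dict-style quant_config, which carries the same fields (zero_point, q_group_size, w_bit, version), looks like the following; the model path is a placeholder:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "lmsys/vicuna-7b-v1.5"  # placeholder checkpoint
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Quantize with AWQ, then save the quantized weights and tokenizer.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized("vicuna-7b-awq")
tokenizer.save_pretrained("vicuna-7b-awq")
```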
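#151 makes safetensors the default serialization format for quantized checkpoints. For reference, a round trip through the safetensors library's torch API looks like this (the toy state dict stands in for real quantized weights):

```python
import torch
from safetensors.torch import save_file, load_file

# Toy state dict standing in for a quantized model's weights.
state_dict = {"linear.weight": torch.randn(4, 4)}
save_file(state_dict, "model.safetensors")
restored = load_file("model.safetensors")
assert torch.equal(state_dict["linear.weight"], restored["linear.weight"])
```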
Full Changelog: v0.1.5...v0.1.6