
v2.6.6

@DefTruth released this on 25 Nov 03:22 · 40292d7

What's Changed

  • Add code link to BPT by @DefTruth in #95
  • Add vAttention code link by @KevinZeng08 in #96
  • 🔥[SageAttention] SageAttention: Accurate 8-Bit Attention for Plug-and-Play Inference Acceleration (@thu-ml) by @DefTruth in #97
  • 🔥[SageAttention-2] SageAttention2 Technical Report: Accurate 4-Bit Attention for Plug-and-Play Inference Acceleration (@thu-ml) by @DefTruth in #98
  • 🔥[Squeezed Attention] Squeezed Attention: Accelerating Long Context Length LLM Inference (@UC Berkeley) by @DefTruth in #99
  • 🔥[SparseInfer] SparseInfer: Training-free Prediction of Activation Sparsity for Fast LLM Inference by @DefTruth in #100

New Contributors

  • @KevinZeng08 made their first contribution in #96

Full Changelog: v2.6.5...v2.6.6