Commit

🔥[SparseInfer] SparseInfer: Training-free Prediction of Activation Sparsity for Fast LLM Inference (#100)
DefTruth authored Nov 25, 2024
1 parent 01a14af commit 40292d7
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -134,6 +134,7 @@ Awesome-LLM-Inference: A curated list of [📙Awesome LLM Inference Papers with
|2024.07|🔥[DynamoLLM] DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency(@Microsoft Azure Research)| [[pdf]](https://arxiv.org/pdf/2408.00741)|⚠️|⭐️ |
|2024.08|🔥[NanoFlow] NanoFlow: Towards Optimal Large Language Model Serving Throughput(@University of Washington)| [[pdf]](https://arxiv.org/pdf/2408.12757)|[[Nanoflow]](https://github.com/efeslab/Nanoflow) ![](https://img.shields.io/github/stars/efeslab/Nanoflow.svg?style=social)|⭐️⭐️ |
|2024.08|🔥[**Decentralized LLM**] Decentralized LLM Inference over Edge Networks with Energy Harvesting(@Padova)| [[pdf]](https://arxiv.org/pdf/2408.15907)|⚠️|⭐️ |
|2024.11| 🔥[**SparseInfer**] SparseInfer: Training-free Prediction of Activation Sparsity for Fast LLM Inference(@University of Seoul, etc)|[[pdf]](https://arxiv.org/pdf/2411.12692)|⚠️|⭐️ |

### 📖Continuous/In-flight Batching ([©️back👆🏻](#paperlist))
<div id="Continuous-In-flight-Batching"></div>
