Popular repositories
- llm-compressor (Python; forked from vllm-project/llm-compressor): Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM.
- xoscar (Python; forked from xorbitsai/xoscar): Python actor framework for heterogeneous computing.
- inference (Python; forked from xorbitsai/inference): Replace OpenAI GPT with another LLM in your app by changing a single line of code (a minimal sketch of this follows the list). Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with …
- mmengine (Python; forked from open-mmlab/mmengine): OpenMMLab Foundational Library for Training Deep Learning Models.
- GPTQModel (Python; forked from ModelCloud/GPTQModel): GPTQ-based LLM model compression/quantization toolkit with accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
- vllm (Python; forked from vllm-project/vllm): A high-throughput and memory-efficient inference and serving engine for LLMs.
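
The inference (Xinference) entry above describes swapping OpenAI GPT for another LLM with a one-line change. A minimal sketch of what that typically looks like, assuming a local Xinference server exposing an OpenAI-compatible endpoint at its default address (http://localhost:9997/v1) and an already-launched chat model named "qwen2.5-instruct"; both the address and the model name are illustrative assumptions, not taken from the listing:

```python
# Minimal sketch: point the standard OpenAI Python client at a local Xinference
# server instead of api.openai.com. Assumes `pip install openai` and that an
# Xinference server is already running locally with a chat model launched under
# the name "qwen2.5-instruct" (hypothetical name used for illustration).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9997/v1",  # the "single line" change: local endpoint (assumed default port)
    api_key="not-needed-for-local",       # placeholder; a local server typically ignores the key
)

response = client.chat.completions.create(
    model="qwen2.5-instruct",  # model name as launched in Xinference (assumption)
    messages=[{"role": "user", "content": "Summarize what GPTQ quantization does."}],
)
print(response.choices[0].message.content)
```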