    Repositories list

    • General information, model certifications, and benchmarks for nm-vllm enterprise distributions
      1710 Updated Nov 28, 2024
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      4.7k6016 Updated Nov 27, 2024
    • A safetensors extension to efficiently store sparse quantized tensors on disk
      Python
      Apache License 2.0
      251111 Updated Nov 27, 2024
    • A framework for few-shot evaluation of language models.
      Python
      MIT License
      1.9k201 Updated Nov 27, 2024
    • 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
      Python
      Apache License 2.0
      27k100 Updated Nov 26, 2024
    • docs

      Public
      Top-level directory for documentation and general content
      MDX
      712004 Updated Nov 25, 2024
    • Fast and memory-efficient exact attention
      C++
      BSD 3-Clause "New" or "Revised" License
      1.4k000 Updated Nov 23, 2024
    • Python
      4000 Updated Nov 21, 2024
    • evalplus

      Public
      NeuralMagic fork of EvalPlus (Rigorous evaluation of LLM-synthesized code - NeurIPS 2023)
      Python
      Apache License 2.0
      110000 Updated Nov 21, 2024
    • Neural Magic GHA
      Python
      Apache License 2.0
      0003 Updated Nov 15, 2024
    • graphs

      Public
      Apache License 2.0
      0000 Updated Nov 15, 2024
    • Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
      Python
      Apache License 2.0
      63000 Updated Nov 14, 2024
    • An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
      Jupyter Notebook
      Apache License 2.0
      245000 Updated Nov 12, 2024
    • guidellm

      Public
      Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
      Python
      Apache License 2.0
      12168109 Updated Nov 5, 2024
    • yolov5

      Public
      YOLOv5 in PyTorch > ONNX > CoreML > TFLite
      Python
      GNU General Public License v3.0
      16k2002 Updated Oct 31, 2024
    • LLM training code for MosaicML foundation models
      Python
      Apache License 2.0
      531000 Updated Oct 24, 2024
    • nm-vllm

      Public archive
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Other
      4.7k25300 Updated Oct 11, 2024
    • mteb

      Public
      MTEB: Massive Text Embedding Benchmark
      Jupyter Notebook
      Apache License 2.0
      276001 Updated Oct 2, 2024
    • 🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
      Python
      Apache License 2.0
      27k9013 Updated Oct 1, 2024
    • AutoFP8

      Public
      Python
      Apache License 2.0
      20158103 Updated Oct 1, 2024
    • OmniQuant

      Public
      [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
      Python
      MIT License
      56001 Updated Sep 27, 2024
    • Benchmarking code for running quantized kernels from vLLM and other libraries
      Python
      0100 Updated Sep 24, 2024
    • An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
      Python
      MIT License
      488000 Updated Sep 16, 2024
    • Supercharge Your Model Training
      Python
      Apache License 2.0
      421000 Updated Aug 27, 2024
    • MixEval

      Public
      NM fork of MixEval compatible with SparseAutoModel.
      Python
      36001 Updated Aug 20, 2024
    • mamba

      Public
      Mamba SSM architecture
      Python
      Apache License 2.0
      1.1k000 Updated Aug 12, 2024
    • Causal depthwise conv1d in CUDA, with a PyTorch interface
      Cuda
      BSD 3-Clause "New" or "Revised" License
      61000 Updated Aug 8, 2024
    • sparseml

      Public
      Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
      Python
      Apache License 2.0
      1482.1k760 Updated Aug 1, 2024
    • inference

      Public
      Reference implementations of MLPerf™ inference benchmarks
      Python
      Apache License 2.0
      536101 Updated Jul 24, 2024
    • examples

      Public
      Notebooks using the Neural Magic libraries 📓
      Jupyter Notebook
      74103 Updated Jul 24, 2024