    Repositories list

    • mmore

      Public
      Massive Multimodal Open RAG & Extraction. A scalable multimodal pipeline for processing, indexing, and querying multimodal documents. Ever needed to take 8000 PDFs, 2000 videos, and 500 spreadsheets and feed them to an LLM as a knowledge base? Well, MMORE is here to help you!
      Python
      Apache License 2.0
      420202 Updated Dec 25, 2024
    • llm-proxy

      Public
      LLM Serving and User Control
      Python
      2000 Updated Dec 21, 2024
    • lighteval

      Public
      Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
      Python
      MIT License
      109000 Updated Dec 19, 2024
    • Jupyter Notebook
      Apache License 2.0
      0000 Updated Dec 19, 2024
    • nanotron

      Public
      Minimalistic large language model 3D-parallelism training
      Python
      Apache License 2.0
      135719 Updated Dec 2, 2024
    • Containers for multimodal initiative (and maybe more across Swiss AI?)
      Dockerfile
      0000 Updated Nov 29, 2024
    • ml-4m

      Public
      4M: Massively Multimodal Masked Modeling (NeurIPS 2023 Spotlight)
      Python
      Apache License 2.0
      990134 Updated Nov 29, 2024
    • Tool set for data preparation and selection in the context of Swiss-AI (forked from DataTrove)
      Python
      Apache License 2.0
      158001 Updated Nov 21, 2024
    • Python
      Apache License 2.0
      0000 Updated Nov 6, 2024
    • A copy of nanotron for multilingual training
      Python
      Apache License 2.0
      135002 Updated Oct 23, 2024
    • Easily create large video datasets from video URLs
      Python
      MIT License
      67101 Updated Oct 14, 2024
    • ml-4m-v2

      Public
      0000 Updated Aug 5, 2024
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      5k000 Updated Jul 31, 2024
    • PDF pipeline for creating training corpora (mainly for LLM, multimodal and alignment horizontals)
      Python
      Apache License 2.0
      0000 Updated May 8, 2024
    • MoE

      Public
      some mixture of experts architecture implementations
      Python
      Apache License 2.0
      21210 Updated Mar 22, 2024
    • distributed trainer for LLMs
      Python
      Other
      78000 Updated Feb 8, 2024