A curated collection of resources, tools, frameworks, and information related to Generative AI

onebirdrocks/Awesome-GenAI

Awesome-GenAI

Welcome to Awesome-GenAI! This repository is a curated collection of resources, tools, frameworks, and information related to Generative AI. Whether you are a beginner looking to learn about the basics or an experienced developer searching for the latest advancements in the field, this repository aims to provide valuable insights and resources to help you on your journey.

If you want to contribute to this list (please do), send me a pull request or contact me. A listed repository should be deprecated if:

  • The repository's owner explicitly states that the library is not maintained.
  • It has had no commits for a long time (2 to 3 years).
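
The two deprecation criteria above can be sketched as a small check. This is only an illustration, not tooling that the repository ships: the function name `should_deprecate`, its parameters, and the choice of two years as the cutoff (the lower bound of the stated range) are all assumptions.

```python
from datetime import date, timedelta
from typing import Optional

# Assumed cutoff: the lower bound of the "2 to 3 years" guideline.
STALE_AFTER = timedelta(days=2 * 365)


def should_deprecate(
    last_commit: date,
    declared_unmaintained: bool,
    today: Optional[date] = None,
) -> bool:
    """Return True if a listed repository meets either deprecation criterion."""
    today = today or date.today()
    # Criterion 1: the owner explicitly says the project is not maintained.
    if declared_unmaintained:
        return True
    # Criterion 2: no commits for a long time.
    return (today - last_commit) > STALE_AFTER
```

A maintainer could run such a check over the list periodically, feeding in each project's last commit date, and review the flagged entries by hand before marking them deprecated.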

What is Generative AI?

Generative AI refers to a class of AI algorithms that generate new content, such as text, images, and audio, based on the data they are trained on. These models can create realistic and innovative outputs, making them useful in various applications like content creation, design, and entertainment.

[Diagram: What is GenAI]

This diagram illustrates the hierarchical relationship between AI, Machine Learning, Deep Learning, GenAI, and LLM:

  • AI encompasses all technologies that simulate human intelligence.
  • Machine Learning is a subset of AI, emphasizing learning from data and algorithms.
  • Deep Learning is a subset of Machine Learning, utilizing multi-layer neural networks to handle complex data.
  • GenAI (Generative AI) is an application of Deep Learning, focusing on generating new data.
  • LLMs (Large Language Models) are a branch of GenAI: large-scale neural networks that generate and understand natural language text.
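
The nesting described in the list above can be written down as an ordered chain, from the broadest field to the most specific. This is a minimal sketch for illustration; the names `HIERARCHY`, `is_within`, and `ancestors` are hypothetical and do not come from the repository.

```python
from typing import List

# Ordered from the broadest field to the most specific, as in the list above.
HIERARCHY: List[str] = ["AI", "Machine Learning", "Deep Learning", "GenAI", "LLM"]


def is_within(narrow: str, broad: str) -> bool:
    """Return True if `narrow` sits at or below `broad` in the hierarchy."""
    return HIERARCHY.index(narrow) >= HIERARCHY.index(broad)


def ancestors(field: str) -> List[str]:
    """Return every broader field that contains `field`, broadest first."""
    return HIERARCHY[: HIERARCHY.index(field)]
```

For example, `ancestors("GenAI")` lists AI, Machine Learning, and Deep Learning, which matches the reading that every GenAI system is also a Deep Learning system, but not the other way around.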

Table of Contents

  • Introduction
    • Overview of the project
    • Basic concepts of Generative AI
    • Contribution guidelines
  • Learning Resources
    • Online courses
    • Books and papers
    • Tutorials
    • Blogs
    • Workshops and conferences
  • Tools and Frameworks
    • Development Frameworks
    • Open-source projects
  • Models
    • Pre-trained models
    • Natural Language Processing models
    • Computer Vision models
    • Multimodal models
  • Applications
    • Text generation
    • Image generation
    • Audio generation
    • Video generation
    • Other applications
  • Datasets
    • Available datasets
    • Data collection and processing methods
    • Data augmentation techniques
  • Research and Papers
    • Latest research updates
    • Important paper reviews
    • Research trends and hotspots
  • Community and Events
    • Online and offline communities
    • Events and conferences
  • Projects and Case Studies
    • Success stories
    • Project showcases
    • Practical experience sharing
  • Miscellaneous
    • Paid tutorials
    • Paid services

Learning Resources

Paper

Tutorials

Blogs

Books

  • Build a Large Language Model (From Scratch) - [Free Chapters]
  • Generative AI in Action - [Free Chapters]

Models

Open Models

Benchmark

  • llm-colosseum - Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM.

Tools & Frameworks

ML

  • PyTorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration.
  • TensorFlow - An Open Source Machine Learning Framework for Everyone.
  • MLX - An array framework for Apple silicon.

Development Frameworks

  • LangChain - Build context-aware reasoning applications.
  • LlamaIndex - LlamaIndex is a data framework for your LLM applications.
  • Flowise - Drag & drop UI to build your customized LLM flow.
  • AutoGen - A programming framework for agentic AI.
  • Auto-GPT - AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
  • AgentGPT - Assemble, configure, and deploy autonomous AI Agents in your browser.
  • dify - Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
  • DB-GPT - AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents.
  • AutoDev - AutoDev: The AI-powered coding wizard with multilingual support 🌐, auto code generation 🏗️, and a helpful bug-slaying assistant! Customizable prompts and a magic Auto Dev/Testing/Document/Agent feature included!
  • AgentKit - Starter-kit to build constrained agents with Nextjs, FastAPI and Langchain.
  • GraphRAG - A modular graph-based Retrieval-Augmented Generation (RAG) system.

Open-source projects

TTS

  • Whisper - Robust Speech Recognition via Large-Scale Weak Supervision.
  • Whisper Streaming - Whisper realtime streaming for long speech-to-text transcription and translation.
  • Faster Whisper - Faster Whisper transcription with CTranslate2.
  • OpenVoice - A versatile instant voice cloning approach that requires only a short audio clip from the reference speaker to replicate their voice and generate speech in multiple languages.
  • ChatTTS - A generative speech model for daily dialogue.
  • Coqui TTS - A deep learning toolkit for Text-to-Speech, battle-tested in research and production.
  • Coqui STT Models - Open models for Coqui STT.
  • RealtimeTTS - https://github.com/KoljaB/RealtimeTTS.
  • MockingBird - Clone a voice in 5 seconds to generate arbitrary speech in real-time.
  • GPT-SoVITS - 1 min of voice data can also be used to train a good TTS model (few-shot voice cloning).
  • EmotiVoice - https://github.com/netease-youdao/EmotiVoice.
  • NeMo - A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech).
  • Vits - Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech.
  • tacotron - A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial).
  • tacotron2 - PyTorch implementation with faster-than-realtime inference.
  • FastSpeech - An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech".
  • VALL-E-X - An open source implementation of Microsoft's VALL-E X zero-shot TTS model.
  • SenseVoice - Multilingual Voice Understanding Model.
  • CosyVoice - Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Clothing (Virtual Try-On)

  • IDM-VTON - IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild.
  • MagicClothing - Official implementation of Magic Clothing: Controllable Garment-Driven Image Synthesis.
  • StableVITON - [CVPR2024] StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On.
  • HR Viton - Official PyTorch implementation for the paper High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions (ECCV 2022).
  • Dressing in order - (ICCV'21) Official code of "Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-on and Outfit Editing" by Aiyu Cui, Daniel McKee and Svetlana Lazebnik.
  • Dress Code - Dress Code: High-Resolution Multi-Category Virtual Try-On. ECCV 2022.

Agent

  • BabyAGI - Python script that acts as an AI-powered task manager.
  • SWE-agent - SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It solves 12.47% of bugs in the SWE-bench evaluation set and takes just 1 minute to run.

Virtual Human

  • MuseV - MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising, by Tencent.

Text2SQL

  • Vanna - Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG.
  • SQLCoder - SoTA LLM for converting natural language questions to SQL queries.
  • SQLChat - Chat-based SQL Client and Editor for the next decade.
  • Dataherald - Interact with your SQL database, Natural Language to SQL using LLMs.
  • WrenAI - Wren AI makes your database RAG-ready. Implement Text-to-SQL more accurately and securely.

Deep Fake

  • SadTalker - SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation.
  • Facefusion - Next generation face swapper and enhancer.
  • Ghost - A new one shot face swap approach for image and video domains.
  • VideoReTalking - [SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild.
  • Fay - Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guides, broadcasters, assistants, waiters, teachers, and voice or text-based mobile assistants.
