From e3e951148c14feb7f6dd782f7985350f9e2d86cc Mon Sep 17 00:00:00 2001
From: William Falcon
Date: Thu, 4 Apr 2024 19:36:49 -0400
Subject: [PATCH] Update README.md

---
 README.md | 169 ++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 113 insertions(+), 56 deletions(-)

diff --git a/README.md b/README.md
index c06e792578..eb7fdb3b5c 100644
--- a/README.md
+++ b/README.md
@@ -14,20 +14,108 @@
 ![PyPI - Python Version](https://img.shields.io/pypi/pyversions/pytorch-lightning)
 ![cpu-tests](https://github.com/lightning-AI/lit-stablelm/actions/workflows/cpu-tests.yml/badge.svg)
 [![license](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/Lightning-AI/lit-stablelm/blob/master/LICENSE)
 [![Discord](https://img.shields.io/discord/1077906959069626439?style=plastic)](https://discord.gg/VptPCZkGNa)
+<div align="center">
+<p align="center">
+  Lightning.ai •
+  Install •
+  Get started •
+  Models •
+  Features •
+  Templates •
+  Finetune •
+  Pretrain •
+  YAML configs •
+  Design principles •
+  Acknowledgements
+</p>
+</div>
+
 &nbsp;
 
-⚡ LitGPT is a hackable [implementation](litgpt/model.py) of state-of-the-art open-source large language models released under the **Apache 2.0 license**.
+⚡ Pretrain, finetune, and deploy over 20 LLMs using state-of-the-art techniques such as Flash Attention v2, 4-bit quantization, and LoRA.
+
+## Install LitGPT
+
+Install LitGPT with all dependencies (CLI, quantization, tokenizers for all models, and more):
+
+```bash
+pip install 'litgpt[all]'
+```
+
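+You can verify that the command-line tool was installed correctly by printing its help text (the exact output depends on the installed version):
+
+```bash
+litgpt --help
+```
+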
+<details>
+  <summary>Advanced install options</summary>
+
+Install from source:
+
+```bash
+git clone https://github.com/Lightning-AI/litgpt
+cd litgpt
+pip install -e '.[all]'
+```
+
+</details>
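+If you prefer a smaller dependency footprint, the base package can also be installed without the extras (note that some models and features may then require additional packages):
+
+```bash
+pip install litgpt
+```
+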
+
+&nbsp;
+
+## Get started
+
+LitGPT is a command-line tool. Here's an example that downloads Mistral 7B Instruct and chats with it:
+
+```bash
+# 1) Download a pretrained model
+litgpt download --repo_id mistralai/Mistral-7B-Instruct-v0.2
+
+# 2) Chat with the model
+litgpt chat \
+  --checkpoint_dir checkpoints/mistralai/Mistral-7B-Instruct-v0.2
+
+>> Prompt: What do Llamas eat?
+```
+
+For more information, refer to the [download](tutorials/download_model_weights.md) and [inference](tutorials/inference.md) tutorials.
+
 &nbsp;
 
-## LitGPT supports
-✅  [The latest model weights](tutorials/download_model_weights.md): Gemma, Mistral, Mixtral, Phi 2, Llama 2, Falcon, CodeLlama, and [many more](tutorials/download_model_weights.md).
-
-✅  Optimized and efficient code: Flash Attention v2, multi-GPU support via fully-sharded data parallelism, [optional CPU offloading](tutorials/oom.md#do-sharding-across-multiple-gpus), and [TPU and XLA support](extensions/xla).
-
-✅  [Pretraining](tutorials/pretrain.md), [finetuning](tutorials/finetune.md), and [inference](tutorials/inference.md) in various precision settings: FP32, FP16, BF16, and FP16/FP32 mixed.
+> [!NOTE]
+> If you are new to LitGPT, we recommend starting with the **[Zero to LitGPT: Getting Started with Pretraining, Finetuning, and Using LLMs](tutorials/0_to_litgpt.md)** tutorial.
+
+&nbsp;
+
+## Choose from 20 LLMs
+
+✅  View the [full list](tutorials/download_model_weights.md).
+
+| Model                                        | Model size                               | Reference                                                                                                                    |
+|----------------------------------------------|------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
+| Code Llama by Meta AI                        | 7B, 13B, 34B, 70B                        | [Rozière et al. 2023](https://arxiv.org/abs/2308.12950)                                                                      |
+| Dolly by Databricks                          | 3B, 7B, 12B                              | [Conover et al. 2023](https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm) |
+| Falcon by TII UAE                            | 7B, 40B, 180B                            | [TII 2023](https://falconllm.tii.ae)                                                                                         |
+| FreeWilly2 (Stable Beluga 2) by Stability AI | 70B                                      | [Stability AI 2023](https://stability.ai/blog/stable-beluga-large-instruction-fine-tuned-models)                             |
+| Function Calling Llama 2 by Trelis           | 7B                                       | [Trelis et al. 2023](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2)                                   |
+| Gemma by Google                              | 2B, 7B                                   | [Google Team, Google DeepMind](https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf)                         |
+| Llama 2 by Meta AI                           | 7B, 13B, 70B                             | [Touvron et al. 2023](https://arxiv.org/abs/2307.09288)                                                                      |
+| LongChat by LMSYS                            | 7B, 13B                                  | [LongChat Team 2023](https://lmsys.org/blog/2023-06-29-longchat/)                                                            |
+| Mistral and Mixtral by Mistral AI            | 7B, 8x7B                                 | [Mistral website](https://mistral.ai/)                                                                                       |
+| Nous-Hermes by NousResearch                  | 7B, 13B, 70B                             | [Org page](https://huggingface.co/NousResearch)                                                                              |
+| OpenLLaMA by OpenLM Research                 | 3B, 7B, 13B                              | [Geng & Liu 2023](https://github.com/openlm-research/open_llama)                                                             |
+| Phi by Microsoft Research                    | 1.3B, 2.7B                               | [Li et al. 2023](https://arxiv.org/abs/2309.05463)                                                                           |
+| Platypus by Lee et al.                       | 7B, 13B, 70B                             | [Lee, Hunter, and Ruiz 2023](https://arxiv.org/abs/2308.07317)                                                               |
+| Pythia by EleutherAI                         | {14,31,70,160,410}M, {1,1.4,2.8,6.9,12}B | [Biderman et al. 2023](https://arxiv.org/abs/2304.01373)                                                                     |
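+All models in the table above are used through the same CLI commands shown in this README. For a first experiment, a small checkpoint such as Pythia 160M keeps the download quick (the model choice here is only an illustration):
+
+```bash
+litgpt download --repo_id EleutherAI/pythia-160m
+```
+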
+| RedPajama-INCITE by Together                 | 3B, 7B                                   | [Together 2023](https://together.ai/blog/redpajama-models-v1)                                                                |
+| StableCode by Stability AI                   | 3B                                       | [Stability AI 2023](https://stability.ai/blog/stablecode-llm-generative-ai-coding)                                           |
+| StableLM by Stability AI                     | 3B, 7B                                   | [Stability AI 2023](https://github.com/Stability-AI/StableLM)                                                                |
+| StableLM Zephyr by Stability AI              | 3B                                       | [Stability AI 2023](https://stability.ai/blog/stablecode-llm-generative-ai-coding)                                           |
+| TinyLlama by Zhang et al.                    | 1.1B                                     | [Zhang et al. 2023](https://github.com/jzhang38/TinyLlama)                                                                   |
+| Vicuna by LMSYS                              | 7B, 13B, 33B                             | [Li et al. 2023](https://lmsys.org/blog/2023-03-30-vicuna/)                                                                  |
+
+&nbsp;
+
+## State-of-the-art features
+
+✅  State-of-the-art optimizations: Flash Attention v2, multi-GPU support via fully-sharded data parallelism, [optional CPU offloading](tutorials/oom.md#do-sharding-across-multiple-gpus), and [TPU and XLA support](extensions/xla).
+
+✅  [Pretrain](tutorials/pretrain.md), [finetune](tutorials/finetune.md), and [deploy](tutorials/inference.md) models.
+
+✅  Various precision settings: FP32, FP16, BF16, and FP16/FP32 mixed.
 
 ✅  [Configuration files](config_hub) for great out-of-the-box performance.
@@ -62,88 +150,70 @@ The following [Lightning Studio](https://lightning.ai/lightning-ai/studios) temp
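+LitGPT's optimizations are driven by CLI flags. For example, the LoRA finetuning run below combines 4-bit quantization (bnb.nf4) with bf16 precision; treat this as a sketch, since available flags and defaults can vary between versions:
+
+```bash
+litgpt finetune lora \
+  --checkpoint_dir checkpoints/microsoft/phi-2 \
+  --quantize bnb.nf4 \
+  --precision bf16-true
+```
+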
 &nbsp;
 
-
-
-## Installing LitGPT
-
-You can install LitGPT with all dependencies (including CLI, quantization, tokenizers for all models, etc.) using the following pip command:
-
-```bash
-pip install 'litgpt[all]'
-```
-
-Alternatively, can install litgpt from a cloned GitHub repository:
-
-```bash
-git clone https://github.com/Lightning-AI/litgpt
-cd litgpt
-pip install -e '.[all]'
-```
-
-&nbsp;
-
-## Using LitGPT
-
-
-Below is a minimal example to get started with the LitGPT command line interface (CLI), illustrating how to download and use a model:
+## Finetune
+
+[Finetune](tutorials/finetune.md) a model to specialize it on your own data. Here's an example that finetunes phi-2 with LoRA:
 
 ```bash
 # 1) Download a pretrained model
-litgpt download --repo_id mistralai/Mistral-7B-Instruct-v0.2
+litgpt download --repo_id microsoft/phi-2
 
-# 2) Chat with the model
-litgpt chat \
-  --checkpoint_dir checkpoints/mistralai/Mistral-7B-Instruct-v0.2
+# 2) Finetune the model
+litgpt finetune lora \
+  --checkpoint_dir checkpoints/microsoft/phi-2 \
+  --data Alpaca2k \
+  --out_dir out/phi-2-lora
 
->> Prompt: What do Llamas eat?
+# 3) Chat with the model
+litgpt chat \
+  --checkpoint_dir out/phi-2-lora/final
 ```
 
-For more information, refer to the [download](tutorials/download_model_weights.md) and [inference](tutorials/inference.md) tutorials.
-
-&nbsp;
-
-> [!NOTE]
-> We recommend starting with the **[Zero to LitGPT: Getting Started with Pretraining, Finetuning, and Using LLMs](tutorials/0_to_litgpt.md)** if you are looking to get started with using LitGPT.
-
-&nbsp;
-
-## Finetuning and pretraining
-
-LitGPT supports [pretraining](tutorials/pretrain.md) and [finetuning](tutorials/finetune.md) to optimize models on excisting or custom datasets. Below is an example showing how to finetune a model with LoRA:
+&nbsp;
+
+## Pretrain
+
+Train an LLM from scratch by [pretraining](tutorials/pretrain.md) on your own data. Here's an example that pretrains a small Pythia model (see the [pretraining tutorial](tutorials/pretrain.md) for the available data options):
 
 ```bash
-# 1) Download a pretrained model
-litgpt download --repo_id microsoft/phi-2
+# 1) Download a tokenizer
+litgpt download --repo_id EleutherAI/pythia-160m --tokenizer_only true
 
-# 2) Finetune the model
-litgpt finetune lora \
-  --checkpoint_dir checkpoints/microsoft/phi-2 \
-  --data Alpaca2k \
-  --out_dir out/phi-2-lora
+# 2) Pretrain the model
+litgpt pretrain \
+  --model_name pythia-160m \
+  --tokenizer_dir checkpoints/EleutherAI/pythia-160m \
+  --data OpenWebText \
+  --out_dir out/custom-pythia
 
 # 3) Chat with the model
 litgpt chat \
-  --checkpoint_dir out/phi-2-lora/final
+  --checkpoint_dir out/custom-pythia/final
 ```
 
 &nbsp;
 
-## Configuration files for enhanced performance
-
-LitGPT also allows users to use configuration files in YAML format instead of specifying settings via the command line interface and comes with a set of model-specific defaults for good out-of-the-box performance:
+## Use optimized configurations
+
+Instead of passing every setting on the command line, you can point LitGPT at a YAML config file. LitGPT ships model-specific configs with defaults tuned for good out-of-the-box performance:
 
 ```bash
 litgpt finetune lora \
   --config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/finetune/llama-2-7b/lora.yaml
 ```
 
-For added convenience, you can also manually override config file setting via the CLI:
-
+Any config value can also be overridden directly from the CLI:
 
 ```bash
 litgpt finetune lora \
@@ -151,7 +208,7 @@ litgpt finetune lora \
   --lora_r 4
 ```
 
-You can browse the available configuration files [here](config_hub).
+Browse the available configuration files [here](config_hub).
 
 &nbsp;
 
@@ -164,7 +221,7 @@ You can browse the available configuration files [here](config_hub).
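+Because configs are plain YAML files, you can also download one, adjust it locally, and pass the file path instead of the URL (a sketch that assumes `curl` is available):
+
+```bash
+# Download the config shown above and edit it locally
+curl -LO https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/finetune/llama-2-7b/lora.yaml
+
+# Run with the local file
+litgpt finetune lora --config lora.yaml
+```
+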
 
 &nbsp;
 
-## Customization
+## Customize configs
 
 LitGPT supports rich and customizable [config files](config_hub) to tailor the LLM training to your dataset and hardware needs. Shown below is a configuration file for LoRA finetuning:
 