Commit

Merge branch 'main' into readme-tests

rasbt authored Apr 22, 2024
2 parents 7a06e62 + 54628ec commit 880c264
Showing 17 changed files with 819 additions and 59 deletions.
96 changes: 76 additions & 20 deletions README.md
@@ -1,7 +1,5 @@
<div align="center">
<img src="https://pl-public-data.s3.amazonaws.com/assets_lightning/LitStableLM_Badge.png" alt="LitGPT" width="128"/>

&nbsp;

# ⚡ LitGPT

@@ -20,11 +18,11 @@ Uses the latest state-of-the-art techniques:
<p align="center">
<a href="https://lightning.ai/">Lightning AI</a> •
<a href="#choose-from-20-llms">Models</a> •
<a href="#install-litgpt">Install</a> •
<a href="#get-started">Get started</a> •
<a href="#use-an-llm">Evaluate</a> •
<a href="#quick-start">Quick start</a> •
<a href="#use-an-llm-for-inference">Inference</a> •
<a href="#finetune-an-llm">Finetune</a> •
<a href="#finetune-an-llm">Pretrain</a> •
<a href="#deploy-an-llm">Deploy</a> •
<a href="#state-of-the-art-features">Features</a> •
<a href="#training-recipes">Training recipes (YAML)</a>
</p>
@@ -34,13 +32,13 @@ Uses the latest state-of-the-art techniques:
&nbsp;

# Finetune, pretrain and deploy LLMs Lightning fast ⚡⚡
LitGPT is a command-line tool designed to easily [finetune](#finetune-an-llm), [pretrain](#pretrain-an-llm), [evaluate](#use-an-llm), and deploy [20+ LLMs](#choose-from-20-llms) **on your own data**. It features highly-optimized [training recipes](#training-recipes) for the world's most powerful open-source large-language-models (LLMs).
LitGPT is a command-line tool designed to easily [finetune](#finetune-an-llm), [pretrain](#pretrain-an-llm), [evaluate](#use-an-llm), and [deploy](#deploy-an-llm) [20+ LLMs](#choose-from-20-llms) **on your own data**. It features highly-optimized [training recipes](#training-recipes) for the world's most powerful open-source large language models (LLMs).

We reimplemented all model architectures and training recipes from scratch for 4 reasons:

1. Remove all abstraction layers and have single file implementations.
2. Guarantee Apache 2.0 compliance to enable enterprise use without limits.
3. Optimized each model architectural detail to maximize performance, reduce costs, and speed up training.
3. Optimized each model's architectural detail to maximize performance, reduce costs, and speed up training.
4. Highly-optimized [recipe configs](#training-recipes) we have tested at enterprise scale.

---
@@ -50,6 +48,7 @@ LitGPT has 🤯 **custom, from-scratch implementations** of [20+ LLMs](tutorials

| Model | Model size | Author | Reference |
|----|----|----|----|
| Llama 3 | 8B, 70B | Meta AI | [Meta AI 2024](https://github.com/meta-llama/llama3) |
| Llama 2 | 7B, 13B, 70B | Meta AI | [Touvron et al. 2023](https://arxiv.org/abs/2307.09288) |
| Code Llama | 7B, 13B, 34B, 70B | Meta AI | [Rozière et al. 2023](https://arxiv.org/abs/2308.12950) |
| Mistral | 7B | Mistral AI | [Mistral website](https://mistral.ai/) |
@@ -72,6 +71,7 @@ LitGPT has 🤯 **custom, from-scratch implementations** of [20+ LLMs](tutorials
| Function Calling Llama 2 | 7B | Trelis | [Trelis et al. 2023](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2) |
| Gemma | 2B, 7B | Google | [Google Team, Google Deepmind](https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf) |
| Llama 2 | 7B, 13B, 70B | Meta AI | [Touvron et al. 2023](https://arxiv.org/abs/2307.09288) |
| Llama 3 | 8B, 70B | Meta AI | [Meta AI 2024](https://github.com/meta-llama/llama3) |
| LongChat | 7B, 13B | LMSYS | [LongChat Team 2023](https://lmsys.org/blog/2023-06-29-longchat/) |
| Mistral | 7B | Mistral AI | [Mistral website](https://mistral.ai/) |
| Nous-Hermes | 7B, 13B, 70B | NousResearch | [Org page](https://huggingface.co/NousResearch) |
@@ -114,21 +114,22 @@ pip install -e '.[all]'

---

# Get started
# Quick start
After installing LitGPT, select the model and the action you want to perform on it (finetune, pretrain, evaluate, deploy, etc.):

```bash
# litgpt [action] [model]
litgpt download mistralai/Mistral-7B-Instruct-v0.2
litgpt chat mistralai/Mistral-7B-Instruct-v0.2
litgpt finetune mistralai/Mistral-7B-Instruct-v0.2
litgpt pretrain mistralai/Mistral-7B-Instruct-v0.2
litgpt serve mistralai/Mistral-7B-Instruct-v0.2
litgpt download meta-llama/Meta-Llama-3-8B-Instruct
litgpt chat meta-llama/Meta-Llama-3-8B-Instruct
litgpt finetune meta-llama/Meta-Llama-3-8B-Instruct
litgpt pretrain meta-llama/Meta-Llama-3-8B-Instruct
litgpt serve meta-llama/Meta-Llama-3-8B-Instruct
```

&nbsp;

### Use an LLM
### Use an LLM for inference
Use an LLM for inference to test its chatting capabilities, run evaluations, extract embeddings, and more.
Here's an example showing how to use the Mistral 7B LLM.
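
A minimal sketch of that workflow, using the `litgpt download` and `litgpt chat` commands and the Mistral 7B repository ID that appear elsewhere in this README:

```bash
# Download the Mistral 7B Instruct weights
litgpt download --repo_id mistralai/Mistral-7B-Instruct-v0.2

# Chat with the downloaded checkpoint
litgpt chat --checkpoint_dir checkpoints/mistralai/Mistral-7B-Instruct-v0.2
```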

<a target="_blank" href="https://lightning.ai/lightning-ai/studios/litgpt-chat">
@@ -155,14 +156,20 @@ For more information, refer to the [download](tutorials/download_model_weights.m
### Finetune an LLM
[Finetune](tutorials/finetune.md) a model to specialize it on your own custom dataset:

<a target="_blank" href="https://lightning.ai/lightning-ai/studios/litgpt-finetune">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/studio-badge.svg" alt="Open In Studio"/>
</a>

&nbsp;

```bash
# 1) Download a pretrained model
litgpt download --repo_id microsoft/phi-2

# 2) Finetune the model
curl -L https://huggingface.co/datasets/medalpaca/medical_meadow_health_advice/raw/main/medical_meadow_health_advice.json -o my_custom_dataset.json
curl -L https://huggingface.co/datasets/ksaw008/finance_alpaca/resolve/main/finance_alpaca.json -o my_custom_dataset.json

litgpt finetune lora \
litgpt finetune \
  --checkpoint_dir checkpoints/microsoft/phi-2 \
  --data JSON \
  --data.json_path my_custom_dataset.json \
@@ -174,9 +181,17 @@ litgpt chat \
  --checkpoint_dir out/phi-2-lora/final
```

&nbsp;

### Pretrain an LLM
Train an LLM from scratch on your own data via pretraining:

<a target="_blank" href="https://lightning.ai/lightning-ai/studios/litgpt-pretrain">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/studio-badge.svg"; alt="Open In Studio"/>
</a>

&nbsp;

```bash
mkdir -p custom_texts
curl https://www.gutenberg.org/cache/epub/24440/pg24440.txt --output custom_texts/book1.txt
@@ -201,10 +216,19 @@ litgpt chat \
  --checkpoint_dir out/custom-model/final
```
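
A minimal end-to-end sketch of this pretraining workflow, reusing the `litgpt download` and `litgpt pretrain` flags from the continued-pretraining example below; the `--train.max_tokens` and `--out_dir` values here are illustrative assumptions:

```bash
# Download a starting point to obtain the tokenizer and model config
litgpt download --repo_id EleutherAI/pythia-160m

# Pretrain from scratch on the custom text files (token budget is illustrative)
litgpt pretrain \
  --model_name pythia-160m \
  --tokenizer_dir checkpoints/EleutherAI/pythia-160m \
  --data TextFiles \
  --data.train_data_path "custom_texts/" \
  --train.max_tokens 10000000 \
  --out_dir out/custom-model
```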

&nbsp;

### Continue pretraining an LLM
This is another way of finetuning that specialize an already pretrained model by training on custom data:
This is another way of finetuning that specializes an already pretrained model by training on custom data:


<a target="_blank" href="https://lightning.ai/lightning-ai/studios/litgpt-continue-pretraining">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/studio-badge.svg"; alt="Open In Studio"/>
</a>

&nbsp;

```bash
mkdir -p custom_texts
curl https://www.gutenberg.org/cache/epub/24440/pg24440.txt --output custom_texts/book1.txt
curl https://www.gutenberg.org/cache/epub/26393/pg26393.txt --output custom_texts/book2.txt
@@ -215,6 +239,7 @@ litgpt download --repo_id EleutherAI/pythia-160m
# 2) Continue pretraining the model
litgpt pretrain \
  --model_name pythia-160m \
  --tokenizer_dir checkpoints/EleutherAI/pythia-160m \
  --initial_checkpoint_dir checkpoints/EleutherAI/pythia-160m \
  --data TextFiles \
  --data.train_data_path "custom_texts/" \
@@ -228,6 +253,37 @@ litgpt chat \

&nbsp;

### Deploy an LLM
Once you're ready to deploy a finetuned LLM, run this command:

<a target="_blank" href="https://lightning.ai/lightning-ai/studios/litgpt-serve">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/studio-badge.svg" alt="Open In Studio"/>
</a>

&nbsp;

```bash
# Locate the checkpoint directory of your finetuned or pretrained model and call the `serve` command:
litgpt serve --checkpoint_dir path/to/your/checkpoint/microsoft/phi-2

# Alternative: if you haven't finetuned, download any checkpoint to deploy it:
litgpt download --repo_id microsoft/phi-2
litgpt serve --checkpoint_dir checkpoints/microsoft/phi-2
```

Test the server in a separate terminal and integrate the model API into your AI product:
```python
# 3) Use the server (in a separate session)
import requests

response = requests.post(
    "http://127.0.0.1:8000/predict",
    json={"prompt": "Fix typos in the following sentence: Exampel input"}
)
print(response.json()["output"])
```
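
The same request can also be issued from the shell; a sketch assuming the server exposes the same `/predict` endpoint and JSON payload as in the Python example above:

```bash
# Query the running server with curl
curl -X POST http://127.0.0.1:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Fix typos in the following sentence: Exampel input"}'
```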

&nbsp;

> [!NOTE]
> **[Read the full docs](tutorials/0_to_litgpt.md)**.
@@ -267,7 +323,7 @@ Browse all training recipes [here](config_hub).
### Example

```bash
litgpt finetune lora \
litgpt finetune \
  --config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/finetune/llama-2-7b/lora.yaml
```

@@ -422,7 +478,7 @@ seed: 1337
Override any parameter in the CLI:
```bash
litgpt finetune lora \
litgpt finetune \
  --config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/finetune/llama-2-7b/lora.yaml \
  --lora_r 4
```
4 changes: 4 additions & 0 deletions config_hub/finetune/README.md
@@ -22,6 +22,10 @@ For more information, see the [Dealing with out-of-memory (OOM) errors](../../tu
| llama-2-7b/qlora.yaml | 7B | Alpaca 2k | 4 | 0.814 | 13.68 GB | 512 | 2 | bfloat16 | 45.68 min (A10G) |
| llama-2-7b/full.yaml | 7B | Alpaca 2k | 1 | 0.941 | 26.81 GB | 512 | 4 | bfloat16 | 1.78 min (4xA100) |
| | | | | | | | | | |
| llama-3-8b/lora.yaml | 8B | Alpaca 2k | 2 | 0.890 | 19.73 GB | 512 | 1 | bfloat16 | 14.80 min (A10G) |
| llama-3-8b/qlora.yaml | 8B | Alpaca 2k | 2 | 0.941 | 17.41 GB | 512 | 2 | bfloat16 | 22.34 min (A10G) |
| llama-3-8b/full.yaml | 8B | Alpaca 2k | 1 | 1.451 | 35.48 GB | 512 | 4 | bfloat16 | 2.14 min (4xA100) |
| | | | | | | | | | |
| mistral-7b/lora.yaml (v0.1) | 7B | Alpaca 2k | 4 | 0.796 | 20.65 GB | 512 | 2 | bfloat16 | 31.04 min (1xA10G) |
| mistral-7b/qlora.yaml (v0.1) | 7B | Alpaca 2k | 4 | 0.803 | 14.29 GB | 512 | 2 | bfloat16 | 44.69 min (1xA10G) |
| | | | | | | | | | |
95 changes: 95 additions & 0 deletions config_hub/finetune/llama-3-8b/full.yaml
@@ -0,0 +1,95 @@

# The path to the base model's checkpoint directory to load for finetuning. (type: <class 'Path'>, default: checkpoints/stabilityai/stablelm-base-alpha-3b)
checkpoint_dir: checkpoints/meta-llama/Meta-Llama-3-8B

# Directory in which to save checkpoints and logs. (type: <class 'Path'>, default: out/finetune/full)
out_dir: out/finetune/full-llama-3-8b

# The precision to use for finetuning. Possible choices: "bf16-true", "bf16-mixed", "32-true". (type: Optional[str], default: null)
precision: bf16-true

# How many devices/GPUs to use (type: Union[int, str], default: 1)
devices: 4

# Path to a checkpoint directory to resume from in case training was interrupted, or ``True`` to resume
# from the latest checkpoint in ``out_dir``. (type: Union[bool, Path], default: False)
resume: false

# Data-related arguments. If not provided, the default is ``litgpt.data.Alpaca``.
data:
  class_path: litgpt.data.Alpaca2k
  init_args:
    mask_prompt: false
    prompt_style: alpaca
    ignore_index: -100
    seed: 42
    num_workers: 4

# Training-related arguments. See ``litgpt.args.TrainArgs`` for details
train:

  # Number of optimizer steps between saving checkpoints (type: Optional[int], default: 1000)
  save_interval: 200

  # Number of iterations between logging calls (type: int, default: 1)
  log_interval: 1

  # Number of samples between optimizer steps across data-parallel ranks (type: int, default: 64)
  global_batch_size: 64

  # Number of samples per data-parallel rank (type: int, default: 1)
  micro_batch_size: 4

  # Number of iterations with learning rate warmup active (type: int, default: 100)
  lr_warmup_steps: 25

  # Number of epochs to train on (type: Optional[int], default: 5)
  epochs: 1

  # Total number of tokens to train on (type: Optional[int], default: null)
  max_tokens:

  # Limits the number of optimizer steps to run. (type: Optional[int], default: null)
  max_steps:

  # Limits the length of samples. Off by default (type: Optional[int], default: null)
  max_seq_length: 512

  # Whether to tie the embedding weights with the language modeling head weights. (type: Optional[bool], default: null)
  tie_embeddings:

  # (type: float, default: 0.003)
  learning_rate: 0.0002

  # (type: float, default: 0.02)
  weight_decay: 0.1

  # (type: float, default: 0.9)
  beta1: 0.9

  # (type: float, default: 0.95)
  beta2: 0.95

  # (type: Optional[float], default: null)
  max_norm:

  # (type: float, default: 6e-05)
  min_lr: 6.0e-05

# Evaluation-related arguments. See ``litgpt.args.EvalArgs`` for details
eval:

  # Number of optimizer steps between evaluation calls (type: int, default: 600)
  interval: 25

  # Number of tokens to generate (type: Optional[int], default: 100)
  max_new_tokens: 100

  # Number of iterations (type: int, default: 100)
  max_iters: 100

# The name of the logger to send metrics to. (type: Literal['wandb', 'tensorboard', 'csv'], default: csv)
logger_name: csv

# The random seed to use for reproducibility. (type: int, default: 1337)
seed: 1337
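
A hypothetical way to launch a run from this config, assuming a `litgpt finetune full` subcommand that accepts `--config` in the same way as the `litgpt finetune` examples in the README above (the exact entry point may differ):

```bash
# Download the base weights referenced by checkpoint_dir, then start full finetuning
litgpt download --repo_id meta-llama/Meta-Llama-3-8B
litgpt finetune full --config config_hub/finetune/llama-3-8b/full.yaml
```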