Commit

Merge branch 'main' into readme-tests

rasbt authored Apr 22, 2024
2 parents 7a06e62 + 54628ec commit 880c264
Showing 17 changed files with 819 additions and 59 deletions.
96 changes: 76 additions & 20 deletions README.md
@@ -1,7 +1,5 @@
<div align="center">
<img src="https://pl-public-data.s3.amazonaws.com/assets_lightning/LitStableLM_Badge.png" alt="LitGPT" width="128"/>

&nbsp;

# ⚡ LitGPT

@@ -20,11 +18,11 @@ Uses the latest state-of-the-art techniques:
<p align="center">
<a href="https://lightning.ai/">Lightning AI</a> •
<a href="#choose-from-20-llms">Models</a> •
<a href="#install-litgpt">Install</a> •
<a href="#get-started">Get started</a> •
<a href="#use-an-llm">Evaluate</a> •
<a href="#quick-start">Quick start</a> •
<a href="#use-an-llm-for-inference">Inference</a> •
<a href="#finetune-an-llm">Finetune</a> •
<a href="#finetune-an-llm">Pretrain</a> •
<a href="#deploy-an-llm">Deploy</a> •
<a href="#state-of-the-art-features">Features</a> •
<a href="#training-recipes">Training recipes (YAML)</a>
</p>
@@ -34,13 +32,13 @@ Uses the latest state-of-the-art techniques:
&nbsp;

# Finetune, pretrain and deploy LLMs Lightning fast ⚡⚡
LitGPT is a command-line tool designed to easily [finetune](#finetune-an-llm), [pretrain](#pretrain-an-llm), [evaluate](#use-an-llm), and deploy [20+ LLMs](#choose-from-20-llms) **on your own data**. It features highly-optimized [training recipes](#training-recipes) for the world's most powerful open-source large-language-models (LLMs).
LitGPT is a command-line tool designed to easily [finetune](#finetune-an-llm), [pretrain](#pretrain-an-llm), [evaluate](#use-an-llm), and [deploy](#deploy-an-llm) [20+ LLMs](#choose-from-20-llms) **on your own data**. It features highly-optimized [training recipes](#training-recipes) for the world's most powerful open-source large language models (LLMs).

We reimplemented all model architectures and training recipes from scratch for 4 reasons:

1. Remove all abstraction layers and have single file implementations.
2. Guarantee Apache 2.0 compliance to enable enterprise use without limits.
3. Optimized each model architectural detail to maximize performance, reduce costs, and speed up training.
3. Optimized each model's architectural detail to maximize performance, reduce costs, and speed up training.
4. Highly-optimized [recipe configs](#training-recipes) we have tested at enterprise scale.

---
@@ -50,6 +48,7 @@ LitGPT has 🤯 **custom, from-scratch implementations** of [20+ LLMs](tutorials

| Model | Model size | Author | Reference |
|----|----|----|----|
| Llama 3 | 8B, 70B | Meta AI | [Meta AI 2024](https://github.com/meta-llama/llama3) |
| Llama 2 | 7B, 13B, 70B | Meta AI | [Touvron et al. 2023](https://arxiv.org/abs/2307.09288) |
| Code Llama | 7B, 13B, 34B, 70B | Meta AI | [Rozière et al. 2023](https://arxiv.org/abs/2308.12950) |
| Mistral | 7B | Mistral AI | [Mistral website](https://mistral.ai/) |
@@ -72,6 +71,7 @@ LitGPT has 🤯 **custom, from-scratch implementations** of [20+ LLMs](tutorials
| Function Calling Llama 2 | 7B | Trelis | [Trelis et al. 2023](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v2) |
| Gemma | 2B, 7B | Google | [Google Team, Google Deepmind](https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf) |
| Llama 2 | 7B, 13B, 70B | Meta AI | [Touvron et al. 2023](https://arxiv.org/abs/2307.09288) |
| Llama 3 | 8B, 70B | Meta AI | [Meta AI 2024](https://github.com/meta-llama/llama3) |
| LongChat | 7B, 13B | LMSYS | [LongChat Team 2023](https://lmsys.org/blog/2023-06-29-longchat/) |
| Mistral | 7B | Mistral AI | [Mistral website](https://mistral.ai/) |
| Nous-Hermes | 7B, 13B, 70B | NousResearch | [Org page](https://huggingface.co/NousResearch) |
@@ -114,21 +114,22 @@ pip install -e '.[all]'

---

# Get started
# Quick start
After installing LitGPT, select the model and the action you want to perform on it (finetune, pretrain, evaluate, deploy, etc.):

```bash
# litgpt [action] [model]
litgpt download mistralai/Mistral-7B-Instruct-v0.2
litgpt chat mistralai/Mistral-7B-Instruct-v0.2
litgpt finetune mistralai/Mistral-7B-Instruct-v0.2
litgpt pretrain mistralai/Mistral-7B-Instruct-v0.2
litgpt serve mistralai/Mistral-7B-Instruct-v0.2
litgpt download meta-llama/Meta-Llama-3-8B-Instruct
litgpt chat meta-llama/Meta-Llama-3-8B-Instruct
litgpt finetune meta-llama/Meta-Llama-3-8B-Instruct
litgpt pretrain meta-llama/Meta-Llama-3-8B-Instruct
litgpt serve meta-llama/Meta-Llama-3-8B-Instruct
```

&nbsp;

### Use an LLM
### Use an LLM for inference
Use an LLM for inference to test its chatting capabilities, run evaluations, extract embeddings, and more.
Here's an example showing how to use the Mistral 7B LLM.
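
A minimal sketch of that workflow, using the `litgpt download` and `litgpt chat` commands and the Mistral 7B repository ID that appear elsewhere in this README:

```bash
# Download the Mistral 7B Instruct weights
litgpt download --repo_id mistralai/Mistral-7B-Instruct-v0.2

# Chat with the downloaded checkpoint
litgpt chat --checkpoint_dir checkpoints/mistralai/Mistral-7B-Instruct-v0.2
```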

<a target="_blank" href="https://lightning.ai/lightning-ai/studios/litgpt-chat">
@@ -155,14 +156,20 @@ For more information, refer to the [download](tutorials/download_model_weights.m
### Finetune an LLM
[Finetune](tutorials/finetune.md) a model to specialize it on your own custom dataset:

<a target="_blank" href="https://lightning.ai/lightning-ai/studios/litgpt-finetune">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/studio-badge.svg" alt="Open In Studio"/>
</a>

&nbsp;

```bash
# 1) Download a pretrained model
litgpt download --repo_id microsoft/phi-2

# 2) Finetune the model
curl -L https://huggingface.co/datasets/medalpaca/medical_meadow_health_advice/raw/main/medical_meadow_health_advice.json -o my_custom_dataset.json
curl -L https://huggingface.co/datasets/ksaw008/finance_alpaca/resolve/main/finance_alpaca.json -o my_custom_dataset.json

litgpt finetune lora \
litgpt finetune \
  --checkpoint_dir checkpoints/microsoft/phi-2 \
  --data JSON \
  --data.json_path my_custom_dataset.json \
@@ -174,9 +181,17 @@ litgpt chat \
  --checkpoint_dir out/phi-2-lora/final
```

&nbsp;

### Pretrain an LLM
Train an LLM from scratch on your own data via pretraining:

<a target="_blank" href="https://lightning.ai/lightning-ai/studios/litgpt-pretrain">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/studio-badge.svg"; alt="Open In Studio"/>
</a>

&nbsp;

```bash
mkdir -p custom_texts
curl https://www.gutenberg.org/cache/epub/24440/pg24440.txt --output custom_texts/book1.txt
@@ -201,10 +216,19 @@ litgpt chat \
  --checkpoint_dir out/custom-model/final
```
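
A minimal end-to-end sketch of this pretraining workflow, reusing the `litgpt download` and `litgpt pretrain` flags from the continued-pretraining example below; the `--train.max_tokens` and `--out_dir` values here are illustrative assumptions:

```bash
# Download a starting point to obtain the tokenizer and model config
litgpt download --repo_id EleutherAI/pythia-160m

# Pretrain from scratch on the custom text files (token budget is illustrative)
litgpt pretrain \
  --model_name pythia-160m \
  --tokenizer_dir checkpoints/EleutherAI/pythia-160m \
  --data TextFiles \
  --data.train_data_path "custom_texts/" \
  --train.max_tokens 10000000 \
  --out_dir out/custom-model
```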

&nbsp;

### Continue pretraining an LLM
This is another way of finetuning that specialize an already pretrained model by training on custom data:
This is another way of finetuning that specializes an already pretrained model by training on custom data:


<a target="_blank" href="https://lightning.ai/lightning-ai/studios/litgpt-continue-pretraining">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/studio-badge.svg"; alt="Open In Studio"/>
</a>

&nbsp;

```bash
mkdir -p custom_texts
curl https://www.gutenberg.org/cache/epub/24440/pg24440.txt --output custom_texts/book1.txt
curl https://www.gutenberg.org/cache/epub/26393/pg26393.txt --output custom_texts/book2.txt
@@ -215,6 +239,7 @@ litgpt download --repo_id EleutherAI/pythia-160m
# 2) Continue pretraining the model
litgpt pretrain \
  --model_name pythia-160m \
  --tokenizer_dir checkpoints/EleutherAI/pythia-160m \
  --initial_checkpoint_dir checkpoints/EleutherAI/pythia-160m \
  --data TextFiles \
  --data.train_data_path "custom_texts/" \
@@ -228,6 +253,37 @@ litgpt chat \

&nbsp;

### Deploy an LLM
Once you're ready to deploy a finetuned LLM, run this command:

<a target="_blank" href="https://lightning.ai/lightning-ai/studios/litgpt-serve">
<img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/studio-badge.svg" alt="Open In Studio"/>
</a>

&nbsp;

```bash
# Locate the checkpoint directory of your finetuned or pretrained model and call the `serve` command:
litgpt serve --checkpoint_dir path/to/your/checkpoint/microsoft/phi-2

# Alternative: if you haven't finetuned, download any checkpoint to deploy it:
litgpt download --repo_id microsoft/phi-2
litgpt serve --checkpoint_dir checkpoints/microsoft/phi-2
```

Test the server in a separate terminal and integrate the model API into your AI product:
```python
# 3) Use the server (in a separate session)
import requests

response = requests.post(
    "http://127.0.0.1:8000/predict",
    json={"prompt": "Fix typos in the following sentence: Exampel input"}
)
print(response.json()["output"])
```
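
The same request can also be issued from the shell; a sketch assuming the server exposes the same `/predict` endpoint and JSON payload as in the Python example above:

```bash
# Query the running server with curl
curl -X POST http://127.0.0.1:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Fix typos in the following sentence: Exampel input"}'
```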

&nbsp;

> [!NOTE]
> **[Read the full docs](tutorials/0_to_litgpt.md)**.
@@ -267,7 +323,7 @@ Browse all training recipes [here](config_hub).
### Example

```bash
litgpt finetune lora \
litgpt finetune \
  --config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/finetune/llama-2-7b/lora.yaml
```

@@ -422,7 +478,7 @@ seed: 1337
Override any parameter in the CLI:
```bash
litgpt finetune lora \
litgpt finetune \
  --config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/finetune/llama-2-7b/lora.yaml \
  --lora_r 4
```
4 changes: 4 additions & 0 deletions config_hub/finetune/README.md
@@ -22,6 +22,10 @@ For more information, see the [Dealing with out-of-memory (OOM) errors](../../tu
| llama-2-7b/qlora.yaml | 7B | Alpaca 2k | 4 | 0.814 | 13.68 GB | 512 | 2 | bfloat16 | 45.68 min (A10G) |
| llama-2-7b/full.yaml | 7B | Alpaca 2k | 1 | 0.941 | 26.81 GB | 512 | 4 | bfloat16 | 1.78 min (4xA100) |
| | | | | | | | | | |
| llama-3-8b/lora.yaml | 8B | Alpaca 2k | 2 | 0.890 | 19.73 GB | 512 | 1 | bfloat16 | 14.80 min (A10G) |
| llama-3-8b/qlora.yaml | 8B | Alpaca 2k | 2 | 0.941 | 17.41 GB | 512 | 2 | bfloat16 | 22.34 min (A10G) |
| llama-3-8b/full.yaml | 8B | Alpaca 2k | 1 | 1.451 | 35.48 GB | 512 | 4 | bfloat16 | 2.14 min (4xA100) |
| | | | | | | | | | |
| mistral-7b/lora.yaml (v0.1) | 7B | Alpaca 2k | 4 | 0.796 | 20.65 GB | 512 | 2 | bfloat16 | 31.04 min (1xA10G) |
| mistral-7b/qlora.yaml (v0.1) | 7B | Alpaca 2k | 4 | 0.803 | 14.29 GB | 512 | 2 | bfloat16 | 44.69 min (1xA10G) |
| | | | | | | | | | |
95 changes: 95 additions & 0 deletions config_hub/finetune/llama-3-8b/full.yaml
@@ -0,0 +1,95 @@

# The path to the base model's checkpoint directory to load for finetuning. (type: <class 'Path'>, default: checkpoints/stabilityai/stablelm-base-alpha-3b)
checkpoint_dir: checkpoints/meta-llama/Meta-Llama-3-8B

# Directory in which to save checkpoints and logs. (type: <class 'Path'>, default: out/finetune/full)
out_dir: out/finetune/full-llama-3-8b

# The precision to use for finetuning. Possible choices: "bf16-true", "bf16-mixed", "32-true". (type: Optional[str], default: null)
precision: bf16-true

# How many devices/GPUs to use (type: Union[int, str], default: 1)
devices: 4

# Path to a checkpoint directory to resume from in case training was interrupted, or ``True`` to resume
# from the latest checkpoint in ``out_dir``. (type: Union[bool, Path], default: False)
resume: false

# Data-related arguments. If not provided, the default is ``litgpt.data.Alpaca``.
data:
  class_path: litgpt.data.Alpaca2k
  init_args:
    mask_prompt: false
    prompt_style: alpaca
    ignore_index: -100
    seed: 42
    num_workers: 4

# Training-related arguments. See ``litgpt.args.TrainArgs`` for details
train:

  # Number of optimizer steps between saving checkpoints (type: Optional[int], default: 1000)
  save_interval: 200

  # Number of iterations between logging calls (type: int, default: 1)
  log_interval: 1

  # Number of samples between optimizer steps across data-parallel ranks (type: int, default: 64)
  global_batch_size: 64

  # Number of samples per data-parallel rank (type: int, default: 1)
  micro_batch_size: 4

  # Number of iterations with learning rate warmup active (type: int, default: 100)
  lr_warmup_steps: 25

  # Number of epochs to train on (type: Optional[int], default: 5)
  epochs: 1

  # Total number of tokens to train on (type: Optional[int], default: null)
  max_tokens:

  # Limits the number of optimizer steps to run. (type: Optional[int], default: null)
  max_steps:

  # Limits the length of samples. Off by default (type: Optional[int], default: null)
  max_seq_length: 512

  # Whether to tie the embedding weights with the language modeling head weights. (type: Optional[bool], default: null)
  tie_embeddings:

  # (type: float, default: 0.003)
  learning_rate: 0.0002

  # (type: float, default: 0.02)
  weight_decay: 0.1

  # (type: float, default: 0.9)
  beta1: 0.9

  # (type: float, default: 0.95)
  beta2: 0.95

  # (type: Optional[float], default: null)
  max_norm:

  # (type: float, default: 6e-05)
  min_lr: 6.0e-05

# Evaluation-related arguments. See ``litgpt.args.EvalArgs`` for details
eval:

  # Number of optimizer steps between evaluation calls (type: int, default: 600)
  interval: 25

  # Number of tokens to generate (type: Optional[int], default: 100)
  max_new_tokens: 100

  # Number of iterations (type: int, default: 100)
  max_iters: 100

# The name of the logger to send metrics to. (type: Literal['wandb', 'tensorboard', 'csv'], default: csv)
logger_name: csv

# The random seed to use for reproducibility. (type: int, default: 1337)
seed: 1337
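
A hypothetical way to launch a run from this config, assuming a `litgpt finetune full` subcommand that accepts `--config` in the same way as the `litgpt finetune` examples in the README above (the exact entry point may differ):

```bash
# Download the base weights referenced by checkpoint_dir, then start full finetuning
litgpt download --repo_id meta-llama/Meta-Llama-3-8B
litgpt finetune full --config config_hub/finetune/llama-3-8b/full.yaml
```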