
pretrain starter docs
rasbt committed Apr 1, 2024
1 parent 9049794 commit 92cda75
Showing 3 changed files with 47 additions and 4 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -27,7 +27,7 @@

 Optimized and efficient code: Flash Attention v2, multi-GPU support via fully-sharded data parallelism, [optional CPU offloading](tutorials/oom.md#do-sharding-across-multiple-gpus), and [TPU and XLA support](extensions/xla).

 [Pretraining](tutorials/pretrain_tinyllama.md), [finetuning](tutorials/finetune.md), and [inference](tutorials/inference.md) in various precision settings: FP32, FP16, BF16, and FP16/FP32 mixed.
 [Pretraining](tutorials/pretrain.md), [finetuning](tutorials/finetune.md), and [inference](tutorials/inference.md) in various precision settings: FP32, FP16, BF16, and FP16/FP32 mixed.

 [Configuration files](config_hub) for great out-of-the-box performance.

@@ -37,7 +37,7 @@

 [Exporting](tutorials/convert_lit_models.md) to other popular model weight formats.

 Many popular datasets for [pretraining](tutorials/pretrain_tinyllama.md) and [finetuning](tutorials/prepare_dataset.md), and [support for custom datasets](tutorials/prepare_dataset.md#preparing-custom-datasets-for-instruction-finetuning).
 Many popular datasets for [pretraining](tutorials/pretrain.md) and [finetuning](tutorials/prepare_dataset.md), and [support for custom datasets](tutorials/prepare_dataset.md#preparing-custom-datasets-for-instruction-finetuning).

 Readable and easy-to-modify code to experiment with the latest research ideas.

@@ -114,7 +114,7 @@ For more information, refer to the [download](tutorials/download_model_weights.m

## Finetuning and pretraining

LitGPT supports [pretraining](tutorials/pretrain_tinyllama.md) and [finetuning](tutorials/finetune.md) to optimize models on existing or custom datasets. Below is an example showing how to finetune a model with LoRA:
LitGPT supports [pretraining](tutorials/pretrain.md) and [finetuning](tutorials/finetune.md) to optimize models on existing or custom datasets. Below is an example showing how to finetune a model with LoRA:

```bash
# 1) Download a pretrained model
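# NOTE: the remaining lines of this example are collapsed in the diff view.
# What follows is a minimal, hypothetical sketch of a LoRA finetuning run, not
# the collapsed original; the model name, flags, and paths are assumptions --
# check `litgpt download --help` and `litgpt finetune --help` for the real options.
litgpt download --repo_id microsoft/phi-2

# 2) Finetune the downloaded checkpoint with LoRA
litgpt finetune lora \
  --checkpoint_dir checkpoints/microsoft/phi-2 \
  --data Alpaca2k \
  --out_dir out/phi-2-lora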
@@ -336,7 +336,7 @@ If you have general questions about building with LitGPT, please [join our Disco
Tutorials and in-depth feature documentation can be found below:
- Finetuning, incl. LoRA, QLoRA, and Adapters ([tutorials/finetune.md](tutorials/finetune.md))
- Pretraining ([tutorials/pretrain_tinyllama.md](tutorials/pretrain_tinyllama.md))
- Pretraining ([tutorials/pretrain.md](tutorials/pretrain.md))
- Model evaluation ([tutorials/evaluation.md](tutorials/evaluation.md))
- Supported and custom datasets ([tutorials/prepare_dataset.md](tutorials/prepare_dataset.md))
- Quantization ([tutorials/quantize.md](tutorials/quantize.md))
1 change: 1 addition & 0 deletions tutorials/0_to_litgpt.md
@@ -125,6 +125,7 @@ litgpt pretrain --help

**More information and additional resources**

- [tutorials/pretrain](./pretrain.md): General information about pretraining in LitGPT
- [tutorials/pretrain_tinyllama](./pretrain_tinyllama.md): A tutorial for pretraining a 1.1B TinyLlama model on 3 trillion tokens
- [config_hub/pretrain](../config_hub/pretrain): Pre-made config files for pretraining that work well out of the box
- Project templates in reproducible environments with multi-GPU and multi-node support:
42 changes: 42 additions & 0 deletions tutorials/pretrain.md
@@ -4,12 +4,54 @@
The simplest way to get started with pretraining LLMs in LitGPT ...


 
## The Pretraining API

You can pretrain models in LitGPT via the `litgpt pretrain` API, starting from any of the available model architectures. To list them, call `litgpt pretrain` without any additional arguments:

```bash
litgpt pretrain
```

Shown below is an abbreviated version of the resulting list:

```
ValueError: Please specify --model_name <model_name>. Available values:
Camel-Platypus2-13B
...
Gemma-2b
...
Llama-2-7b-hf
...
Mixtral-8x7B-v0.1
...
pythia-14m
```

For demonstration purposes, we can pretrain a small, 14-million-parameter Pythia model on the TinyStories dataset using the [debug.yaml config file](https://github.com/Lightning-AI/litgpt/blob/main/config_hub/pretrain/debug.yaml) as follows:

```bash
litgpt pretrain \
--model_name pythia-14m \
--config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/pretrain/debug.yaml
```
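If you want to tweak individual settings without editing the YAML file, config values can typically be overridden from the command line. The following is a minimal sketch under that assumption; the `--out_dir` flag is used as an illustrative example and should be verified against `litgpt pretrain --help`:

```bash
# A hedged sketch: reuse the debug config but redirect checkpoints to a custom
# output directory. The override behavior and the --out_dir flag are assumptions;
# verify the available options with `litgpt pretrain --help`.
litgpt pretrain \
  --model_name pythia-14m \
  --config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/pretrain/debug.yaml \
  --out_dir out/pretrain/pythia-14m-debug
```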




&nbsp;
## Pretrain a 1.1B TinyLlama model

You can find an end-to-end tutorial for pretraining a 1.1B TinyLlama model with LitGPT [here](pretrain_tinyllama.md).


&nbsp;
## Optimize LitGPT pretraining with Lightning Thunder

[Lightning Thunder](https://github.com/Lightning-AI/lightning-thunder) is a source-to-source compiler for PyTorch, which is fully compatible with LitGPT. In experiments, Thunder resulted in a 40% speed-up compared to using regular PyTorch when finetuning a 7B Llama 2 model.

For more information, see the [Lightning Thunder extension README](https://github.com/Lightning-AI/lightning-thunder).


&nbsp;
## Project templates
