From 57991dc091ff6b40f3a2bef2d72e2310f9e08adc Mon Sep 17 00:00:00 2001
From: rasbt
Date: Mon, 15 Apr 2024 21:48:01 +0000
Subject: [PATCH] add docs

---
 README.md                | 24 ++++++++++++++++++-
 tutorials/0_to_litgpt.md | 37 +++++++++++++++++++++++++++++
 tutorials/deploy.md      | 51 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 111 insertions(+), 1 deletion(-)
 create mode 100644 tutorials/deploy.md

diff --git a/README.md b/README.md
index eaa5fbf7f5..ed5a4fb0dd 100644
--- a/README.md
+++ b/README.md
@@ -144,7 +144,7 @@ litgpt chat \
 ### Continue pretraining an LLM
 
 This is another way of finetuning that specializes an already pretrained model by training on custom data:
-```
+```bash
 mkdir -p custom_texts
 curl https://www.gutenberg.org/cache/epub/24440/pg24440.txt --output custom_texts/book1.txt
 curl https://www.gutenberg.org/cache/epub/26393/pg26393.txt --output custom_texts/book2.txt
@@ -166,6 +166,28 @@ litgpt chat \
   --checkpoint_dir out/custom-model/final
 ```
 
+### Deploy an LLM
+
+This example illustrates how to deploy an LLM using LitGPT:
+
+```bash
+# 1) Download a pretrained model (alternatively, use your own finetuned model)
+litgpt download --repo_id microsoft/phi-2
+
+# 2) Start the server
+litgpt serve --checkpoint_dir checkpoints/microsoft/phi-2
+```
+
+```python
+# 3) Use the server (in a separate session)
+import requests
+response = requests.post(
+    "http://127.0.0.1:8000/predict",
+    json={"prompt": "Fix typos in the following sentence: Exampel input"}
+)
+print(response.content)
+```
+
 &nbsp;
 
 > [!NOTE]
diff --git a/tutorials/0_to_litgpt.md b/tutorials/0_to_litgpt.md
index 337bf37049..8e4e6e1902 100644
--- a/tutorials/0_to_litgpt.md
+++ b/tutorials/0_to_litgpt.md
@@ -464,6 +464,43 @@ litgpt evaluate \
 
 (A list of supported tasks can be found [here](https://github.com/EleutherAI/lm-evaluation-harness/blob/master/docs/task_table.md).)
 
+&nbsp;
+## Deploy LLMs
+
+You can deploy LitGPT LLMs using your tool of choice.
+Below is an example using LitGPT's built-in serving capabilities:
+
+
+```bash
+# 1) Download a pretrained model (alternatively, use your own finetuned model)
+litgpt download --repo_id microsoft/phi-2
+
+# 2) Start the server
+litgpt serve --checkpoint_dir checkpoints/microsoft/phi-2
+```
+
+```python
+# 3) Use the server (in a separate session)
+import requests
+response = requests.post(
+    "http://127.0.0.1:8000/predict",
+    json={"prompt": "Fix typos in the following sentence: Exampel input"}
+)
+print(response.content)
+```
+
+This prints:
+
+```
+b'{"output":"Instruct: Fix typos in the following sentence: Exampel input\\nOutput: Example input: Hello World\\n"}'
+```
+
+
+&nbsp;
+**More information and additional resources**
+
+- [tutorials/deploy](deploy.md): A full deployment tutorial and example
+
+
 &nbsp;
 
 ## Converting LitGPT model weights to `safetensors` format
diff --git a/tutorials/deploy.md b/tutorials/deploy.md
new file mode 100644
index 0000000000..10ccb85580
--- /dev/null
+++ b/tutorials/deploy.md
@@ -0,0 +1,51 @@
+# Serve and Deploy LLMs
+
+This document shows how you can serve a LitGPT model for deployment.
+
+&nbsp;
+## Serve an LLM
+
+This section illustrates how to set up a minimal, highly scalable inference server for a phi-2 LLM using `litgpt serve`.
+
+
+&nbsp;
+## Step 1: Start the inference server
+
+
+```bash
+# 1) Download a pretrained model (alternatively, use your own finetuned model)
+litgpt download --repo_id microsoft/phi-2
+
+# 2) Start the server
+litgpt serve --checkpoint_dir checkpoints/microsoft/phi-2
+```
+
+> [!TIP]
+> Use `litgpt serve --help` to display additional options, including the port, devices, LLM temperature setting, and more.
+
+
+&nbsp;
+## Step 2: Query the inference server
+
+You can now send requests to the inference server you started in step 1.
+For example, in a new Python session:
+
+
+```python
+import requests, json
+
+response = requests.post(
+    "http://127.0.0.1:8000/predict",
+    json={"prompt": "Fix typos in the following sentence: Exampel input"}
+)
+
+decoded_string = response.content.decode("utf-8")
+output_str = json.loads(decoded_string)["output"]
+print(output_str)
+```
+
+Executing the code above prints the following output:
+
+```
+Instruct: Fix typos in the following sentence: Exampel input
+Output: Example input.
+```
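The decoding step at the end of the tutorial can be exercised without a running server. The sketch below is an assumption-labeled illustration, not part of the patch: it hard-codes a byte string mimicking the `/predict` payload shape shown above (a JSON object with a single `output` field) in place of `response.content`, then applies the same `decode`/`json.loads` steps:

```python
import json

# Hypothetical raw bytes standing in for response.content from the /predict
# endpoint; the payload shape (one "output" field) mirrors the tutorial output.
raw = b'{"output":"Instruct: Fix typos in the following sentence: Exampel input\\nOutput: Example input.\\n"}'

# Same decoding steps as in the tutorial: bytes -> str -> dict -> output field
decoded_string = raw.decode("utf-8")
output_str = json.loads(decoded_string)["output"]
print(output_str)
```

Note that `\\n` inside the byte literal is a two-character escape in the JSON text, which `json.loads` turns back into a real newline in `output_str`.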