add docs

Lightning-AI · Apr 15, 2024 · 57991dc · 57991dc
1 parent 0688ae1
commit 57991dc
Show file tree

Hide file tree

Showing 3 changed files with 111 additions and 1 deletion.
diff --git a/README.md b/README.md
@@ -144,7 +144,7 @@ litgpt chat \
 ### Continue pretraining an LLM       
 This is another way of finetuning that specialize an already pretrained model by training on custom data:    
 
-```
+```bash
 mkdir -p custom_texts
 curl https://www.gutenberg.org/cache/epub/24440/pg24440.txt --output custom_texts/book1.txt
 curl https://www.gutenberg.org/cache/epub/26393/pg26393.txt --output custom_texts/book2.txt
@@ -166,6 +166,28 @@ litgpt chat \
   --checkpoint_dir out/custom-model/final
 ```
 
+### Deploy an LLM
+
+This example illustrates how to deploy an LLM using LitGPT
+
+```bash
+# 1) Download a pretrained model (alternatively, use your own finetuned model)
+litgpt download --repo_id microsoft/phi-2
+
+# 2) Start the server
+litgpt serve --checkpoint_dir checkpoints/microsoft/phi-2
+```
+
+```python
+# 3) Use the server (in a separate session)
+import requests, json
+ response = requests.post(
+     "http://127.0.0.1:8000/predict", 
+     json={"prompt": "Fix typos in the following sentence: Exampel input"}
+)
+print(response.content)
+```
+
 &nbsp;
 
 > [!NOTE]

diff --git a/tutorials/0_to_litgpt.md b/tutorials/0_to_litgpt.md
@@ -464,6 +464,43 @@ litgpt evaluate \
 (A list of supported tasks can be found [here](https://github.com/EleutherAI/lm-evaluation-harness/blob/master/docs/task_table.md).)
 
 
+&nbsp;
+## Deploy LLMs
+
+You can deploy LitGPT LLMs using your tool of choice. Below is an example using LitGPT built-in serving capabilities:
+
+
+```bash
+# 1) Download a pretrained model (alternatively, use your own finetuned model)
+litgpt download --repo_id microsoft/phi-2
+
+# 2) Start the server
+litgpt serve --checkpoint_dir checkpoints/microsoft/phi-2
+```
+
+```python
+# 3) Use the server (in a separate session)
+import requests, json
+ response = requests.post(
+     "http://127.0.0.1:8000/predict", 
+     json={"prompt": "Fix typos in the following sentence: Exampel input"}
+)
+print(response.content)
+```
+
+This prints:
+
+```
+b'{"output":"Instruct: Fix typos in the following sentence: Exampel input\\nOutput: Example input: Hello World\\n"}'
+```
+
+
+&nbsp;
+**More information and additional resources**
+
+- [tutorials/deploy](deploy.md): A full deployment tutorial and example
+
+
 &nbsp;
 ## Converting LitGPT model weights to `safetensors` format
 

diff --git a/tutorials/deploy.md b/tutorials/deploy.md
@@ -0,0 +1,51 @@
+# Serve and Deploy LLMs
+
+This document shows how you can serve a LitGPT for deployment. 
+
+&nbsp;
+## Serve an LLM
+
+This section illustrates how we can set up an inference server for a phi-2 LLM using `litgpt serve` that is minimal and highly scalable.
+
+
+&nbsp;
+## Step 1: Start the inference server
+
+
+```bash
+# 1) Download a pretrained model (alternatively, use your own finetuned model)
+litgpt download --repo_id microsoft/phi-2
+
+# 2) Start the server
+litgpt serve --checkpoint_dir checkpoints/microsoft/phi-2
+```
+
+> [!TIP]
+> Use `litgpt serve --help` to display additional options, including the port, devices, LLM temperature setting, and more.
+
+
+&nbsp;
+## Step 2: Query the inference server
+
+You can now send requests to the inference server you started in step 2. For example, in a new Python session, we can send requests to the inference server as follows:
+
+
+```python
+import requests, json
+
+response = requests.post(
+    "http://127.0.0.1:8000/predict", 
+    json={"prompt": "Fix typos in the following sentence: Exampel input"}
+)
+
+decoded_string = response.content.decode("utf-8")
+output_str = json.loads(decoded_string)["output"]
+print(output_str)
+```
+
+Executing the code above prints the following output:
+
+```
+Instruct: Fix typos in the following sentence: Exampel input
+Output: Example input.
+```