Release 0.17a0

Refs #587, #590
simonw · Oct 28, 2024 · ba1ccb3
1 parent 758ff9a

Showing 4 changed files with 40 additions and 2 deletions.
diff --git a/docs/changelog.md b/docs/changelog.md
@@ -1,5 +1,39 @@
 # Changelog
 
+(v0_17a0)=
+## 0.17a0 (2024-10-28)
+
+Alpha support for **attachments**, allowing multi-modal models to accept images, audio, video and other formats. [#578](https://github.com/simonw/llm/issues/578)
+
+Attachments {ref}`in the CLI <usage-attachments>` can be URLs:
+
+```bash
+llm "describe this image" \
+  -a https://static.simonwillison.net/static/2024/pelicans.jpg
+```
+Or file paths:
+```bash
+llm "extract text" -a image1.jpg -a image2.jpg
+```
+Or binary data, which may need `--attachment-type` to specify the MIME type:
+```bash
+cat image | llm "extract text" --attachment-type - image/jpeg
+```
+
+Attachments are also available {ref}`in the Python API <python-api-attachments>`:
+
+```python
+model = llm.get_model("gpt-4o-mini")
+response = model.prompt(
+    "Describe these images",
+    attachments=[
+        llm.Attachment(path="pelican.jpg"),
+        llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg"),
+    ]
+)
+```
+Plugins that provide alternative models can support attachments; see {ref}`advanced-model-plugins-attachments` for details.
+
 (v0_16)=
 ## 0.16 (2024-09-12)
 

diff --git a/docs/python-api.md b/docs/python-api.md
@@ -49,6 +49,9 @@ response = model.prompt(
     system="Answer like GlaDOS"
 )
 ```
+
+(python-api-attachments)=
+
 ### Attachments
 
 Models that accept multi-modal input (images, audio, video, etc.) can be passed attachments using the `attachments=` keyword argument. This accepts a list of `llm.Attachment()` instances.

diff --git a/docs/usage.md b/docs/usage.md
@@ -45,6 +45,7 @@ Some models support options. You can pass these using `-o/--option name value` -
 ```bash
 llm 'Ten names for cheesecakes' -o temperature 1.5
 ```
+(usage-attachments)=
 ### Attachments
 
 Some models are multi-modal, which means they can accept input in formats other than text. GPT-4o and GPT-4o mini can accept images, and models such as Google Gemini 1.5 can accept audio and video as well.
@@ -56,7 +57,7 @@ llm "describe this image" -a https://static.simonwillison.net/static/2024/pelica
 ```
 Attachments can be passed using URLs or file paths, and you can attach more than one attachment to a single prompt:
 ```bash
-llm "describe these images" -a image1.jpg -a image2.jpg
+llm "extract text" -a image1.jpg -a image2.jpg
 ```
 You can also pipe an attachment to LLM by using `-` as the filename:
 ```bash

diff --git a/setup.py b/setup.py
@@ -1,7 +1,7 @@
 from setuptools import setup, find_packages
 import os
 
-VERSION = "0.16"
+VERSION = "0.17a0"
 
 
 def get_long_description():
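
The changelog entry in this commit notes that binary data piped on standard input may need `--attachment-type` to specify the MIME type. One plausible reason is that piped bytes carry no filename extension to guess from. The sketch below illustrates that general idea using only the standard library; it is not taken from `llm`, whose actual detection logic is not shown in this diff. The names `sniff_mime_type` and `resolve_mime_type` are hypothetical.

```python
import mimetypes
from typing import Optional

# A few well-known file signatures ("magic numbers").
# Illustrative only, far from exhaustive.
_SIGNATURES = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]


def sniff_mime_type(data: bytes) -> Optional[str]:
    """Guess a MIME type from the leading bytes, or return None."""
    for prefix, mime in _SIGNATURES:
        if data.startswith(prefix):
            return mime
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


def resolve_mime_type(
    data: Optional[bytes] = None,
    path: Optional[str] = None,
    explicit: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical resolution order: explicit type, then extension, then content."""
    if explicit:
        # Plays the role of a user-supplied type such as --attachment-type.
        return explicit
    if path:
        guessed, _encoding = mimetypes.guess_type(path)
        if guessed:
            return guessed
    if data is not None:
        return sniff_mime_type(data)
    return None


if __name__ == "__main__":
    print(resolve_mime_type(path="image1.jpg"))
    print(resolve_mime_type(data=b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
    print(resolve_mime_type(data=b"raw bytes", explicit="image/jpeg"))
```

When neither the extension nor the leading bytes identify the format, this sketch returns `None`, which is the situation where an explicit type from the user becomes necessary.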