merge upstream #2

francislabountyjr · 2024-06-04T04:17:33Z

No description provided.

- Will's structured generation workflow cookbook example was not in the mkdocs index, so it was not being displayed.
- Same with the LM Studio serving docs.
- The brand color was also slightly off:

  ![image](https://github.com/user-attachments/assets/fd10fa4f-d140-4936-befa-4dcca09c0e51)

  It has been fixed to this:

  ![image](https://github.com/user-attachments/assets/b6c2d71b-6a7f-4b86-935a-bf5072f1d945)

Request received in Discord to add an example for the new transformers vision capability.

# Vision-Language Models with Outlines

This guide demonstrates how to use Outlines with vision-language models, leveraging the new transformers_vision module. Vision-language models can process both text and images, allowing for tasks like image captioning, visual question answering, and more.

We will be using the Pixtral-12B model from Mistral to take advantage of some of its visual reasoning capabilities and a workflow to generate a multistage atomic caption.

---------

Signed-off-by: jphillips <[email protected]>

This is a condensed version of the demo for [extracting earnings reports](https://github.com/dottxt-ai/demos/tree/main/earnings-reports) to CSV.

Overview:

- Shows how to use Outlines to structure CSV output
- Provides simple tools for converting a table specification to regular expressions
- Includes a tuned extraction prompt that performs reasonably well on income statements
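The idea of converting a table specification to a regular expression can be sketched as follows. This is a minimal illustration and not the demo's code: the `COLUMN_PATTERNS` mapping, the `csv_regex` helper, and the column types are all hypothetical names chosen for this example.

```python
import re

# Hypothetical per-type patterns; the actual demo may define these differently.
COLUMN_PATTERNS = {
    "text": r"[A-Za-z][A-Za-z ]*",
    "integer": r"-?[0-9]+",
    "decimal": r"-?[0-9]+\.[0-9]+",
}


def csv_regex(columns, max_rows):
    """Build a regex matching a CSV header followed by 1..max_rows typed rows."""
    header = ",".join(re.escape(name) for name, _ in columns)
    row = ",".join(COLUMN_PATTERNS[kind] for _, kind in columns)
    return rf"{header}(\n{row}){{1,{max_rows}}}"


# Assumed table specification: (column name, column type) pairs.
spec = [("year", "integer"), ("revenue", "decimal"), ("segment", "text")]
pattern = csv_regex(spec, max_rows=3)

good = "year,revenue,segment\n2023,10.5,Cloud\n2024,-2.25,Retail"
bad = "year,revenue,segment\n2023,ten,Cloud"

print(re.fullmatch(pattern, good) is not None)
print(re.fullmatch(pattern, bad) is not None)
```

A pattern built this way could then serve as the constraint for regex-structured generation, so that the model can only emit rows that fit the declared columns.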

I added a receipt processing cookbook.

- Uses Qwen or Pixtral
- General purpose message templating, no messy model-specific token adding
- Easy function for compressing images down for lower processing/memory requirements

Should help illustrate a simple use case for vision models.

Users are currently running into install issues. After a clean install of `outlines` they get an error message that asks for `transformers` to be installed. This should not be the case, as the library is not required for every integration. In this PR we remove `transformers` and `datasets` top-level imports, and add per-integration optional dependencies.

## TODO

- [x] Test `import outlines` from clean install
- [x] Test installing outlines with vLLM optional dependencies
- [x] Test installing outlines with MLX optional dependencies
- [x] Test installing outlines with transformers optional dependencies
- [x] Test installing outlines with llama-cpp optional dependencies
- [x] Test installing outlines with exllamav2 optional dependencies
- [x] Test installing outlines with openai optional dependencies
- [x] Update the documentation

Supersedes #1295. Fixes #1263.
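The general pattern behind removing top-level imports, deferring a heavy import until the integration that needs it is actually used, can be sketched as below. This is an illustration under stated assumptions rather than the code in this PR: the `require` helper, the `load_transformers_model` function, and the extra name in the error message are hypothetical.

```python
import importlib


def require(module_name, extra):
    """Import an optional dependency lazily, with an actionable error if missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # The extra name below is an assumption, not a confirmed packaging detail.
        raise ImportError(
            f"`{module_name}` is required for this integration. "
            f"Install it with `pip install 'outlines[{extra}]'`."
        ) from exc


def load_transformers_model(name):
    # The import runs only when this integration is called, so importing the
    # package itself never requires `transformers` to be installed.
    transformers = require("transformers", extra="transformers")
    return transformers.AutoModelForCausalLM.from_pretrained(name)
```

With this shape, a user who only needs one integration never triggers the import of another integration's dependencies, and a missing package yields a message that says what to install.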

rlouf and others added 30 commits March 13, 2024 15:30

Update the docstring of exl2

6484d8c

Pass `model_kwargs` for outlines.models.llamacpp as dict (#744)

5c15e8c

Fixes [this issue](#743)

Add a function to convert utf8 regexps into regexps that operates over individual bytes

d7295a7

Support generating multi-byte utf8 characters

043117f

Make model_kwargs dictionary by default

c8566e8

Check if the given token is a string (#745)

aa0a35e

Some models use bytes as their tokens, such as Qwen (see: https://huggingface.co/Qwen/Qwen-7B/blob/ef3c5c9c57b252f3149c1408daf4d649ec8b6c85/tokenization_qwen.py#L136 )
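To see why byte-level tokens need special handling, note that a single character can span several UTF-8 bytes, so one token may hold only part of a character. The snippet below is a standalone illustration using the Python standard library; it is not the check added in this commit.

```python
import re

char = "é"
encoded = char.encode("utf-8")
print(len(char), len(encoded))  # one character, two bytes

# A byte-level pattern for any two-byte UTF-8 sequence:
# a lead byte in 0xC2-0xDF followed by a continuation byte in 0x80-0xBF.
two_byte_char = re.compile(rb"[\xc2-\xdf][\x80-\xbf]")
print(two_byte_char.fullmatch(encoded) is not None)

# A token holding only the lead byte is not valid UTF-8 on its own,
# so it cannot be treated as an ordinary string.
partial = encoded[:1]
try:
    partial.decode("utf-8")
    decodes = True
except UnicodeDecodeError:
    decodes = False
print(decodes)
```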

Add BibTeX citation

f7cafe5

fixed parsing token vocabularies for gemma and gpt-sw3 models

c744e25

Do not reset RegexLogitsProcessor._fsm_state (#760)

803439a

Closes #757

Improve syntax highlighting

656bafa

Add feedback tweet

7d611f7

Expand contribution documentation

55c2b96

Update installation instructions

d825d0c

Improve the documentation for structured generation

aed9d21

Small typo fix to examples in cookbook which had resulted in an import error

1d20896

Allow json integers to be negative (#777)

ce06900

#776
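A regular expression for JSON integers that permits a leading minus sign might look like the sketch below. The pattern is an assumption made for illustration and is not copied from the commit.

```python
import re

# Optional minus sign, then either a lone zero or digits without a leading zero.
JSON_INTEGER = re.compile(r"-?(0|[1-9][0-9]*)")

for candidate in ["42", "-7", "0", "007", "+3"]:
    print(candidate, JSON_INTEGER.fullmatch(candidate) is not None)
```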

Add vLLM integration

65dec32

Update code blocks style

ebf2d6d

Add companies using Outlines

c549b1f

Add nvidia to list of companies using Outlines

8da5486

Add nvidia logo

a21ebec

Add SGLang as a user of outlines

3a41b0e

fix missing text module

be662fe

Change white space pattern in llama.cpp test

121a25c

Update the llama.cpp integration

aacc633

Add integrations tests for the vLLM integration

ae9ae50

Switched order of AzureAsyncOpenAI to AsyncAzureOpenAI

366ea1b

Switched order of AzureAsyncOpenAI to AsyncAzureOpenAI to match the name in the openai-python repo: https://github.com/openai/openai-python/blob/main/src/openai/lib/azure.py

Add downloads badge

9f15e28

Add a small grammar guide

868868f

Remove unused dependencies

cb8fea8

lapp0 and others added 30 commits October 7, 2024 21:05

ensure valid link to docs with latest/ in url

9e8bd6c

use not in mkdocs

a2fd35c

Create lmstudio.md

c1fae8a

use pre-commit

64fb30f

recover fsm_union, get_sub_fsms_from_seq, walk_fsm. Add to fsm/parser.py

866b9a3

update dependencies: torch is required, pin outlines-core==0.1.14

eabca69

update RegexGuide to conform with outlines-core

6f36b71

test fsm_union and walk_fsm

969887e

Update llamacpp.md (#1231)

dc31b9b

accross -> across

Add PDF cookbook (#1256)

906e84e

Adds a cookbook on extracting structured output from PDFs. I included some extra bells and whistles here by showing how to do JSON, regex, and `choice`, which should help provide inspiration to people working with PDFs.

Add earnings reports to cookbook index (#1255)

b93f550

The earnings report cookbook was never added to the cookbook index (#1235); this fixes it.

Update README.md (#1258)

d842522

Fixes the error `NameError: name 'rng' is not defined`.

Bump to outlines-core=0.1.17 for python 3.12-3.13 support (#1273)

c406da8

Turn off guide caching

2eab80c

Fix index interface in tests

f099f96

Update generation.md

b55d314

The link `[Outlines model](../models)` does not resolve correctly. This tries switching it to `[Outlines model](../models/models.md)`.

Add link to YT channel and .txt blog

bf62d11

Update home page spacing

7a9baad

Add compatibility for numpy 2 while preserving numpy 1 compatibility

63b4feb

Update mlx-lm kvcache creation

7cdaeac

Add beam deployment example

2db34c5

Add OpenAI to LogitsGenerator

66f6627

Add cookbook recipe to extract event details from text (#1269)

568f252

Add jax compatible api (#1207)

5608dd8

This PR adds a JAX-compatible API; see issue #1027.

Add json call with multi-function enums (#1277)

e9485cf

This PR aims to resolve #1217.
