Add support for `min_length` and `min_new_tokens` generation parameters #308

xenova · 2023-09-16T00:49:04Z

Closes #285

TODO:

Test models other than Xenova/LaMini-Flan-T5-783M (including decoder-only models)
Confirm correctness against the Python library

Example Usage

Text2text-generation

javascript (ours):

import { pipeline } from '@xenova/transformers';
let generator = await pipeline('text2text-generation', 'Xenova/LaMini-Flan-T5-248M', { quantized: false });
let output = await generator('How can I become healthier?', {
    max_new_tokens: 512,
    min_length: 80,
});

['To become healthier, you can: 1. Eat a balanced and nutritious diet 2. Exercise regularly 3. Get enough sleep 4. Manage stress 5. Practice good hygiene 6. Limit alcohol and tobacco consumption 7. Stay hydrated 8. Avoid smoking and excessive alcohol consumption 9. Seek medical attention if you are experiencing any health issues. 10. Practice good hygiene and sanitation. Remember to always prioritize your health and well-being. Good luck!']

python (original):

from transformers import pipeline
pipe = pipeline('text2text-generation', 'MBZUAI/LaMini-Flan-T5-248M')
output = pipe('How can I become healthier?',
              max_new_tokens=512,
              min_length=80,
)

[{'generated_text': 'To become healthier, you can: 1. Eat a balanced and nutritious diet 2. Exercise regularly 3. Get enough sleep 4. Manage stress 5. Practice good hygiene 6. Limit alcohol and tobacco consumption 7. Stay hydrated 8. Avoid smoking and excessive alcohol consumption 9. Seek medical attention if you are experiencing any health issues. 10. Practice good hygiene and sanitation. Remember to always prioritize your health and well-being. Good luck!'}]

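NOTE: The generated text is identical. The only difference is the return format: the Python pipeline wraps each result in an object with a generated_text key, while the JavaScript pipeline currently returns a plain string. This will be fixed in a separate PR (not here, since it's a breaking change).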
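
For anyone reviewing who hasn't looked at how min_length takes effect during generation, here is a minimal sketch of the idea. It is not the code added in this PR: applyMinLength and its parameters are hypothetical names used only for illustration, and a plain array stands in for the logits tensor. The assumption, based on how the Python library's MinLengthLogitsProcessor behaves, is that while the current sequence is shorter than min_length, the EOS logit is set to -Infinity so the end-of-sequence token cannot be selected.

```javascript
// Illustrative sketch only; `applyMinLength` is a hypothetical helper,
// not a function exported by @xenova/transformers.
function applyMinLength(logits, sequenceLength, minLength, eosTokenId) {
    // Accept either a single EOS id or an array of ids.
    const eosIds = Array.isArray(eosTokenId) ? eosTokenId : [eosTokenId];

    // Copy so the caller's logits are left untouched.
    const out = Array.from(logits);

    // While the sequence is too short, make every EOS token unselectable.
    if (sequenceLength < minLength) {
        for (const id of eosIds) {
            out[id] = -Infinity;
        }
    }
    return out;
}

// Toy vocabulary of 3 tokens; token 2 plays the role of EOS.
const toyLogits = [0.1, 0.2, 0.3];
console.log(applyMinLength(toyLogits, 5, 80, 2));  // EOS entry is masked
console.log(applyMinLength(toyLogits, 80, 80, 2)); // minimum reached, logits unchanged
```

Masking the logit rather than filtering after sampling means greedy search, beam search and sampling all respect the minimum without any special handling.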

Text-generation

javascript (ours):

import { pipeline } from '@xenova/transformers';

let instruction = 'How can I become healthier?';
let input_prompt = `Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n${instruction}\n\n### Response:`;

let generator = await pipeline('text-generation', 'Xenova/LaMini-GPT-124M', { quantized: false });
let output = await generator(input_prompt, {
    max_new_tokens: 512,
    do_sample: false,
    no_repeat_ngram_size: 2,
});

[{"generated_text": "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\nHow can I become healthier?\n\n### Response:To become healthy, you can: \n1. Eat a balanced and nutritious diet. 2. Reduce stress and anxiety. 3. Practice good sleep habits. 4. Engage in physical activity. 5. Get enough sleep."}]

python (original):

from transformers import pipeline
pipe = pipeline('text-generation', 'MBZUAI/LaMini-GPT-124M')
instruction = 'How can I become healthier?'
input_prompt = f"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"
output = pipe(input_prompt,
              max_new_tokens=512,
              do_sample=False,
              no_repeat_ngram_size=2,
)

[{'generated_text': 'Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\nHow can I become healthier?\n\n### Response:To become healthy, you can: \n1. Eat a balanced and nutritious diet. 2. Reduce stress and anxiety. 3. Practice good sleep habits. 4. Engage in physical activity. 5. Get enough sleep.'}]
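
For decoder-only models like the one above, the prompt tokens are part of the sequence being extended, so min_length counts them as well. min_new_tokens counts only the tokens generated after the prompt. A minimal sketch of that difference follows; applyMinNewTokens and its parameters are hypothetical names for illustration (not the PR's implementation), with a plain array standing in for the logits tensor. It assumes the same masking approach as the Python library's MinNewTokensLengthLogitsProcessor.

```javascript
// Illustrative sketch only; `applyMinNewTokens` is a hypothetical helper,
// not a function exported by @xenova/transformers.
function applyMinNewTokens(logits, sequenceLength, promptLength, minNewTokens, eosTokenId) {
    // Accept either a single EOS id or an array of ids.
    const eosIds = Array.isArray(eosTokenId) ? eosTokenId : [eosTokenId];

    // Copy so the caller's logits are left untouched.
    const out = Array.from(logits);

    // Only tokens produced after the prompt count toward the minimum.
    const newTokens = sequenceLength - promptLength;
    if (newTokens < minNewTokens) {
        for (const id of eosIds) {
            out[id] = -Infinity;
        }
    }
    return out;
}

// Toy vocabulary of 3 tokens; token 2 plays the role of EOS.
// A 28-token prompt with 2 generated tokens: EOS is still masked.
console.log(applyMinNewTokens([0.1, 0.2, 0.3], 30, 28, 5, 2));
// Same prompt with 5 generated tokens: EOS is allowed again.
console.log(applyMinNewTokens([0.1, 0.2, 0.3], 33, 28, 5, 2));
```

With a 28-token prompt, min_length: 5 would already be satisfied before any token is generated, whereas min_new_tokens: 5 still forces five generated tokens.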

HuggingFaceDocBuilderDev · 2023-09-16T00:55:17Z

The documentation is not available anymore as the PR was closed or merged.

xenova added 2 commits September 16, 2023 02:35

Add support for MinNewTokensLengthLogitsProcessor

f1d1263

Add support for MinLengthLogitsProcessor

633b28b

xenova added 5 commits September 16, 2023 21:51

Fix generation_config defaults

cd30078

Fix input_ids_seq_length

e541f56

Add unit tests for generation

7a8306f

Fix generation parameters test case

76cbd7c

Allow specification of multiple eos_token_ids

2ec26dd

xenova merged commit 11f6a08 into main Sep 17, 2023
