Add support for min_length and min_new_tokens generation parameters #308

Merged (7 commits) on Sep 17, 2023

Conversation

xenova
Collaborator

@xenova xenova commented Sep 16, 2023

Closes #285

TODO:

  • Testing models other than Xenova/LaMini-Flan-T5-783M (+ decoder-only models)
  • Confirm correctness against python library

Example Usage

Text2text-generation

javascript (ours):

import { pipeline } from '@xenova/transformers';

// Load the unquantized weights.
const generator = await pipeline('text2text-generation', 'Xenova/LaMini-Flan-T5-248M', { quantized: false });
const output = await generator('How can I become healthier?', {
    max_new_tokens: 512,
    min_length: 80, // require at least 80 tokens before generation may stop
});

['To become healthier, you can: 1. Eat a balanced and nutritious diet 2. Exercise regularly 3. Get enough sleep 4. Manage stress 5. Practice good hygiene 6. Limit alcohol and tobacco consumption 7. Stay hydrated 8. Avoid smoking and excessive alcohol consumption 9. Seek medical attention if you are experiencing any health issues. 10. Practice good hygiene and sanitation. Remember to always prioritize your health and well-being. Good luck!']

python (original):

from transformers import pipeline
pipe = pipeline('text2text-generation', 'MBZUAI/LaMini-Flan-T5-248M')
output = pipe('How can I become healthier?',
              max_new_tokens=512,
              min_length=80,
)

[{'generated_text': 'To become healthier, you can: 1. Eat a balanced and nutritious diet 2. Exercise regularly 3. Get enough sleep 4. Manage stress 5. Practice good hygiene 6. Limit alcohol and tobacco consumption 7. Stay hydrated 8. Avoid smoking and excessive alcohol consumption 9. Seek medical attention if you are experiencing any health issues. 10. Practice good hygiene and sanitation. Remember to always prioritize your health and well-being. Good luck!'}]

NOTE: The generated text is identical. The only difference is that the python version wraps it in an object with a generated_text key, while ours returns a plain string. This will be fixed in another PR (not here, since it's a breaking change).
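Until that follow-up lands, anyone comparing the two outputs programmatically could bridge the shape difference on the JavaScript side. The snippet below is a minimal sketch of that idea; `normalizeOutput` is a hypothetical helper written for illustration and is not part of `@xenova/transformers`.

```javascript
// Hypothetical helper (not a library API): wraps plain-string results in
// objects so they match the shape returned by the python pipeline.
// Items that are already objects are passed through unchanged.
function normalizeOutput(output) {
    return output.map((item) =>
        typeof item === 'string' ? { generated_text: item } : item
    );
}

// Placeholder text standing in for a real pipeline result.
const jsOutput = ['example generated text'];
const normalized = normalizeOutput(jsOutput);
console.log(normalized[0].generated_text); // example generated text
```

Passing objects through untouched means the helper should keep working once the pipeline itself starts returning `generated_text` objects.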

Text-generation

javascript (ours):

import { pipeline } from '@xenova/transformers';

// Build the instruction-style prompt.
const instruction = 'How can I become healthier?';
const input_prompt = `Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n${instruction}\n\n### Response:`;

const generator = await pipeline('text-generation', 'Xenova/LaMini-GPT-124M', { quantized: false });
const output = await generator(input_prompt, {
    max_new_tokens: 512,
    do_sample: false, // disable sampling
    no_repeat_ngram_size: 2,
});

[{"generated_text": "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\nHow can I become healthier?\n\n### Response:To become healthy, you can: \n1. Eat a balanced and nutritious diet. 2. Reduce stress and anxiety. 3. Practice good sleep habits. 4. Engage in physical activity. 5. Get enough sleep."}]

python (original):

from transformers import pipeline

pipe = pipeline('text-generation', 'MBZUAI/LaMini-GPT-124M')
instruction = 'How can I become healthier?'
input_prompt = f"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"
output = pipe(input_prompt,
              max_new_tokens=512,
              do_sample=False,
              no_repeat_ngram_size=2,
)

[{'generated_text': 'Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\nHow can I become healthier?\n\n### Response:To become healthy, you can: \n1. Eat a balanced and nutritious diet. 2. Reduce stress and anxiety. 3. Practice good sleep habits. 4. Engage in physical activity. 5. Get enough sleep.'}]
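For readers unfamiliar with the two parameters in the PR title, a common way to enforce a minimum length is to make the end-of-sequence (EOS) token unselectable until enough tokens exist. The sketch below illustrates that general technique under stated assumptions: the function names and toy scores are hypothetical, and this is not the code merged in this PR. It also assumes the usual distinction in which `min_length` counts the whole sequence (prompt included, for decoder-only models) while `min_new_tokens` counts only generated tokens.

```javascript
// Illustrative sketch only: names are hypothetical, not the PR's implementation.

// Block EOS while the full sequence (prompt + generated) is shorter than minLength.
function applyMinLength(logits, sequenceLength, minLength, eosTokenId) {
    if (sequenceLength < minLength) {
        logits[eosTokenId] = -Infinity;
    }
    return logits;
}

// Block EOS while fewer than minNewTokens tokens have been generated.
function applyMinNewTokens(logits, sequenceLength, promptLength, minNewTokens, eosTokenId) {
    const newTokens = sequenceLength - promptLength;
    if (newTokens < minNewTokens) {
        logits[eosTokenId] = -Infinity;
    }
    return logits;
}

// Greedy selection: index of the highest score.
function argmax(values) {
    let best = 0;
    for (let i = 1; i < values.length; ++i) {
        if (values[i] > values[best]) best = i;
    }
    return best;
}

const EOS = 2;
// Toy scores in which EOS (index 2) would normally win.
const scores = [0.1, 0.3, 0.9, 0.2];

console.log(argmax(applyMinLength([...scores], 5, 10, EOS)));  // 1 (EOS suppressed)
console.log(argmax(applyMinLength([...scores], 10, 10, EOS))); // 2 (EOS allowed)
```

With a 10-token prompt, `applyMinNewTokens(..., 12, 10, 5, EOS)` would still suppress EOS because only 2 new tokens exist, whereas `applyMinLength` with `minLength = 10` would already allow it.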

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Sep 16, 2023

The documentation is not available anymore as the PR was closed or merged.

@xenova xenova merged commit 11f6a08 into main Sep 17, 2023
Development

Successfully merging this pull request may close these issues.

The generate API always returns the same number of tokens as output nomatter what is min_tokens