[Feature request] streamer callback for text-generation task #394
Hi there 👋 I definitely think the addition of an equivalent `TextStreamer` would be great! The current approach to text streaming (which was actually added before the Python library added `TextStreamer`) looks like this:

```js
const pipe = await pipeline(
  'text-generation',
  model,
  { quantized: true }
)

pipe(prompt, { callback_function: beams => { console.log(beams) } })
```

Here's an example of streaming + decoding:
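A minimal sketch of streaming + decoding with this v2 `callback_function` API (hypothetical code; the prompt and token budget are placeholders):

```js
import { pipeline } from '@xenova/transformers';

const pipe = await pipeline('text-generation', 'Xenova/LiteLlama-460M-1T');

let previousText = '';
await pipe('Once upon a time,', {
  max_new_tokens: 64,
  callback_function: beams => {
    // Decode everything generated so far, then print only the new suffix
    const text = pipe.tokenizer.decode(beams[0].output_token_ids, {
      skip_special_tokens: true
    });
    process.stdout.write(text.slice(previousText.length));
    previousText = text;
  }
});
```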
@xenova How can I define the callback_function to make the text generation stop at special words (like the OpenAI API's "stop" parameter)?
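One possible workaround, sketched here rather than taken from the thread: the v2 `callback_function` cannot abort generation, but you can watch for a stop string while streaming and truncate the final output afterwards (the stop strings below are placeholders):

```js
const STOP_WORDS = ['###', '\nQuestion:']; // placeholder stop strings

const output = await pipe(prompt, {
  max_new_tokens: 256,
  callback_function: beams => {
    const text = pipe.tokenizer.decode(beams[0].output_token_ids, {
      skip_special_tokens: true
    });
    // A streaming UI could stop rendering once a stop word appears
    if (STOP_WORDS.some(w => text.includes(w))) {
      // stop updating the UI here
    }
  }
});

// Truncate the final text at the first stop word found
let text = output[0].generated_text;
for (const w of STOP_WORDS) {
  const i = text.indexOf(w);
  if (i !== -1) text = text.slice(0, i);
}
```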
@xenova is this issue still open for contribution?
@xenova I realized after updating from version 2.17.2 to 3.x that streaming no longer works. Here is my code:

```js
const pipe = await pipeline('text-generation', 'Xenova/LiteLlama-460M-1T', {
  dtype: 'q8',
  model_file_name: 'decoder_model_merged'
})

let response = ''
await pipe(
  `
### Context: General Relativity and Special Relativity are two main topics in Relative Mechanics.
Einstein Field Equations is the mathematical model for General Relativity
### Question: What is Relative Mechanics?
### Response:`,
  {
    max_length: 500,
    skip_prompt: true,
    callback_function: beams => {
      const tokens = beams[0].output_token_ids
      // Decode only the newest token and append it
      const decodedText = pipe.tokenizer.decode(tokens.slice(-1), {
        skip_special_tokens: true
      })
      response += decodedText
      process.stdout.write(decodedText)
    }
  }
)
console.info(response)
```

With this code, versions 3.0.0, 3.0.1, and 3.0.2 do not stream the tokens at all; only version 2.17.2 streams as expected.
The callback_function doesn't get triggered. It looks like the quality of the code is questionable; main areas do not work. The npm package already has install issues, and now this too.
Hi all 👋 Apologies for not updating the thread. In Transformers.js v3, the non-standard `callback_function` has been replaced by the `streamer` option and the `TextStreamer` class, matching the Python library:

```js
import { pipeline, TextStreamer } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "onnx-community/Qwen2.5-Coder-0.5B-Instruct",
  { dtype: "q4" },
);

// Define the list of messages
const messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Write a quick sort algorithm." },
];

// Create text streamer
const streamer = new TextStreamer(generator.tokenizer, {
  skip_prompt: true,
  callback_function: (text) => console.log(text), // Optional callback function
});

// Generate a response
const output = await generator(messages, { max_new_tokens: 512, do_sample: false, streamer });
console.log(output[0].generated_text.at(-1).content);
```

Let me know if that helps! Also, if someone would be interested in contributing this to the docs and example projects, that would be amazing 🤗
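A small variation (not from the thread): `console.log` prints a newline after each chunk, so for inline streaming in a terminal, `process.stdout.write` may read better:

```js
// Same TextStreamer, but chunks are written inline without extra newlines
const streamer = new TextStreamer(generator.tokenizer, {
  skip_prompt: true,
  callback_function: (text) => process.stdout.write(text),
});
```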
Resolved in #1066 by adding streaming documentation and enhancing type support.
Working, thanks!
Streamer
https://huggingface.co/docs/transformers/generation_strategies#streaming
Reason for request
Currently, iterating with `max_new_tokens: 1` takes much longer than a single generation call. Text generation takes time even for a light model, and token streaming is a key feature for user experience. In my case, task-specific text generation could be a key feature of low-cost AI app development using transformers.js. (The workaround is sketched below.)
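A sketch of that workaround (hypothetical code, not from the thread): emulating streaming by generating one token at a time, which re-processes the ever-growing prompt on every call and is therefore far slower than a single generation:

```js
let text = prompt; // placeholder prompt string
for (let i = 0; i < 100; i++) {
  const output = await pipe(text, { max_new_tokens: 1 });
  const next = output[0].generated_text; // includes the prompt
  if (next.length <= text.length) break; // no new token produced
  process.stdout.write(next.slice(text.length));
  text = next; // feed the extended text back in
}
```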
Additional context
I'm not sure whether the TextStreamer class needs to be compatible with Python transformers. I wrote a use-case proposal with TextStreamer extends TransformStream; AsyncIterable, AsyncGenerator, and the Stream API might also be usable (see the sketch below).
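A minimal sketch of that proposal (hypothetical; it assumes a tokenizer with a `decode` method): token IDs are written into the stream, and decoded text chunks come out of the readable side:

```js
class TextStreamer extends TransformStream {
  constructor(tokenizer) {
    const tokens = [];
    let previousText = '';
    super({
      transform(tokenId, controller) {
        tokens.push(tokenId);
        const text = tokenizer.decode(tokens, { skip_special_tokens: true });
        controller.enqueue(text.slice(previousText.length)); // emit only the new text
        previousText = text;
      }
    });
  }
}
```

Where `ReadableStream` is async-iterable (e.g. Node 18+), consumers could then read decoded chunks with `for await (const chunk of streamer.readable) { ... }`.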
Suggesting streaming code
This is Vercel's approach:
https://github.com/vercel/ai/blob/main/packages/core/streams/ai-stream.ts
https://github.com/vercel-labs/ai-chatbot/blob/main/app/api/chat/route.ts