
[bug] Async streaming validation duplicates output in the presence of multiple validators #1090

Open
JosephCatrambone opened this issue Sep 25, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@JosephCatrambone
Contributor

Quick thank-you to Discord user new for the report. Link to the thread: https://discord.com/channels/1085077079697150023/1288085320805388298/1288085864521666591

Describe the bug
When AsyncGuard.use_many is given multiple validators (in this case, DetectPII and ToxicLanguage) with on_fail="fix", the stream of outputs is duplicated.

To Reproduce
Steps to reproduce the behavior:

import asyncio
import os
from dotenv import load_dotenv
import litellm
import openai
import guardrails

from guardrails.hub import DetectPII, ToxicLanguage


# Load environment variables
load_dotenv()
openai.api_base = os.getenv("OPENAI_API_BASE")
openai.api_key = os.getenv("OPENAI_API_KEY")


# VERSION 1
guard = guardrails.AsyncGuard().use_many(
    DetectPII(pii_entities="pii", on_fail="fix")
)

# VERSION 2
# guard = guardrails.AsyncGuard().use_many(
#    DetectPII(pii_entities="pii", on_fail="fix"), ToxicLanguage(on_fail="fix")
# )

async def generate_text():
    fragment_generator = await guard(
            litellm.acompletion,
            api_key=openai.api_key,
            api_base=openai.api_base,
            model="openai/mistralai/Mistral-Nemo-Instruct-2407",
            messages=[
                {"role": "system", "content": "Only write my sentences provided please and nothing else please."},
                {
                    "role": "user",
                    "content": """Peter is funny and lives in New York. My name is Peter. Who are you Brian ?""",
                },
            ],
            max_tokens=1024,
            temperature=0,
            stream=True,
        )
    
    text = ""
    async for op in fragment_generator:
        print(op)
        await asyncio.sleep(0)
        text += op.validated_output

    print(text)


# Run the async function to generate text
asyncio.run(generate_text())

Expected behavior
Example model output: "My friend Alex is a researcher at Purdue University. (Numerous Obscenities)"
Expected cleaned output: "My friend is a researcher at ."
Observed output: "My friend My friend friend is a researcher at friend is a researcher at ."
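A hypothetical, self-contained model of the failure mode (this is not Guardrails internals, just an illustration): if each chained validator re-emits the full accumulated buffer rather than only the newly validated delta, every earlier fragment reappears once per validator, producing output like the observed string above.

```python
def buggy_stream(chunks, n_validators):
    """Hypothetical model: each validator pass re-emits the whole
    accumulated text instead of only the newly validated delta."""
    acc = ""
    for chunk in chunks:
        acc += chunk
        for _ in range(n_validators):  # one re-emission per chained validator
            yield acc


def fixed_stream(chunks, n_validators):
    """Expected behavior: each fragment is emitted exactly once,
    no matter how many validators are chained."""
    for chunk in chunks:
        yield chunk


chunks = ["My friend ", "is a researcher."]
print("".join(buggy_stream(chunks, 2)))  # earlier fragments repeated
print("".join(fixed_stream(chunks, 2)))  # original text, unduplicated
```

With two validators, the buggy variant emits "My friend " twice before the full sentence ever appears, which matches the shape of the duplication seen in the observed output.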

Library version:
Guardrails 0.5.10

Additional context
Happens in a notebook and in a terminal.

@JosephCatrambone JosephCatrambone added the bug Something isn't working label Sep 25, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added the Stale label Oct 26, 2024
@JosephCatrambone
Contributor Author

CC @nichwch I think your async changes fixed this, right?

@github-actions github-actions bot removed the Stale label Nov 1, 2024

github-actions bot commented Dec 1, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 14 days.
