
Connect the generators to prompts in langfuse #1154

Open
alex-stoica opened this issue Oct 24, 2024 · 14 comments
Labels
feature request (Ideas to improve an integration), integration:langfuse, P3

Comments

@alex-stoica
Contributor

Is your feature request related to a problem? Please describe.
Hello! Currently, pipeline runs (equivalent to traces in Langfuse) do not have their corresponding prompts tied to the "generation" blocks. This can be seen in the example from the Haystack Langfuse Integration blog post.
[image]

Describe the solution you'd like
When prompts exist, they should be attached in the tracer (tracer.py), as in the following example:

m = meta[0]
try:
    prompt = self._tracer.get_prompt(m.get("prompt_name")) # assumes that in model's output there is an additional parameter "prompt_name"
    if prompt:
        span._span.update(prompt=prompt)
except Exception as e:
    print(f"Prompt not found or error occurred: {e}")
span._span.update(usage=m.get("usage") or None, model=m.get("model"))

This would ensure that prompts are tied to the corresponding generation blocks, giving better traceability.

Describe alternatives you've considered
N/A

Additional context
Desired output:
[image]

One challenge is that the prompt name is not available in the generator by default. This could complicate the integration, as it may require accessing the prompt metadata manually. In some cases, creating a custom generator that explicitly handles the prompt name and metadata might be necessary for the Langfuse integration, along with allowing that custom generator to be traced (see #1153).

alex-stoica added the feature request label on Oct 24, 2024
@vblagoje
Member

Hey @alex-stoica, let me see if I understand everything correctly here. The assumption that prompt_name is, by default, added to ChatMessage meta seems a bit over the top and rather Langfuse-specific. Am I understanding this correctly? Isn't this prompt_name info available somewhere in the Langfuse meta by any chance?

@alex-stoica
Contributor Author

Hello @vblagoje,
To add context: tracking users, prompts, and generation scores is valuable—especially for analyzing prompts. Currently, Haystack's Langfuse integration:

  • Creates a trace per run (good, 1:1 correspondence)
  • Logs a generation per generator run in the pipeline (also useful)

However, it lacks prompt linkage to generations. In Langfuse, prompts (with model params) typically link directly to generations, aiding in tracking I/O per prompt and evaluating prompt effectiveness.

To address your question: yes, it's specific to Langfuse to require prompt names/objects for generations, and Haystack's modular design doesn't natively support this.

What I did on my side was to:

  1. Create a custom prompt builder that expects either a prompt name to load from Langfuse or a classical Jinja prompt template
  2. Also attach the prompt when handling the generator in the tracer:
        if tags.get("haystack.component.type") in _SUPPORTED_GENERATORS:
            
            meta = span._data.get("haystack.component.output", {}).get("meta")
            if meta:
                # Haystack returns one meta dict for each message, but the 'usage' value
                # is always the same, let's just pick the first item
                m = meta[0]
                print(m)
                try:
                    prompt = self._tracer.get_prompt(m.get("prompt_name"))
                    if prompt:
                        span._span.update(prompt=prompt)
                except Exception as e:
                    print(f"Prompt not found or error occurred: {e}")

This is a patch I wouldn't put into production, but I see the need to link multiple components together when tracing in the future, and that's not easy to fix given Haystack's "blocky" design.

@vblagoje
Member

I wouldn't necessarily agree with the "blocky" design statement :-) but let's not go there, let's build. In the Langfuse integration we can indeed introduce a LangfusePromptBuilder that loads prompts directly from the Langfuse platform, renders them, and also injects this prompt_name field into meta. This way we can link the two and continue to build the Langfuse integration further. Would you be willing to contribute this? I can review and approve.

@alex-stoica
Contributor Author

I’d be happy to contribute. However, my local implementation, which worked, took the following approach:

  • created a custom LangfusePromptBuilder that loads the prompt by name if a name is provided; otherwise, it creates a prompt based on the template (either a template or a name must be supplied).
  • the CustomLangfuseGenerator accepts the prompt name as input and also includes it in the output.
  • in tracer.py, the prompt is loaded again by name and attached to the Langfuse span.

This approach functions more as a patch than a robust design. Langfuse expects the prompt object to be attached, not the prompt name. I needed a quick fix, so in my case the prompt is loaded, compiled, passed both as a name and as a compiled template into the generator, returned as a name, and then loaded again for Langfuse. This cycle feels inelegant, so I'd appreciate any guidance you could offer on a cleaner, more efficient implementation.

@vblagoje
Member

We could do this with a simpler approach that works with all chat generators (we are deprecating the non-chat generators, btw). The LangfusePromptBuilder component can have a prompt_name field that is passed during init, or in the run method as an optional parameter that overrides the init-given value. In the tracer this value can be looked up (we check for the presence of LangfusePromptBuilder) and the prompt_name attached to the span. This way it'll show up in the Langfuse trace website as you depicted above. LMK
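
A minimal sketch of what that could look like, assuming a composition-based LangfusePromptBuilder (names and signatures here are illustrative, not actual integration code):

from typing import Any, Dict, Optional

from haystack import component
from haystack.components.builders import PromptBuilder


@component
class LangfusePromptBuilder:
    def __init__(self, template: str, prompt_name: Optional[str] = None):
        # Wrap the standard PromptBuilder; prompt_name is only metadata here
        self._builder = PromptBuilder(template=template)
        self._prompt_name = prompt_name

    @component.output_types(prompt=str, prompt_name=str)
    def run(self, template_variables: Optional[Dict[str, Any]] = None, prompt_name: Optional[str] = None):
        # A run-time prompt_name, if given, overrides the init-time one
        result = self._builder.run(template_variables=template_variables)
        result["prompt_name"] = prompt_name or self._prompt_name
        return result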

@vblagoje
Member

vblagoje commented Nov 11, 2024

I looked a bit more into this one and we could do the following in LangfuseChatPromptBuilder:

  • we extend our default ChatPromptBuilder
  • the current template: Optional[List[ChatMessage]] = None init param becomes template: Optional[Union[str, List[ChatMessage]]] = None to accommodate Langfuse prompt ids
  • the run method's template param can also be a str (for a Langfuse prompt)
  • only deal with Langfuse-specific prompt loading in LangfuseChatPromptBuilder, and delegate everything else to the superclass

In the LangfuseTracer class from the langfuse-haystack integration we introspect the component to see if it is a LangfuseChatPromptBuilder and, if it is, update the span with the prompt_name data.
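
A hypothetical version of that check, modeled on the trace() snippets already quoted in this thread (the component type string and the get_prompt call are assumptions, not the shipped tracer code):

if tags.get("haystack.component.type") == "LangfuseChatPromptBuilder":
    output = span._data.get("haystack.component.output", {})
    prompt_name = output.get("prompt_name")
    if prompt_name:
        # re-fetch the prompt object by name and attach it to the span
        span._span.update(prompt=self._tracer.get_prompt(prompt_name))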

Having said this, I notice that these explicit hardcoded component handlers in the LangfuseTracer trace method are a definite smell, and that we should have hooks to install handlers with custom span-update functions, both predefined and user-defined, provided via LangfuseConnector somehow. LMK

@alex-stoica
Contributor Author

I will do a PR soon. I'll start by trying LangfusePromptBuilder before moving to LangfuseChatPromptBuilder, as I think the initial prompt is much more critical in completion use cases than in chat use cases.

Two points though:

  • Extending PromptBuilder / ChatPromptBuilder is a bit tricky due to the @component decorator; my implementation creates a PromptBuilder within LangfusePromptBuilder, but I’m considering some alternatives.
  • I see a need for "pipeline storage" within LangfuseTracer, as it would simplify avoiding multiple loads of the prompt object.

Regarding the second point: basically, the prompt object is retrieved from the "storage":
[image]

        if tags.get("haystack.component.type") in _SUPPORTED_GENERATORS:
            meta = span._data.get("haystack.component.output", {}).get("meta")
            if meta:
                # Haystack returns one meta dict for each message, but the 'usage' value
                # is always the same, let's just pick the first item
                m = meta[0]
                span._span.update(usage=m.get("usage") or None, model=m.get("model"))

                if m.get("prompt_name") is not None:
                    prompt_name = m["prompt_name"]
                    prompt_obj = self.get_pipeline_run_context().get(prompt_name)
                    
                    if prompt_obj:
                        span._span.update(prompt=prompt_obj)

If this sounds good to you, I’ll proceed with the PR.

@vblagoje
Member

@alex-stoica your approach seems a bit more custom to your use case. Perhaps we should hold off the PR in that case.

A few notes:

  • We have extended components on multiple occasions; simply add @component on the subclass declaration - no major gotchas (see the sketch after this list)
  • We'll soon remove all generators and accompanying components (including non-chat prompt components)
  • Going forward, only the ChatGenerator approach will be supported
  • Langfuse prompts are cached
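
For reference, a minimal sketch of the subclassing pattern mentioned in the first bullet (assuming the @component decorator is simply re-applied on the subclass, which inherits run() from ChatPromptBuilder):

from haystack import component
from haystack.components.builders import ChatPromptBuilder


@component
class LangfuseChatPromptBuilder(ChatPromptBuilder):
    # Override __init__/run here to resolve a Langfuse prompt name into a
    # ChatMessage template, delegating the actual rendering to the superclass.
    ...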

@alex-stoica
Contributor Author

Thank you for the clarification on prompt caching. This likely removes the necessity of retrieving prompt_obj from self.get_pipeline_run_context(); a direct call to prompt_obj = tracer.get_prompt(prompt_name) should suffice.

Regarding @component, I understand creating a new component is manageable, and I've created multiple components in the past. However, when both the base class and the subclass are decorated, handling super calls can be a bit more nuanced. To address this, I opted for composition in my local implementation: I create a PromptBuilder inside LangfusePromptBuilder and invoke it directly as needed. Here's the code for reference:

from typing import Any, Dict, List, Optional, Set

from haystack import component, tracing
from haystack.components.builders import PromptBuilder

# Import path assumed to match the langfuse-haystack integration package layout
from haystack_integrations.tracing.langfuse import LangfuseTracer


@component
class LangfusePromptBuilder:
    """
    Wraps the default PromptBuilder (composition rather than inheritance) to handle Langfuse prompts.

    This component allows you to specify a prompt by its name in Langfuse,
    or by providing a prompt template string.

    If a Langfuse prompt name is provided, it loads the prompt from Langfuse,
    renders it using the provided variables, and returns the rendered prompt.
    """

    def __init__(
        self,
        template: Optional[str] = None,
        langfuse_prompt_name: Optional[str] = None,
        required_variables: Optional[List[str]] = None,
        variables: Optional[List[str]] = None,
    ):
        if template is not None and langfuse_prompt_name is not None:
            raise ValueError("Only one of 'template' or 'langfuse_prompt_name' should be provided, not both.")
        
        self.required_variables = required_variables or []
        self.variables = variables or []
        
        self.langfuse_tracer = self.get_langfuse_tracer() 
        if langfuse_prompt_name:
            self.langfuse_prompt_name = langfuse_prompt_name
            self.prompt_obj = self.langfuse_tracer._tracer.get_prompt(langfuse_prompt_name)
            self._template_string = self.prompt_obj.prompt
            if not self._template_string:
                raise RuntimeError(f"Prompt '{langfuse_prompt_name}' not found in Langfuse.")
        elif template is not None:
            self._template_string = template
            self.langfuse_prompt_name = None
            self.prompt_obj = None  # no Langfuse prompt object when a plain template is used
        else:
            raise ValueError("One of 'template' or 'langfuse_prompt_name' must be provided.")

        self.prompt_builder = PromptBuilder(
            template=self._template_string, 
            required_variables=required_variables, 
            variables=variables
        )

        self.set_input_types()

    def get_langfuse_tracer(self) -> LangfuseTracer: 
        langfuse_client = tracing.tracer.actual_tracer
        if isinstance(langfuse_client, LangfuseTracer):
            return langfuse_client
        else:
            raise RuntimeError("Tracer is not of type LangfuseTracer. Cannot proceed.")  

    def set_input_types(self):
        variables = self.extract_new_input_variables()
        for var, var_type in variables.items():
            component.set_input_type(self, var, var_type)

    def extract_new_input_variables(self):
        predefined_sockets = {"template", "template_variables"}
        new_input_variables = {}

        if hasattr(self.prompt_builder, "__haystack_input__"):
            if hasattr(self.prompt_builder.__haystack_input__, "_sockets_dict"):
                input_sockets = self.prompt_builder.__haystack_input__._sockets_dict
                for variable_name, socket in input_sockets.items():
                    if variable_name not in predefined_sockets:
                        new_input_variables[variable_name] = socket.type

        return new_input_variables


    @component.output_types(prompt=str, prompt_name=str)
    def run(
        self,
        template: Optional[str] = None,
        template_variables: Optional[Dict[str, Any]] = None,
        **kwargs,
    ):
        template_to_use = template or self._template_string  
        if not template_to_use:
            raise ValueError("No template provided to render the prompt.") 

        response = self.prompt_builder.run(
            template=template_to_use, 
            template_variables=template_variables, 
            **kwargs
        )
        prompt_obj = self.prompt_obj or ""

        # pb_context = {self.langfuse_prompt_name: prompt_obj} no longer required because of the caching 
        # self.langfuse_tracer.update_pipeline_run_context(**pb_context) no longer required because of the caching 

        response['prompt_name'] = self.langfuse_prompt_name
        return response
    

    def _validate_variables(self, provided_variables: Set[str]):
        missing_variables = [var for var in self.required_variables if var not in provided_variables]
        if missing_variables:
            missing_vars_str = ", ".join(missing_variables)
            raise ValueError(
                f"Missing required input variables in LangfusePromptBuilder: {missing_vars_str}. "
                f"Required variables: {self.required_variables}. Provided variables: {provided_variables}."
            )

    def to_dict(self) -> Dict[str, Any]:
        class_path = f"{self.__class__.__module__}.{self.__class__.__name__}"  
        init_params = {
            'required_variables': self.required_variables,
            'variables': self.variables
        }
        if self.langfuse_prompt_name is not None:
            init_params['langfuse_prompt_name'] = self.langfuse_prompt_name
        else:
            init_params['template'] = self._template_string
        data = {
            'type': class_path,
            'init_parameters': init_params
        }
        return data

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "LangfusePromptBuilder":
        init_params = data.get("init_parameters", {})
        return cls(
            template=init_params.get("template"),
            langfuse_prompt_name=init_params.get("langfuse_prompt_name"),
            required_variables=init_params.get("required_variables"),
            variables=init_params.get("variables"),
        )

@silvanocerza
Contributor

@alex-stoica composition is a good approach in my opinion. 👍

@alex-stoica
Contributor Author

@silvanocerza thanks!

@vblagoje I am more than willing to contribute, but there are some concerns

  1. The current Langfuse integration is failing. I did a PR for this, but moving forward before solving it can be tricky.
  2. You plan to remove generators: I don't know what this redesign will imply. I find it important to keep this issue in mind when removing them, though.

My proposed solutions for the langfuse-prompt-obj <-> generation linkage are:
A. Creating temporary storage inside the tracer (like a context variable) and pushing/pulling from it. You said this is too specific to my use case, so we can disregard it, but I personally see it as a little more extensible and elegant. Here we could also store user id, session id, etc.
B. Passing variables through the generators in meta, i.e. passing the prompt name as an additional parameter through the generator to ensure accessibility.

Refactoring the generators should leave an open way to implement B, if this is the route you want to pursue.

@vblagoje
Member

Yes @alex-stoica, let's address the parent span issue and the PR first. So in solution B we'd attach the prompt name in ChatMessage meta, but the generator will return a new message that won't have that prompt name attached. Wouldn't that be a problem, as we are tracing the LLM response and tying it back to the prompt metadata here? Perhaps I didn't understand you completely here...

@alex-stoica
Contributor Author

alex-stoica commented Nov 13, 2024

Not quite, I might have explained it unclearly. I see B as a way to pass the prompt_name (or possibly other extra inputs) unaltered through the generators. For example:

response = ...run(**regular_inputs)
if extra_inputs: 
    response['meta']['extra_outputs'] = extra_inputs

In the tracer:

if m.get('extra_outputs'):
   # add prompt etc ...

In A:

  • maybe some decorator that stores extra_inputs in a special place (a contextvar in the tracer; I don't really have a concrete implementation in mind). This would typically apply to prompt builders and allow prompt-related metadata to be accessed when needed.

In the tracer:

if self.get_extra_outputs_from_context(component_name):
   # add prompt etc ...

If neither A nor B is implemented, another component like LangfuseGenerator should be created that does exactly what a regular generator does plus A (receives extra input and returns that extra input in its output meta).
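
A rough sketch of that fallback, assuming a composition-based wrapper around an existing generator (the component name and wiring are illustrative only):

from typing import Any, Dict, List, Optional

from haystack import component
from haystack.components.generators import OpenAIGenerator


@component
class LangfuseGenerator:
    def __init__(self, generator: OpenAIGenerator, prompt_name: Optional[str] = None):
        self._generator = generator
        self._prompt_name = prompt_name

    @component.output_types(replies=List[str], meta=List[Dict[str, Any]])
    def run(self, prompt: str, prompt_name: Optional[str] = None):
        # Delegate generation, then echo the extra input back into meta so the
        # tracer can pick it up from haystack.component.output
        result = self._generator.run(prompt=prompt)
        name = prompt_name or self._prompt_name
        if name:
            for m in result["meta"]:
                m["prompt_name"] = name
        return result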

@LastRemote
Contributor

Following up on the ContextVars discussion in #1184:
I would prefer to pass the information through meta instead of ContextVars. IMO ContextVars are better used to track global attributes of each pipeline run, like session_id or user_id, whereas prompts are strongly bound to a particular generation. A complicated case would be multiple prompt builders connected to multiple chat generators, where some of them might not be executed in every pipeline run; the relations between prompts and generations could get a little tricky if we used a ContextVars table.

I would need to think more about this, but I'd assume having a higher-level component (which includes the prompt builder & generator) might be the cleanest way to handle this, instead of retrieving inter-component relations on the generator/tracer end.
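
To make that idea concrete, a rough sketch of a higher-level component that owns both the prompt builder and the chat generator (all names here are illustrative; this is not an existing component):

from typing import Any, Dict, List, Optional

from haystack import component
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage


@component
class PromptedChatGenerator:
    def __init__(self, template: List[ChatMessage], prompt_name: Optional[str] = None):
        self._builder = ChatPromptBuilder(template=template)
        self._generator = OpenAIChatGenerator()
        self._prompt_name = prompt_name

    @component.output_types(replies=List[ChatMessage])
    def run(self, template_variables: Optional[Dict[str, Any]] = None):
        # Render the prompt and generate in one step, so the prompt metadata
        # stays attached to the generation it produced
        messages = self._builder.run(template_variables=template_variables)["prompt"]
        result = self._generator.run(messages=messages)
        if self._prompt_name:
            for reply in result["replies"]:
                reply.meta["prompt_name"] = self._prompt_name
        return result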
