get_qa_with_reference is always None where as get_retrieved_documents works fine. #3408

Open
axiomofjoy opened this issue Jun 7, 2024 · 9 comments
Assignees
Labels
bug Something isn't working cannot reproduce A bug that cannot be reproduced stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed

Comments

@axiomofjoy

get_qa_with_reference is always None, whereas get_retrieved_documents works fine.

I always get "No spans found."
  File "C:\anaconda3\Lib\site-packages\phoenix\evals\classify.py", line 354, in run_evals
    total_records = len(dataframe)
                    ^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()

Anyone else facing the same issue?

Originally posted by @snagrecha in #2647 (comment)

@axiomofjoy

@snagrecha Can you ensure you are passing in a non-null dataframe?
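
One way to act on this suggestion is to check the query result before handing it to run_evals. The sketch below is illustrative only: require_records is a hypothetical helper, not part of Phoenix. It relies solely on what the traceback above shows, namely that run_evals calls len() on its dataframe argument.

```python
from typing import Optional, Sized, TypeVar

T = TypeVar("T", bound=Sized)


def require_records(frame: Optional[T], name: str) -> T:
    """Return `frame` unchanged, or raise a descriptive error if it is unusable.

    Hypothetical helper, not part of Phoenix. It accepts any object that
    supports len(), such as a pandas DataFrame.
    """
    if frame is None:
        raise ValueError(f"{name} is None: the span query returned no data")
    if len(frame) == 0:
        raise ValueError(f"{name} is empty: nothing to evaluate")
    return frame
```

For example, wrapping the call as require_records(get_qa_with_reference(px.Client()), "queries_df") would fail early with a message naming the offending dataframe instead of raising the TypeError inside run_evals.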

@dosubot dosubot bot added the bug Something isn't working label Jun 7, 2024
@snagrecha

Hi, here's the code being used. (It sends a single query to the RAG pipeline, which fetches a response. The query comes from user input on the console.)

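import phoenix as px
from phoenix.session.evaluation import get_qa_with_reference, get_retrieved_documents


def run_simple_rag():
    # create_simple_rag_workflow() builds the compiled graph; it is not defined in this snippet.
    app = create_simple_rag_workflow()
    question = input("Enter query: ")
    state = {"question": question}
    response = app.invoke(state)
    print(response['generation'])

    # Pull the traced spans back out of Phoenix.
    span_df = px.Client().get_spans_dataframe()
    queries_df = get_qa_with_reference(px.Client())
    retrieved_documents_df = get_retrieved_documents(px.Client())

    print('spans_df: \n', span_df)
    print('queries_df: \n', queries_df)
    print('retrieved_documents_df: \n', retrieved_documents_df)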

Following is the output:

C:\Users\anaconda3\Lib\site-packages\phoenix\trace\dsl\query.py:746: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use ser.iloc[pos]
  df_attributes = pd.DataFrame.from_records(
No spans found.

spans_df:
name ... attributes.llm.prompt_template.variables
context.span_id ...
5b2d9bc299a7a78b start ... None
432d42b97f1f14db Retriever ... None
de05635eecefab29 ChannelWrite<retrieve,question,generation,docu... ... None
3f5030063de0cb6a retrieve ... None
5b3d4f2558f82e95 PromptTemplate ... None
97cea28505636399 ChatOllama ... None
d6b33aeba66a6019 StrOutputParser ... None
b9abcf17de11fdbd RunnableSequence ... {'context': ['page_content='RAG-FUSION: A NEW ...

[8 rows x 23 columns]

queries_df:
None

retrieved_documents_df:
context.trace_id ... reference
context.span_id document_position ...
432d42b97f1f14db 0 3c9587db7dfa3e83ba0959701964783e ... RAG-FUSION: A NEW TAKE ON RETRIEVAL-AUGMENTED...
1 3c9587db7dfa3e83ba0959701964783e ... RAG-Fusion: a New Take on Retrieval-Augmented ...
2 3c9587db7dfa3e83ba0959701964783e ... RAG-Fusion: a New Take on Retrieval-Augmented ...
3 3c9587db7dfa3e83ba0959701964783e ... RAG-Fusion: a New Take on Retrieval-Augmented ...

[4 rows x 3 columns]

And here is the phoenix output:
[Screenshot of the Phoenix UI; image not included]

@axiomofjoy

axiomofjoy commented Jun 10, 2024

Thank you for the reply @snagrecha. It's tough to understand the topology of your traces from the code snippet you sent. From your screenshot, it looks like you're using LangChain. Can you point me toward the particular chain you are using?

@axiomofjoy axiomofjoy added the cannot reproduce A bug that cannot be reproduced label Jun 10, 2024
@snagrecha

Hi @axiomofjoy, sorry for the rather abstract snippet. Here is the chain, along with a screenshot of the traces, to make the flow easier to understand.

import phoenix as px
from langgraph.graph import END, StateGraph
from phoenix.evals import (
    HallucinationEvaluator,
    LiteLLMModel,
    QAEvaluator,
    RelevanceEvaluator,
    run_evals,
)
from phoenix.session.evaluation import get_qa_with_reference, get_retrieved_documents
from phoenix.trace import DocumentEvaluations, SpanEvaluations

# GraphState, retrieve, and generate are defined elsewhere and are not shown here.


def create_simple_rag_workflow():
    workflow = StateGraph(GraphState)

    workflow.add_node("retrieve", retrieve)
    workflow.add_node("generate", generate)

    # Linear flow: retrieve -> generate -> END
    workflow.set_entry_point("retrieve")
    workflow.add_edge("retrieve", "generate")
    workflow.add_edge("generate", END)

    app = workflow.compile()

    return app


def run_simple_rag():
    app = create_simple_rag_workflow()
    question = input("Enter query: ")
    state = {"question": question}
    response = app.invoke(state)
    print(response['generation'])
    run_eval()


def run_eval():
    eval_model = LiteLLMModel(model='ollama/llama3')

    hallucination_evaluator = HallucinationEvaluator(eval_model)
    qa_correctness_evaluator = QAEvaluator(eval_model)
    relevance_evaluator = RelevanceEvaluator(eval_model)

    queries_df = get_qa_with_reference(px.Client())
    retrieved_documents_df = get_retrieved_documents(px.Client())

    # run_evals returns one dataframe per evaluator, in the order given
    # (this unpacking matches the call shown in the traceback below).
    hallucination_eval_df, qa_correctness_eval_df = run_evals(
        dataframe=queries_df,
        evaluators=[hallucination_evaluator, qa_correctness_evaluator],
        provide_explanation=True,
    )

    relevance_eval_df = run_evals(
        dataframe=retrieved_documents_df,
        evaluators=[relevance_evaluator],
        provide_explanation=True,
    )[0]

    px.Client().log_evaluations(
        SpanEvaluations(eval_name="Hallucination", dataframe=hallucination_eval_df),
        SpanEvaluations(eval_name="QA Correctness", dataframe=qa_correctness_eval_df),
        DocumentEvaluations(eval_name="Relevance", dataframe=relevance_eval_df),
    )

Error traceback:

Enter query: what is rag fusion
NODE: retrieve
***Retrieving relevant documents
NODE: generate
Generating response from Llama
NODE: running arize-phoenix eval
C:\Users\anaconda3\Lib\site-packages\phoenix\trace\dsl\query.py:746: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use ser.iloc[pos]
  df_attributes = pd.DataFrame.from_records(
No spans found.
WARNI [phoenix.evals.executors] Async evals execution is not supported in non-main threads. Falling back to sync.
Traceback (most recent call last):
  File "C:\RAGS\main.py", line 66, in <module>
    run_simple_rag()
  File "C:\RAGS\simple_rag.py", line 264, in run_simple_rag
    response = app.invoke(state)
               ^^^^^^^^^^^^^^^^^
  File "C:\Users\anaconda3\Lib\site-packages\langgraph\pregel\__init__.py", line 1400, in invoke
    for chunk in self.stream(
  File "C:\Users\anaconda3\Lib\site-packages\langgraph\pregel\__init__.py", line 963, in stream
    _panic_or_proceed(done, inflight, step)
  File "C:\Users\anaconda3\Lib\site-packages\langgraph\pregel\__init__.py", line 1489, in _panic_or_proceed
    raise exc
  File "C:\Users\anaconda3\Lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\anaconda3\Lib\site-packages\langgraph\pregel\retry.py", line 66, in run_with_retry
    task.proc.invoke(task.input, task.config)
  File "C:\Users\anaconda3\Lib\site-packages\langchain_core\runnables\base.py", line 2399, in invoke
    input = step.invoke(
            ^^^^^^^^^^^^
  File "C:\Users\anaconda3\Lib\site-packages\langgraph\utils.py", line 95, in invoke
    ret = context.run(self.func, input, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\PycharmProjects\RAGS\simple_rag.py", line 61, in run_eval
    hallucination_eval_df, qa_correctness_eval_df = run_evals(
                                                    ^^^^^^^^^^
  File "C:\Users\anaconda3\Lib\site-packages\phoenix\evals\classify.py", line 354, in run_evals
    total_records = len(dataframe)
                    ^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()

Below is the screenshot of arize-phoenix traces without running eval:
[Screenshot of the Phoenix traces; image not included]

@axiomofjoy axiomofjoy self-assigned this Jun 11, 2024
@axiomofjoy axiomofjoy moved this from 📘 Todo to 👨‍💻 In progress in phoenix Jun 12, 2024
@axiomofjoy

Hey @snagrecha, thank you for your patience and thanks for the detailed scripts.

As background, get_qa_with_reference grabs input.value and output.value from the attributes on the root span of each trace, grabs retrieval.documents from any retriever spans, and joins the two together. Previous versions of Phoenix required the retriever span to be a direct child of the root span, but in the current version it can be any descendant. Judging from your traces, your retriever span is not a direct child of the root span, so I suspect you might be running an old version of Phoenix. Can you try upgrading to arize-phoenix==4.4.1 and see if the error persists?

If the error persists after upgrading, can you send a script I can run for the chain? Your example above does not include definitions for the retrieve and generate nodes.

Thanks for your help.
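
The join described above can be pictured with a small, self-contained sketch. This is not Phoenix's implementation: spans are modeled as plain dicts, the keys (trace_id, parent_id, kind, input, output, documents) and the separator are assumptions made for illustration, and the function name is hypothetical.

```python
from typing import Dict, List, Optional

SEPARATOR = "\n\n"  # assumed separator between concatenated documents


def join_qa_with_reference(spans: List[dict]) -> Optional[Dict[str, dict]]:
    """Join root-span input/output with retrieved documents, keyed by trace ID.

    Each span is a plain dict with hypothetical keys: trace_id, parent_id,
    kind, input, output, documents. Returns None when no root span exists.
    """
    # Root spans are the spans without a parent; they carry the question/answer.
    roots = {s["trace_id"]: s for s in spans if s.get("parent_id") is None}
    if not roots:
        return None

    # Collect documents from every retriever span, at any depth in the trace.
    docs: Dict[str, List[str]] = {}
    for s in spans:
        if s.get("kind") == "RETRIEVER":
            docs.setdefault(s["trace_id"], []).extend(s.get("documents", []))

    # Inner join: keep only traces that have both a root span and documents.
    return {
        trace_id: {
            "input": root["input"],
            "output": root["output"],
            "reference": SEPARATOR.join(docs[trace_id]),
        }
        for trace_id, root in roots.items()
        if trace_id in docs
    }
```

In this model, a trace whose spans all have a parent contributes nothing to the root side of the join, so the result is None even when retriever spans with documents are present.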

@axiomofjoy axiomofjoy moved this from 👨‍💻 In progress to 📘 Todo in phoenix Jun 12, 2024
@snagrecha

Hi @axiomofjoy, sorry for the late response.

I am using arize-phoenix 4.4.1, as you suggested, and the error persists. However, I think I have narrowed down the source.

qa_query = SpanQuery().select("span_id", **IO).where(IS_ROOT).with_index("trace_id")

This query seems to be the culprit: it returns an empty dataframe, which causes the whole problem.

@axiomofjoy

Hey @snagrecha. If this query for root spans comes back empty, your project either has no traces or is missing the root span for each trace. Since get_retrieved_documents is returning spans, I am guessing your project is not empty.

If you send a snippet that reproduces the issue, I'm happy to debug it.
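
To check the second possibility, one could look for traces in which no span lacks a parent. The sketch below works on plain records, for example rows exported from the spans dataframe. The helper name is hypothetical, and the column names parent_id and context.trace_id, as well as the convention that a root span's parent is None, are assumptions rather than confirmed Phoenix behavior.

```python
from typing import Iterable, List, Mapping


def traces_missing_root(
    spans: Iterable[Mapping],
    parent_key: str = "parent_id",          # assumed column name
    trace_key: str = "context.trace_id",    # assumed column name
) -> List[str]:
    """Return the IDs of traces in which every span has a parent."""
    seen = set()
    with_root = set()
    for span in spans:
        trace_id = span[trace_key]
        seen.add(trace_id)
        # A span with no parent is treated as the root of its trace.
        if span.get(parent_key) is None:
            with_root.add(trace_id)
    return sorted(seen - with_root)
```

A non-empty result would mean those traces have spans but no root span, which is the situation described above.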

@baichuan0113

Hi, I think I have the same issue: both queries_df and get_retrieved_documents(px.active_session()) are empty/None. Could this be because the input and output contain attributes rather than a direct value? If so, is there a way to solve it? Also, I noticed that the LangChain tutorial only covers RetrievalQA. Could there be a tutorial about LLMChain? Thanks. Here is a screenshot of my example input/output.
[Screenshot of example input/output; image not included]

dosubot bot commented Nov 30, 2024

Hi, @axiomofjoy. I'm Dosu, and I'm helping the Arize Phoenix team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • The get_qa_with_reference function returns None, leading to a TypeError when determining the length of a NoneType.
  • User snagrecha provided insights suggesting the issue might relate to trace structure and Phoenix version.
  • Upgrading to arize-phoenix==4.4.1 did not resolve the issue.
  • A query for root spans was identified as a potential cause, with similar issues noted by baichuan0113.
  • Further debugging and a reproducible script were requested but remain outstanding.

Next Steps:

  • Please confirm if this issue is still relevant with the latest version of Arize Phoenix. If so, you can keep the discussion open by commenting here.
  • If there is no further activity, this issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 30, 2024