get_qa_with_reference is always None where as get_retrieved_documents works fine. #3408

axiomofjoy · 2024-06-07T16:56:35Z

get_qa_with_reference is always None, whereas get_retrieved_documents works fine.

I always get "No spans found."
File "C:\anaconda3\Lib\site-packages\phoenix\evals\classify.py", line 354, in run_evals
total_records = len(dataframe)
^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()

Anyone else facing the same issue?

Originally posted by @snagrecha in #2647 (comment)


axiomofjoy · 2024-06-07T16:57:53Z

@snagrecha Can you ensure you are passing in a non-null dataframe?
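One way to make this failure easier to read is to check the query result before handing it to run_evals. The following is only a minimal sketch: `require_rows` is a hypothetical helper, not part of Phoenix, and it accepts anything that supports len(), which includes a pandas DataFrame.

```python
from typing import Optional, Sized, TypeVar

T = TypeVar("T", bound=Sized)


def require_rows(frame: Optional[T], name: str) -> T:
    """Return `frame` unchanged, or raise a descriptive error if it is None or empty."""
    if frame is None:
        # This is the case reported in this thread: the query helper returned None.
        raise ValueError(f"{name} is None: the span query returned no matching spans")
    if len(frame) == 0:
        raise ValueError(f"{name} is empty: nothing to evaluate")
    return frame
```

With the names from this thread, the call would look like `queries_df = require_rows(get_qa_with_reference(px.Client()), "queries_df")`, so the error points at the empty query rather than at `len(dataframe)` inside run_evals.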

snagrecha · 2024-06-10T15:37:02Z


Hi, here's the code being used. It sends a single query to the RAG pipeline, which fetches a response. The query comes from user input on the console.

def run_simple_rag():
    app = create_simple_rag_workflow()
    question = input("Enter query: ")
    state = {"question": question}
    response = app.invoke(state)
    print(response['generation'])
    span_df = px.Client().get_spans_dataframe()
    queries_df = get_qa_with_reference(px.Client())
    retrieved_documents_df = get_retrieved_documents(px.Client())

    print('spans_df: \n', span_df)
    print('queries_df: \n', queries_df)
    print('retrieved_documents_df: \n', retrieved_documents_df)

Following is the output:

C:\Users\anaconda3\Lib\site-packages\phoenix\trace\dsl\query.py:746: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use ser.iloc[pos]
df_attributes = pd.DataFrame.from_records(
No spans found.

spans_df:
name ... attributes.llm.prompt_template.variables
context.span_id ...
5b2d9bc299a7a78b start ... None
432d42b97f1f14db Retriever ... None
de05635eecefab29 ChannelWrite<retrieve,question,generation,docu... ... None
3f5030063de0cb6a retrieve ... None
5b3d4f2558f82e95 PromptTemplate ... None
97cea28505636399 ChatOllama ... None
d6b33aeba66a6019 StrOutputParser ... None
b9abcf17de11fdbd RunnableSequence ... {'context': ['page_content='RAG-FUSION: A NEW ...

[8 rows x 23 columns]

queries_df:
None

retrieved_documents_df:
context.trace_id ... reference
context.span_id document_position ...
432d42b97f1f14db 0 3c9587db7dfa3e83ba0959701964783e ... RAG-FUSION: A NEW TAKE ON RETRIEVAL-AUGMENTED...
1 3c9587db7dfa3e83ba0959701964783e ... RAG-Fusion: a New Take on Retrieval-Augmented ...
2 3c9587db7dfa3e83ba0959701964783e ... RAG-Fusion: a New Take on Retrieval-Augmented ...
3 3c9587db7dfa3e83ba0959701964783e ... RAG-Fusion: a New Take on Retrieval-Augmented ...

[4 rows x 3 columns]

And here is the phoenix output:

axiomofjoy · 2024-06-10T18:31:32Z

Thank you for the reply @snagrecha. It's tough to understand the topology of your traces from the code snippet you sent. It looks like you're using LangChain from your screenshot. Can you point me toward the particular chain you are using?

snagrecha · 2024-06-11T05:53:29Z

> Thank you for the reply @snagrecha. It's tough to understand the topology of your traces from the code snippet you sent. It looks like you're using LangChain from your screenshot. Can you point me toward the particular chain you are using?

Hi @axiomofjoy, sorry for the rather abstract snippet. Here is the chain, along with a snapshot of the traces to make the flow easier to follow.

def create_simple_rag_workflow():

    workflow = StateGraph(GraphState)

    workflow.add_node("retrieve", retrieve)
    workflow.add_node("generate", generate)

    workflow.set_entry_point("retrieve")
    workflow.add_edge("retrieve", "generate")
    workflow.add_edge("generate", END)

    app = workflow.compile()

    return app

def run_simple_rag():
    app = create_simple_rag_workflow()
    question = input("Enter query: ")
    state = {"question": question}
    response = app.invoke(state)
    print(response['generation'])
    run_eval()

def run_eval():
    eval_model = LiteLLMModel(model='ollama/llama3')
    
    hallucination_evaluator = HallucinationEvaluator(eval_model)
    qa_correctness_eval_df = QAEvaluator(eval_model)
    relevance_evaluator = RelevanceEvaluator(eval_model)

    queries_df = get_qa_with_reference(px.Client())
    retrieved_documents_df = get_retrieved_documents(px.Client())

    hallucination_eval_df = run_evals(
        dataframe=queries_df,
        evaluators=[hallucination_evaluator],
        provide_explanation=True,
    )

    relevance_eval_df = run_evals(
        dataframe=retrieved_documents_df,
        evaluators=[relevance_evaluator],
        provide_explanation=True,
    )[0]

    px.Client().log_evaluations(
        SpanEvaluations(eval_name="Hallucination", dataframe=hallucination_eval_df[0]),
        SpanEvaluations(eval_name="QA Correctness", dataframe=qa_correctness_eval_df),
        DocumentEvaluations(eval_name="Relevance", dataframe=relevance_eval_df),
    )

Error traceback:

Enter query: what is rag fusion
NODE: retrieve
***Retrieving relevant documents
NODE: generate
Generating response from Llama
NODE: running arize-phoenix eval
C:\Users\anaconda3\Lib\site-packages\phoenix\trace\dsl\query.py:746: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use ser.iloc[pos]
df_attributes = pd.DataFrame.from_records(
No spans found.
WARNI [phoenix.evals.executors] Async evals execution is not supported in non-main threads. Falling back to sync.
Traceback (most recent call last):
File "C:\RAGS\main.py", line 66, in <module>
run_simple_rag()
File "C:\RAGS\simple_rag.py", line 264, in run_simple_rag
response = app.invoke(state)
^^^^^^^^^^^^^^^^^
File "C:\Users\anaconda3\Lib\site-packages\langgraph\pregel\__init__.py", line 1400, in invoke
for chunk in self.stream(
File "C:\Users\anaconda3\Lib\site-packages\langgraph\pregel\__init__.py", line 963, in stream
_panic_or_proceed(done, inflight, step)
File "C:\Users\anaconda3\Lib\site-packages\langgraph\pregel\__init__.py", line 1489, in _panic_or_proceed
raise exc
File "C:\Users\anaconda3\Lib\concurrent\futures\thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\anaconda3\Lib\site-packages\langgraph\pregel\retry.py", line 66, in run_with_retry
task.proc.invoke(task.input, task.config)
File "C:\Users\anaconda3\Lib\site-packages\langchain_core\runnables\base.py", line 2399, in invoke
input = step.invoke(
^^^^^^^^^^^^
File "C:\Users\anaconda3\Lib\site-packages\langgraph\utils.py", line 95, in invoke
ret = context.run(self.func, input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\PycharmProjects\RAGS\simple_rag.py", line 61, in run_eval
hallucination_eval_df, qa_correctness_eval_df = run_evals(
^^^^^^^^^^
File "C:\Users\anaconda3\Lib\site-packages\phoenix\evals\classify.py", line 354, in run_evals
total_records = len(dataframe)
^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()

Below is the screenshot of arize-phoenix traces without running eval:

axiomofjoy · 2024-06-12T20:35:13Z

Hey @snagrecha, thank you for your patience and thanks for the detailed scripts.

As background, get_qa_with_reference grabs input.value and output.value from the attributes on the root span of each trace, grabs retrieval.documents from any retriever spans, and joins the two together. Previous versions of Phoenix required the retriever span to be a direct child of the root span, but in the current version the retriever span can be any descendant (not necessarily direct). From the look of your traces, your retriever span is not a direct child of the root span, so I suspect you might be running an old version of Phoenix. Can you try upgrading to arize-phoenix==4.4.1 and see if the error persists?
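The join described above can be illustrated with a toy model in plain Python. This is a sketch under stated assumptions, not Phoenix's implementation: the function name `qa_with_reference`, the dictionary keys (`parent_id`, `kind`, `input`, `output`, `documents`), and the choice to return None when the join produces nothing are all made up for illustration.

```python
from typing import Any, Optional


def qa_with_reference(spans: list[dict[str, Any]]) -> Optional[list[dict[str, Any]]]:
    """Toy join of root-span input/output with retriever documents, keyed by trace."""
    # Root spans are the ones with no parent; they carry the question and the answer.
    roots = {s["trace_id"]: s for s in spans if s.get("parent_id") is None}

    # Retriever spans may sit at any depth; they are matched on trace_id only.
    references: dict[str, list[str]] = {}
    for span in spans:
        if span.get("kind") == "RETRIEVER":
            references.setdefault(span["trace_id"], []).extend(span.get("documents", []))

    rows = [
        {
            "trace_id": trace_id,
            "input": root.get("input"),
            "output": root.get("output"),
            "reference": "\n\n".join(references[trace_id]),
        }
        for trace_id, root in roots.items()
        if trace_id in references
    ]
    # Assumption for this sketch: an empty join is reported as None.
    return rows or None
```

In this model, a trace whose root span is missing yields no rows even when its retriever span has documents, while a lookup that reads only the retriever spans would still succeed.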

If the error persists after upgrading, can you send a script I can run for the chain? Your example above does not include definitions for the retrieve and generate nodes.

Thanks for your help.

snagrecha · 2024-06-21T10:58:10Z

Hi @axiomofjoy, sorry for the late response.

The arize-phoenix version that I am using is 4.4.1, as you suggested. The error still persists. However, I think I have been able to narrow down the source.

qa_query = SpanQuery().select("span_id", **IO).where(IS_ROOT).with_index("trace_id")

This query seems to be the culprit: it returns an empty DataFrame, which causes the whole problem.

axiomofjoy · 2024-06-21T14:43:22Z

Hey @snagrecha. This query for root spans coming back empty suggests your project either has no traces or is missing the root span for each trace. I am guessing your project is not empty if get_retrieved_documents is returning spans.
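A quick way to test the "missing root span" hypothesis is to look for traces in which every span has a parent. The sketch below works on plain records so that it stays self-contained; `traces_without_root` is a hypothetical helper, and the `parent_id` column name is an assumption about the spans dataframe (`context.trace_id` appears in the output earlier in this thread).

```python
import math
from collections import defaultdict
from typing import Any


def _is_missing(value: Any) -> bool:
    """True for None or a float NaN, the two ways a missing parent may be encoded."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def traces_without_root(records: list[dict[str, Any]]) -> list[str]:
    """Return the IDs of traces that have spans but no span lacking a parent."""
    has_root: dict[str, bool] = defaultdict(bool)
    for record in records:
        trace_id = record["context.trace_id"]
        # A trace counts as rooted once any of its spans has no parent.
        has_root[trace_id] = has_root[trace_id] or _is_missing(record.get("parent_id"))
    return sorted(trace_id for trace_id, rooted in has_root.items() if not rooted)
```

Assuming the column names above, the records could come from `span_df.reset_index().to_dict("records")`. A non-empty result would mean those traces have no root span for the root-span query to match.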

If you send a snippet to reproduce the issue, I'm happy to debug the issue.

baichuan0113 · 2024-08-01T07:44:42Z

> Hey @snagrecha. This query for root spans coming back empty suggests your project either has no traces or is missing the root span for each trace. I am guessing your project is not empty if get_retrieved_documents is returning spans.
>
> If you send a snippet to reproduce the issue, I'm happy to debug the issue.

Hi, I think I have the same issue: both queries_df and get_retrieved_documents(px.active_session()) are empty/None. Is it possible that this is because the input and output contain attributes instead of a direct value? If so, is there a way to solve it? Also, I noticed that the LangChain tutorial only covers RetrievalQA. Could there be a tutorial for LLMChain? Thanks. Here is my example input/output image.

dosubot · 2024-11-30T16:04:19Z

Hi, @axiomofjoy. I'm Dosu, and I'm helping the Arize Phoenix team manage their backlog. I'm marking this issue as stale.

Issue Summary:

The get_qa_with_reference function returns None, leading to a TypeError when determining the length of a NoneType.
User snagrecha provided insights suggesting the issue might relate to trace structure and Phoenix version.
Upgrading to arize-phoenix==4.4.1 did not resolve the issue.
A query for root spans was identified as a potential cause, with similar issues noted by baichuan0113.
Further debugging and a reproducible script were requested but remain outstanding.

Next Steps:

Please confirm if this issue is still relevant with the latest version of Arize Phoenix. If so, you can keep the discussion open by commenting here.
If there is no further activity, this issue will be automatically closed in 7 days.

Thank you for your understanding and contribution!

github-project-automation bot added this to phoenix Jun 7, 2024

github-project-automation bot moved this to 📘 Todo in phoenix Jun 7, 2024

dosubot bot added the bug Something isn't working label Jun 7, 2024

axiomofjoy added the cannot reproduce A bug that cannot be reproduced label Jun 10, 2024

axiomofjoy self-assigned this Jun 11, 2024

axiomofjoy moved this from 📘 Todo to 👨‍💻 In progress in phoenix Jun 12, 2024

axiomofjoy moved this from 👨‍💻 In progress to 📘 Todo in phoenix Jun 12, 2024

dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 30, 2024
