
Exception raised in Job[1]: AttributeError('str' object has no attribute 'acomplete') #1768

Open
waadmhj opened this issue Dec 17, 2024 · 2 comments
Labels: bug Something isn't working
waadmhj commented Dec 17, 2024

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug
I tried downgrading to ragas 0.2.7, and tried to use llama3.2 for the LLM and the embeddings, but it keeps giving me these errors:
Exception raised in Job[51]: AttributeError('str' object has no attribute 'aembed_documents')
Exception raised in Job[25]: AttributeError('str' object has no attribute 'acomplete')

Ragas version: both 0.2.8 and 0.2.7
Python version: 3.11.10

Code to Reproduce
from datasets import load_from_disk
dataset = load_from_disk("data")
from ragas import EvaluationDataset

eval_dataset = EvaluationDataset.from_hf_dataset(dataset["eval"])

from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness, SemanticSimilarity
from ragas import evaluate

from ragas.llms import LlamaIndexLLMWrapper
evaluator_llm = LlamaIndexLLMWrapper("0ssamaak0/silma-v1")

from ragas.embeddings import LangchainEmbeddingsWrapper
embeddings = LangchainEmbeddingsWrapper("silma-ai/silma-embedding-matryoshka-v0.1")

metrics = [
    LLMContextRecall(llm=evaluator_llm),
    FactualCorrectness(llm=evaluator_llm),
    Faithfulness(llm=evaluator_llm),
    SemanticSimilarity(embeddings=embeddings),
]
results = evaluate(dataset=eval_dataset, metrics=metrics)
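
Note: the 'str' object errors suggest both wrappers are being handed plain model-name strings, while they appear to expect instantiated model objects. A minimal sketch of that usage follows; the Ollama and HuggingFaceEmbeddings classes and the model names are assumptions for illustration, not something confirmed in this thread:

# Hedged sketch: pass model *objects*, not model-name strings, to the wrappers.
# Ollama (assumes a local Ollama server) and HuggingFaceEmbeddings are assumptions.
from llama_index.llms.ollama import Ollama
from langchain_huggingface import HuggingFaceEmbeddings
from ragas.llms import LlamaIndexLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

evaluator_llm = LlamaIndexLLMWrapper(Ollama(model="llama3.2"))  # object, not str
embeddings = LangchainEmbeddingsWrapper(
    HuggingFaceEmbeddings(model_name="silma-ai/silma-embedding-matryoshka-v0.1")
)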

Error trace
With ragas 0.2.7 it didn't give me an error trace, but with 0.2.8:


KeyError Traceback (most recent call last)
Cell In[6], line 7
1 metrics = [
2 LLMContextRecall(llm=evaluator_llm),
3 FactualCorrectness(llm=evaluator_llm),
4 Faithfulness(llm=evaluator_llm),
5 SemanticSimilarity(embeddings=embeddings)
6 ]
----> 7 results = evaluate(dataset=eval_dataset, metrics=metrics)

File ~\AppData\Local\anaconda31\envs\notebook_root\Lib\site-packages\ragas\_analytics.py:205, in track_was_completed.<locals>.wrapper(*args, **kwargs)
202 @wraps(func)
203 def wrapper(*args: P.args, **kwargs: P.kwargs) -> t.Any:
204 track(IsCompleteEvent(event_type=func.__name__, is_completed=False))
--> 205 result = func(*args, **kwargs)
206 track(IsCompleteEvent(event_type=func.__name__, is_completed=True))
208 return result

File ~\AppData\Local\anaconda31\envs\notebook_root\Lib\site-packages\ragas\evaluation.py:333, in evaluate(dataset, metrics, llm, embeddings, callbacks, in_ci, run_config, token_usage_parser, raise_exceptions, column_map, show_progress, batch_size, _run_id, _pbar)
329 else:
330 # evalution run was successful
331 # now lets process the results
332 cost_cb = ragas_callbacks["cost_cb"] if "cost_cb" in ragas_callbacks else None
--> 333 result = EvaluationResult(
334 scores=scores,
335 dataset=dataset,
336 binary_columns=binary_metrics,
337 cost_cb=t.cast(
338 t.Union["CostCallbackHandler", None],
339 cost_cb,
340 ),
341 ragas_traces=tracer.traces,
342 run_id=_run_id,
343 )
344 if not evaluation_group_cm.ended:
345 evaluation_rm.on_chain_end({"scores": result.scores})

File <string>:10, in __init__(self, scores, dataset, binary_columns, cost_cb, traces, ragas_traces, run_id)

File ~\AppData\Local\anaconda31\envs\notebook_root\Lib\site-packages\ragas\dataset_schema.py:410, in EvaluationResult.__post_init__(self)
408 # parse the traces
409 run_id = str(self.run_id) if self.run_id is not None else None
--> 410 self.traces = parse_run_traces(self.ragas_traces, run_id)

File ~\AppData\Local\anaconda31\envs\notebook_root\Lib\site-packages\ragas\callbacks.py:167, in parse_run_traces(traces, parent_run_id)
163 for i, prompt_uuid in enumerate(metric_trace.children):
164 prompt_trace = traces[prompt_uuid]
165 prompt_traces[f"{prompt_trace.name}"] = {
166 "input": prompt_trace.inputs.get("data", {}),
--> 167 "output": prompt_trace.outputs.get("output", {})[0],
168 }
169 metric_traces[f"{metric_trace.name}"] = prompt_traces
170 parased_traces.append(metric_traces)

KeyError: 0
Expected behavior

It should give me something like this:
[screenshot of the expected evaluation output]


waadmhj added the bug label Dec 17, 2024

waadmhj commented Dec 18, 2024

Update

After changing the trace-parsing code in ragas/callbacks.py from:
File ~\AppData\Local\anaconda31\envs\notebook_root\Lib\site-packages\ragas\callbacks.py:167, in parse_run_traces(traces, parent_run_id)
163 for i, prompt_uuid in enumerate(metric_trace.children):
164 prompt_trace = traces[prompt_uuid]
165 prompt_traces[f"{prompt_trace.name}"] = {
166 "input": prompt_trace.inputs.get("data", {}),
--> 167 "output": prompt_trace.outputs.get("output", {})[0],
168 }
169 metric_traces[f"{metric_trace.name}"] = prompt_traces
170 parased_traces.append(metric_traces)

to this:

for i, prompt_uuid in enumerate(metric_trace.children):
    prompt_trace = traces[prompt_uuid]
    output = prompt_trace.outputs.get("output", {})
    output = output[0] if isinstance(output, list) else output
    prompt_traces[f"{prompt_trace.name}"] = {
        "input": prompt_trace.inputs.get("data", {}),
        "output": output,
    }
metric_traces[f"{metric_trace.name}"] = prompt_traces
parased_traces.append(metric_traces)

the KeyError traceback stopped appearing, but the exception is still there, and when I print the results this is what I get:
{'context_recall': nan, 'factual_correctness': nan, 'faithfulness': nan, 'semantic_similarity': nan}
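
A debugging note: the evaluate() signature in the traceback above includes a raise_exceptions flag, so re-running with it enabled should surface the underlying AttributeError instead of silently recording NaN scores. A minimal sketch:

# Sketch: make evaluate() fail loudly instead of recording NaN scores.
# raise_exceptions appears in the evaluate() signature shown in the traceback.
results = evaluate(dataset=eval_dataset, metrics=metrics, raise_exceptions=True)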

Nom1rako commented Dec 25, 2024

Hi, I had the same problem before, when I used LlamaIndexLLMWrapper. It is solved now: I found that it runs without trouble when using LangchainLLMWrapper in the evaluation. But make sure your LLM is wrapped in LangChain style (see the sketch after my console output below). My code is as follows:

Code with the issue:

# ---------------------EVAL---------------------------
llm = OnlineOpenai(url=url_critic, headers=headers, model=model)
# import metrics
from ragas.metrics import (
    Faithfulness,
    AnswerRelevancy,
    ContextPrecision,
    ContextRecall,
)

# init metrics with evaluator LLM
from ragas.llms import LlamaIndexLLMWrapper

evaluator_llm = LlamaIndexLLMWrapper(llm)
metrics = [
    Faithfulness(llm=evaluator_llm),
    AnswerRelevancy(llm=evaluator_llm),
    ContextPrecision(llm=evaluator_llm),
    ContextRecall(llm=evaluator_llm),
]
# convert pandas DataFrame to Ragas Evaluation Dataset
from ast import literal_eval
from ragas import EvaluationDataset

# Ensure values in the 'reference_contexts' column are lists, not strings
if df['reference_contexts'].dtype == object:
    try:
        df['reference_contexts'] = df['reference_contexts'].apply(literal_eval)
    except ValueError as e:
        print(f"Error parsing 'reference_contexts': {e}")
ragas_ds = EvaluationDataset.from_pandas(df)
print(ragas_ds) # EvaluationDataset(features=['user_input', 'reference_contexts', 'reference'], len=12)

from ragas.integrations.llama_index import evaluate

result = evaluate(
    query_engine=query_engine,
    metrics=metrics,
    dataset=ragas_ds,
    embeddings=hf_embedding,
)
# final scores
print(result)

res_df = result.to_pandas()
res_df.to_csv("llamaindex_eval_results.csv", index=False)
======================console==============================
Exception raised in Job[37]: AttributeError('OnlineOpenai' object has no attribute 'acomplete')
Evaluating:  92%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎           | 44/48 [06:26<00:22,  5.68s/it]Exception raised in Job[42]: AttributeError('OnlineOpenai' object has no attribute 'acomplete')
Evaluating:  94%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏        | 45/48 [06:26<00:12,  4.05s/it]Exception raised in Job[36]: TimeoutError()
Evaluating:  96%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏     | 46/48 [06:49<00:19,  9.59s/it]Exception raised in Job[45]: AttributeError('OnlineOpenai' object has no attribute 'acomplete')
Evaluating:  98%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████   | 47/48 [06:52<00:07,  7.90s/it]Exception raised in Job[44]: AttributeError('OnlineOpenai' object has no attribute 'acomplete')
Evaluating: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 48/48 [07:29<00:00,  9.37s/it]
{'faithfulness': nan, 'answer_relevancy': nan, 'context_precision': nan, 'context_recall': nan}

After changing to LangchainLLMWrapper(llm):

……
# init metrics with evaluator LLM
from ragas.llms import LangchainLLMWrapper

evaluator_llm = LangchainLLMWrapper(llm)
metrics = [
    Faithfulness(llm=evaluator_llm),
    AnswerRelevancy(llm=evaluator_llm),
    ContextPrecision(llm=evaluator_llm),
    ContextRecall(llm=evaluator_llm),
]
……
========================console===============================
(llm_testzx) (base) [root@localhost eval_RAG]# python3 RAGAS_llamaIndex_eval.py 
/home/conda/envs/llm_testzx/lib/python3.10/site-packages/pydantic/_internal/_fields.py:172: UserWarning: Field name "stream" in "OnlineOpenai" shadows an attribute in parent "LLM"
  warnings.warn(
   Unnamed: 0                                         user_input  ...                                          reference                      synthesizer_name
0           0  What is the main contribution of Touvron et al...  ...  Touvron et al. (2023a) highlight the importanc...  single_hop_specifc_query_synthesizer
1           1  What are the key architectural features of the...  ...  The Llama models are based on the decoder-only...  single_hop_specifc_query_synthesizer
2           2     Who is Tu in the research paper about Minicpm?  ...  Tu, Y. is one of the authors of the research p...  single_hop_specifc_query_synthesizer
3           3                   who is naren in xformers librar?  ...  Naren, S. is one of the contributors to the xf...  single_hop_specifc_query_synthesizer
4           4  How does the architecture of TinyLlama, an inf...  ...  TinyLlama, an inference-optimal language model...  multi_hop_abstract_query_synthesizer

[5 rows x 5 columns]
EvaluationDataset(features=['user_input', 'reference_contexts', 'reference'], len=12)
Running Query Engine:   0%|                                                                                                                                            | 0/12 [00:00<?, ?it/s]/home/conda/envs/llm_testzx/lib/python3.10/site-packages/llama_index/llms/langchain/base.py:106: LangChainDeprecationWarning: The method `BaseLLM.predict` was deprecated in langchain-core 0.1.7 and will be removed in 1.0. Use :meth:`~invoke` instead.
  output_str = self._llm.predict(prompt, **kwargs)
Running Query Engine: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [02:50<00:00, 14.19s/it]
Evaluating:  85%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                    | 41/48 [02:56<01:54, 16.29s/it]Exception raised in Job[4]: TimeoutError()
Evaluating:  92%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎           | 44/48 [03:07<00:32,  8.06s/it]Exception raised in Job[16]: TimeoutError()
Evaluating:  96%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏     | 46/48 [03:17<00:12,  6.32s/it]Exception raised in Job[20]: TimeoutError()
Evaluating: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 48/48 [03:45<00:00,  4.70s/it]
{'faithfulness': 0.8800, 'answer_relevancy': 0.9157, 'context_precision': 0.9583, 'context_recall': 0.9556}
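
To illustrate what "wrapped in LangChain style" can look like, here is a minimal sketch assuming an OpenAI-compatible endpoint and the langchain-openai package; the endpoint URL, model name, and key are placeholders, not values from this thread:

# Hedged sketch: wrap an OpenAI-compatible chat endpoint in LangChain style
# before handing it to ragas. All endpoint/model values are placeholders.
from langchain_openai import ChatOpenAI
from ragas.llms import LangchainLLMWrapper

llm = ChatOpenAI(
    model="my-model",                     # placeholder model name
    base_url="http://localhost:8000/v1",  # placeholder OpenAI-compatible endpoint
    api_key="EMPTY",                      # placeholder key
)
evaluator_llm = LangchainLLMWrapper(llm)  # provides the async interface ragas calls

For the remaining TimeoutError entries, the evaluate() signature in the first traceback shows a run_config parameter; passing ragas.run_config.RunConfig(timeout=...) with a larger timeout may help, assuming the llama_index integration forwards it.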
