
Exception raised in Job[1]: AttributeError('str' object has no attribute 'acomplete') #1768

Open
waadmhj opened this issue Dec 17, 2024 · 2 comments
Labels: bug Something isn't working
waadmhj commented Dec 17, 2024

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug
I tried downgrading to ragas 0.2.7, and tried to use llama3.2 for the LLM and the embeddings, but it keeps giving me these errors:
Exception raised in Job[51]: AttributeError('str' object has no attribute 'aembed_documents')
Exception raised in Job[25]: AttributeError('str' object has no attribute 'acomplete')

Ragas version: both 0.2.8 and 0.2.7
Python version: 3.11.10

Code to Reproduce
from datasets import load_from_disk
dataset = load_from_disk("data")
from ragas import EvaluationDataset

eval_dataset = EvaluationDataset.from_hf_dataset(dataset["eval"])

from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness, SemanticSimilarity
from ragas import evaluate

from ragas.llms import LlamaIndexLLMWrapper
evaluator_llm = LlamaIndexLLMWrapper("0ssamaak0/silma-v1")

from ragas.embeddings import LangchainEmbeddingsWrapper
embeddings = LangchainEmbeddingsWrapper("silma-ai/silma-embedding-matryoshka-v0.1")

metrics = [
    LLMContextRecall(llm=evaluator_llm),
    FactualCorrectness(llm=evaluator_llm),
    Faithfulness(llm=evaluator_llm),
    SemanticSimilarity(embeddings=embeddings),
]
results = evaluate(dataset=eval_dataset, metrics=metrics)
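
Note: the 'str' object errors suggest both wrappers are being handed plain model-name strings, while they appear to expect instantiated model objects. A minimal sketch of that usage follows; the Ollama and HuggingFaceEmbeddings classes and the model names are assumptions for illustration, not something confirmed in this thread:

# Hedged sketch: pass model *objects*, not model-name strings, to the wrappers.
# Ollama (assumes a local Ollama server) and HuggingFaceEmbeddings are assumptions.
from llama_index.llms.ollama import Ollama
from langchain_huggingface import HuggingFaceEmbeddings
from ragas.llms import LlamaIndexLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

evaluator_llm = LlamaIndexLLMWrapper(Ollama(model="llama3.2"))  # object, not str
embeddings = LangchainEmbeddingsWrapper(
    HuggingFaceEmbeddings(model_name="silma-ai/silma-embedding-matryoshka-v0.1")
)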

Error trace
With ragas 0.2.7 it didn't give me an error trace, but with 0.2.8:


KeyError Traceback (most recent call last)
Cell In[6], line 7
1 metrics = [
2 LLMContextRecall(llm=evaluator_llm),
3 FactualCorrectness(llm=evaluator_llm),
4 Faithfulness(llm=evaluator_llm),
5 SemanticSimilarity(embeddings=embeddings)
6 ]
----> 7 results = evaluate(dataset=eval_dataset, metrics=metrics)

File ~\AppData\Local\anaconda31\envs\notebook_root\Lib\site-packages\ragas\_analytics.py:205, in track_was_completed.<locals>.wrapper(*args, **kwargs)
202 @wraps(func)
203 def wrapper(*args: P.args, **kwargs: P.kwargs) -> t.Any:
204 track(IsCompleteEvent(event_type=func.__name__, is_completed=False))
--> 205 result = func(*args, **kwargs)
206 track(IsCompleteEvent(event_type=func.__name__, is_completed=True))
208 return result

File ~\AppData\Local\anaconda31\envs\notebook_root\Lib\site-packages\ragas\evaluation.py:333, in evaluate(dataset, metrics, llm, embeddings, callbacks, in_ci, run_config, token_usage_parser, raise_exceptions, column_map, show_progress, batch_size, _run_id, _pbar)
329 else:
330 # evalution run was successful
331 # now lets process the results
332 cost_cb = ragas_callbacks["cost_cb"] if "cost_cb" in ragas_callbacks else None
--> 333 result = EvaluationResult(
334 scores=scores,
335 dataset=dataset,
336 binary_columns=binary_metrics,
337 cost_cb=t.cast(
338 t.Union["CostCallbackHandler", None],
339 cost_cb,
340 ),
341 ragas_traces=tracer.traces,
342 run_id=_run_id,
343 )
344 if not evaluation_group_cm.ended:
345 evaluation_rm.on_chain_end({"scores": result.scores})

File <string>:10, in __init__(self, scores, dataset, binary_columns, cost_cb, traces, ragas_traces, run_id)

File ~\AppData\Local\anaconda31\envs\notebook_root\Lib\site-packages\ragas\dataset_schema.py:410, in EvaluationResult.__post_init__(self)
408 # parse the traces
409 run_id = str(self.run_id) if self.run_id is not None else None
--> 410 self.traces = parse_run_traces(self.ragas_traces, run_id)

File ~\AppData\Local\anaconda31\envs\notebook_root\Lib\site-packages\ragas\callbacks.py:167, in parse_run_traces(traces, parent_run_id)
163 for i, prompt_uuid in enumerate(metric_trace.children):
164 prompt_trace = traces[prompt_uuid]
165 prompt_traces[f"{prompt_trace.name}"] = {
166 "input": prompt_trace.inputs.get("data", {}),
--> 167 "output": prompt_trace.outputs.get("output", {})[0],
168 }
169 metric_traces[f"{metric_trace.name}"] = prompt_traces
170 parased_traces.append(metric_traces)

KeyError: 0
Expected behavior

It should give me something like this:
[screenshot of the expected evaluation output]


waadmhj added the bug label Dec 17, 2024

waadmhj commented Dec 18, 2024

Update

After changing the trace-parsing code in ragas/callbacks.py from:
File ~\AppData\Local\anaconda31\envs\notebook_root\Lib\site-packages\ragas\callbacks.py:167, in parse_run_traces(traces, parent_run_id)
163 for i, prompt_uuid in enumerate(metric_trace.children):
164 prompt_trace = traces[prompt_uuid]
165 prompt_traces[f"{prompt_trace.name}"] = {
166 "input": prompt_trace.inputs.get("data", {}),
--> 167 "output": prompt_trace.outputs.get("output", {})[0],
168 }
169 metric_traces[f"{metric_trace.name}"] = prompt_traces
170 parased_traces.append(metric_traces)

to this:

for i, prompt_uuid in enumerate(metric_trace.children):
    prompt_trace = traces[prompt_uuid]
    output = prompt_trace.outputs.get("output", {})
    output = output[0] if isinstance(output, list) else output
    prompt_traces[f"{prompt_trace.name}"] = {
        "input": prompt_trace.inputs.get("data", {}),
        "output": output,
    }
metric_traces[f"{metric_trace.name}"] = prompt_traces
parased_traces.append(metric_traces)

the KeyError traceback stopped appearing, but the exception is still there, and when I print the results this is what I get:
{'context_recall': nan, 'factual_correctness': nan, 'faithfulness': nan, 'semantic_similarity': nan}
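
A debugging note: the evaluate() signature in the traceback above includes a raise_exceptions flag, so re-running with it enabled should surface the underlying AttributeError instead of silently recording NaN scores. A minimal sketch:

# Sketch: make evaluate() fail loudly instead of recording NaN scores.
# raise_exceptions appears in the evaluate() signature shown in the traceback.
results = evaluate(dataset=eval_dataset, metrics=metrics, raise_exceptions=True)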

Nom1rako commented Dec 25, 2024

Hi, I had the same problem before, when I used LlamaIndexLLMWrapper. It is solved now: I found that it runs without trouble when using LangchainLLMWrapper in the evaluation. But make sure your LLM is wrapped in LangChain style (see the sketch after my console output below). My code is as follows:

Code with the issue:

# ---------------------EVAL---------------------------
llm = OnlineOpenai(url=url_critic, headers=headers, model=model)
# import metrics
from ragas.metrics import (
    Faithfulness,
    AnswerRelevancy,
    ContextPrecision,
    ContextRecall,
)

# init metrics with evaluator LLM
from ragas.llms import LlamaIndexLLMWrapper

evaluator_llm = LlamaIndexLLMWrapper(llm)
metrics = [
    Faithfulness(llm=evaluator_llm),
    AnswerRelevancy(llm=evaluator_llm),
    ContextPrecision(llm=evaluator_llm),
    ContextRecall(llm=evaluator_llm),
]
# convert pandas DataFrame to Ragas Evaluation Dataset
from ast import literal_eval
from ragas import EvaluationDataset

# Ensure values in the 'reference_contexts' column are lists, not strings
if df['reference_contexts'].dtype == object:
    try:
        df['reference_contexts'] = df['reference_contexts'].apply(literal_eval)
    except ValueError as e:
        print(f"Error parsing 'reference_contexts': {e}")
ragas_ds = EvaluationDataset.from_pandas(df)
print(ragas_ds) # EvaluationDataset(features=['user_input', 'reference_contexts', 'reference'], len=12)

from ragas.integrations.llama_index import evaluate

result = evaluate(
    query_engine=query_engine,
    metrics=metrics,
    dataset=ragas_ds,
    embeddings=hf_embedding,
)
# final scores
print(result)

res_df = result.to_pandas()
res_df.to_csv("llamaindex_eval_results.csv", index=False)
======================console==============================
Exception raised in Job[37]: AttributeError('OnlineOpenai' object has no attribute 'acomplete')
Evaluating:  92%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎           | 44/48 [06:26<00:22,  5.68s/it]Exception raised in Job[42]: AttributeError('OnlineOpenai' object has no attribute 'acomplete')
Evaluating:  94%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏        | 45/48 [06:26<00:12,  4.05s/it]Exception raised in Job[36]: TimeoutError()
Evaluating:  96%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏     | 46/48 [06:49<00:19,  9.59s/it]Exception raised in Job[45]: AttributeError('OnlineOpenai' object has no attribute 'acomplete')
Evaluating:  98%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████   | 47/48 [06:52<00:07,  7.90s/it]Exception raised in Job[44]: AttributeError('OnlineOpenai' object has no attribute 'acomplete')
Evaluating: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 48/48 [07:29<00:00,  9.37s/it]
{'faithfulness': nan, 'answer_relevancy': nan, 'context_precision': nan, 'context_recall': nan}

After changing to LangchainLLMWrapper(llm):

……
# init metrics with evaluator LLM
from ragas.llms import LangchainLLMWrapper

evaluator_llm = LangchainLLMWrapper(llm)
metrics = [
    Faithfulness(llm=evaluator_llm),
    AnswerRelevancy(llm=evaluator_llm),
    ContextPrecision(llm=evaluator_llm),
    ContextRecall(llm=evaluator_llm),
]
……
========================console===============================
(llm_testzx) (base) [root@localhost eval_RAG]# python3 RAGAS_llamaIndex_eval.py 
/home/conda/envs/llm_testzx/lib/python3.10/site-packages/pydantic/_internal/_fields.py:172: UserWarning: Field name "stream" in "OnlineOpenai" shadows an attribute in parent "LLM"
  warnings.warn(
   Unnamed: 0                                         user_input  ...                                          reference                      synthesizer_name
0           0  What is the main contribution of Touvron et al...  ...  Touvron et al. (2023a) highlight the importanc...  single_hop_specifc_query_synthesizer
1           1  What are the key architectural features of the...  ...  The Llama models are based on the decoder-only...  single_hop_specifc_query_synthesizer
2           2     Who is Tu in the research paper about Minicpm?  ...  Tu, Y. is one of the authors of the research p...  single_hop_specifc_query_synthesizer
3           3                   who is naren in xformers librar?  ...  Naren, S. is one of the contributors to the xf...  single_hop_specifc_query_synthesizer
4           4  How does the architecture of TinyLlama, an inf...  ...  TinyLlama, an inference-optimal language model...  multi_hop_abstract_query_synthesizer

[5 rows x 5 columns]
EvaluationDataset(features=['user_input', 'reference_contexts', 'reference'], len=12)
Running Query Engine:   0%|                                                                                                                                            | 0/12 [00:00<?, ?it/s]/home/conda/envs/llm_testzx/lib/python3.10/site-packages/llama_index/llms/langchain/base.py:106: LangChainDeprecationWarning: The method `BaseLLM.predict` was deprecated in langchain-core 0.1.7 and will be removed in 1.0. Use :meth:`~invoke` instead.
  output_str = self._llm.predict(prompt, **kwargs)
Running Query Engine: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [02:50<00:00, 14.19s/it]
Evaluating:  85%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                    | 41/48 [02:56<01:54, 16.29s/it]Exception raised in Job[4]: TimeoutError()
Evaluating:  92%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎           | 44/48 [03:07<00:32,  8.06s/it]Exception raised in Job[16]: TimeoutError()
Evaluating:  96%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏     | 46/48 [03:17<00:12,  6.32s/it]Exception raised in Job[20]: TimeoutError()
Evaluating: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 48/48 [03:45<00:00,  4.70s/it]
{'faithfulness': 0.8800, 'answer_relevancy': 0.9157, 'context_precision': 0.9583, 'context_recall': 0.9556}
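
To illustrate what "wrapped in LangChain style" can look like, here is a minimal sketch assuming an OpenAI-compatible endpoint and the langchain-openai package; the endpoint URL, model name, and key are placeholders, not values from this thread:

# Hedged sketch: wrap an OpenAI-compatible chat endpoint in LangChain style
# before handing it to ragas. All endpoint/model values are placeholders.
from langchain_openai import ChatOpenAI
from ragas.llms import LangchainLLMWrapper

llm = ChatOpenAI(
    model="my-model",                     # placeholder model name
    base_url="http://localhost:8000/v1",  # placeholder OpenAI-compatible endpoint
    api_key="EMPTY",                      # placeholder key
)
evaluator_llm = LangchainLLMWrapper(llm)  # provides the async interface ragas calls

For the remaining TimeoutError entries, the evaluate() signature in the first traceback shows a run_config parameter; passing ragas.run_config.RunConfig(timeout=...) with a larger timeout may help, assuming the llama_index integration forwards it.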
