AssertionError: n_contexts_per_doc must be a positive integer. #1199

Open
tenchanhe opened this issue Nov 29, 2024 · 0 comments

Discord
This question has already been asked on Discord. Thank you.

Describe the bug
I am trying to use Synthesizer to create a dataset, but I run into the following error.

/tmp2/chten/miniforge3/envs/delta/lib/python3.12/site-packages/deepeval/__init__.py:53: UserWarning: You are using deepeval version 1.5.9, however version 2.0 is available. You should consider upgrading via the "pip install --upgrade deepeval" command.
  warnings.warn(
✨ 🚀 ✨ Loading Documents: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 11.70it/s]
✨ 🧩 ✨ Generating Contexts: 0it [00:00, ?it/s]Traceback (most recent call last):
  File "/tmp2/chten/delta/DeepEval/synthesizer/test.py", line 19, in <module>
    synthesizer.generate_goldens_from_docs(
  File "/tmp2/chten/miniforge3/envs/delta/lib/python3.12/site-packages/deepeval/synthesizer/synthesizer.py", line 122, in generate_goldens_from_docs
    goldens = loop.run_until_complete(
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp2/chten/miniforge3/envs/delta/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/tmp2/chten/miniforge3/envs/delta/lib/python3.12/site-packages/deepeval/synthesizer/synthesizer.py", line 202, in a_generate_goldens_from_docs
    await self.context_generator.a_generate_contexts(
  File "/tmp2/chten/miniforge3/envs/delta/lib/python3.12/site-packages/deepeval/synthesizer/chunking/context_generator.py", line 194, in a_generate_contexts
    results = await asyncio.gather(*tasks)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp2/chten/miniforge3/envs/delta/lib/python3.12/site-packages/deepeval/synthesizer/chunking/context_generator.py", line 217, in _a_process_document_async
    await self._a_get_n_random_contexts_per_doc(
  File "/tmp2/chten/miniforge3/envs/delta/lib/python3.12/site-packages/deepeval/synthesizer/chunking/context_generator.py", line 295, in _a_get_n_random_contexts_per_doc
    n_contexts_per_doc > 0
AssertionError: n_contexts_per_doc must be a positive integer.
✨ 🧩 ✨ Generating Contexts: 0it [00:01, ?it/s]

To Reproduce
This is the code I ran; leed20.docx is a document of about 51 pages.
I have already tried setting the chunk size to 500, 200, 64, and 15, but every chunk size I tried produces the same error.
The Chroma version is v0.5.3.

from deepeval.synthesizer import Synthesizer
from deepeval.dataset import EvaluationDataset
from deepeval.synthesizer.config import ContextConstructionConfig

dataset = EvaluationDataset()
synthesizer = Synthesizer()
context_config = ContextConstructionConfig(
    # embedder=None,
    # critic_model=None,
    max_contexts_per_document=1,
    chunk_size=1000,
    chunk_overlap=0,
    context_quality_threshold=0.5,
    context_similarity_threshold=0.5,
    max_retries=1,
)

synthesizer.generate_goldens_from_docs(
    document_paths=['leed20.docx'],
    include_expected_output=True,
    max_goldens_per_context=1,
    context_construction_config=context_config
)
print(synthesizer.synthetic_goldens)
dataframe = synthesizer.to_pandas()
print(dataframe)
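
To help narrow this down, here is a minimal diagnostic sketch that does not use deepeval at all. It reads the text out of the .docx with the standard library and estimates how many chunks a given chunk size would produce. It rests on assumptions I have not confirmed against the deepeval source: that n_contexts_per_doc is derived from something like min(max_contexts_per_document, number_of_chunks), and that whitespace-separated words are a usable stand-in for whatever unit chunk_size really counts. The helper names docx_text, estimate_chunks, and report are hypothetical and exist only for this sketch.

```python
import math
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_text(path: str) -> str:
    """Return the plain text of a .docx file, one paragraph per line."""
    with zipfile.ZipFile(path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{W_NS}p"):
        # Concatenate every text run (<w:t>) inside the paragraph.
        paragraphs.append("".join(t.text or "" for t in para.iter(f"{W_NS}t")))
    return "\n".join(paragraphs)


def estimate_chunks(n_units: int, chunk_size: int, chunk_overlap: int = 0) -> int:
    """Estimate how many sliding-window chunks are needed to cover n_units."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    if n_units <= 0:
        return 0
    step = chunk_size - chunk_overlap
    return max(1, math.ceil((n_units - chunk_overlap) / step))


def report(path: str, chunk_size: int, chunk_overlap: int = 0,
           max_contexts: int = 1) -> dict:
    """Summarise the extracted text and the estimated chunk count."""
    words = len(docx_text(path).split())
    chunks = estimate_chunks(words, chunk_size, chunk_overlap)
    return {
        "words": words,
        "estimated_chunks": chunks,
        # ASSUMPTION: the library may cap contexts by the number of chunks;
        # if so, zero chunks would give a non-positive n_contexts_per_doc.
        "assumed_n_contexts": min(max_contexts, chunks),
    }

# Example call (requires the file from this report to be present):
# print(report("leed20.docx", chunk_size=1000))
```

If this reports zero words or zero estimated chunks, the problem is probably in document loading or chunking rather than in the chunk size itself. If it reports plenty of words, the assumption above is likely wrong and the cause is elsewhere.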

Desktop:

  • Ubuntu 22.04.5 LTS (GNU/Linux 5.15.0-126-generic x86_64)

LLM settings:

  • deepeval set-local-model --model-name="llama3.2:3b" --base-url="http://localhost:11434/" --api-key="ollama"
  • deepeval set-local-embeddings --model-name="nomic-embed-text:latest" --base-url="http://localhost:11434/" --api-key="ollama"