Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cycle detection removes same edge multiple times #8657

Open
1 task done
Willenbrink opened this issue Dec 18, 2024 · 0 comments
Open
1 task done

Cycle detection removes same edge multiple times #8657

Willenbrink opened this issue Dec 18, 2024 · 0 comments
Labels
P1 High priority, add to the next sprint

Comments

@Willenbrink
Copy link

Describe the bug
_break_supported_cycles_in_graph might remove same edge multiple times.

Error message
'NoneType' object has no attribute 'keys' in https://github.com/deepset-ai/haystack/blob/ea3602643aa52c27f3bea7bf5bc90b97f568dcdc/haystack/core/pipeline/base.py#L1218C1-L1218C94

Expected behavior
If one edge is part of two cycles, I expect the algorithm to only break the edge once. After checking the second cycle, it shouldn't attempt to break the edge.

Additional context
I believe there are two other bugs occurring in my project, possibly related:

  • A cycle is deleted and not executed at all. The pipeline terminates too early as a result. I haven't been able to determine whether this really is a bug or what the cause is.
  • pipeline.draw() does not show user-provided value to variadic input #8656 This maybe messes with the topological sort of the graph though I'm not sure if that would affect the cycle detection.

As the cycle handling seems quite complicated, I'm wondering why Haystack even does that. Why is the pipeline not based on a queue of components that have all their inputs, executing them one at a time and adding their connected components once they've got their inputs. Something like:

for component in ready_comps:
  component.run()
  for connected_comp in component.connected_components:
    if connected_comp.has_all_inputs():
      ready_comps.append(connected_comp)

To Reproduce
Probably:

  • Have one edge as part of two cycles.
  • Run the pipeline
  • Observe the edge being removed first
  • Observe .get_edge_data failing for the second cycle as the edge no longer exists.

I can reliably reproduce the issue in my project.

FAQ Check

System:

  • OS: Linux
  • Haystack version (commit or version number): 2.7.0
@julian-risch julian-risch added the P1 High priority, add to the next sprint label Dec 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 High priority, add to the next sprint
Projects
None yet
Development

No branches or pull requests

2 participants