Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple traversal modes for UX and efficiency #518

Open
rvagg opened this issue May 5, 2023 · 0 comments
Open

Simple traversal modes for UX and efficiency #518

rvagg opened this issue May 5, 2023 · 0 comments

Comments

@rvagg
Copy link
Member

rvagg commented May 5, 2023

A long-standing discussion is regarding the over-complexity of the traversal and selector paradigm for common cases. There are also serious efficiency concerns with complex or very deep DAGs.

Properties of common cases where complexity gets in the way:

  • Walk this DAG exhaustively, give me all the blocks in it (i.e. go-merkledag/Walk() replacement). Very common, but complex to set up with our current structure, complex to run, you have to think about LinksVisitOnlyOnce, what selector is right, etc. And yet it's also inefficient due to the accumulation of internal state.
  • I just care about what blocks are in this DAG according to the selector I provide - the most common kind of visitor function is return nil (I'm not sure I can point to a real-world, production use traversal with a visitor that isn't this tbh).
  • Run some non-trivial selector up to some point and then do an exhaustive walk (this is the Lassie / "verified CAR" case - run a UnixFS path, then give me everything under that path).
  • Selectors that are simple enough to not require much, or any "state", yet we accumulate and account, wasting resources.

IPNI is an example that has come up recently, synchronising advertisement chains that are strictly simple linked lists. We have efficiency problems. Primarily the accumulation of PathSegment (allocated on each new node) and yet the selector is either an explore-all or explore-all with recursion limit and the visitor function is a noop. If we go fishing we'd probably find other state that's wasteful in this mode. Selector interests/attention are very simple for these kinds of selectors and visitor functions being a noop mean that properties we've carefully curated in Progress are mostly useless.


Some things that I've been considering:

  • UX: there's a generalised puzzle with traversals and LinkSystems when the user is approaching the APIs thinking (and only caring) about blocks and not nodes. The entire API is node focused, but so many (likely the majority) of use-cases are block focused.
  • UX: a simple go-merkledag Walk() replacement that's block focused, explore-all, no node visitor, LinksVisitOnlyOnce. I'm not sure what this looks like given the LinkSystem model, maybe there's still some wiring a user needs to provide. Alternatively maybe it just takes a callback on block load. Making the case for doing away with go-merkledag is a bit harder without a very simple replacement for this one utility.
  • go-merkledag has a concurrent Walk() that doesn't care about order. We could use the preloader to provide something that's still nicely ordered but also affords some preloading powers. Or, we could throw order out the window and munge preloader and loader ordering into one bucket. Interesting issues to explore here.
  • Selector self-awarness: you should be able to ask a selector some key questions. Mainly I've been thinking "are you exhaustive?" which we can use to switch modes—if a selector is exhaustive then we can avoid some state maintenance. If you run a selector and at some point it flips to being exhaustive then maybe you could switch to an entirely different mode of running it from there down. Re the IPNI issue, maybe you could ask "do you care about PathSegment?" (probably some generic form of this question about the kind of state the selector cares about). The nice thing about selectors as DAGs is that their subtrees can have stable properties - if you can know a selector is exhaustive then it's not going to become non-exhaustive from that point recursively down.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🥞 Todo
Development

No branches or pull requests

1 participant