Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP illumos port #12

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from
Draft

WIP illumos port #12

wants to merge 5 commits into from

Conversation

djdv
Copy link

@djdv djdv commented Oct 19, 2020

I needed this to bring ipfs from 0.5.0-dev-fc8307fe6 up to the current master.
This patch constructs (and shares) a BSD route socket (https://illumos.org/man/7P/route). Relaying requests/responses between go-netroute and the routing system.

I've been running this for a few days now without any issues but there's still things to do
Specifically:

  • there's no reason this should be constrained to amd64 (but I don't have any SPARC machines to test on)
    • need to move LittleEndian references to a nativeEndian pattern
  • need to figure out what's expected of RouteWithSrc and how this translates into the route protocol
  • I need to figure out what to parse from the response to set all the proper interface flags (loopback, broadcast, etc.)
    • it detects ones I expect for the interface I'm testing but the process may not be correct (see decodeGetMessage and the note above parseIP)
  • need to implement a way to ensure the socket instance gets closed when all instances returned from New fall out of scope; right now it's a singleton in the pkg scope that stays open after the first call to New
  • remove test cmd and other debugging lint (lol print statments)
  • needs a real commit message
  • areas where reflect and unsafe are used need to be double checked
  • some areas still need to use unsafe/reflect to avoid allocations and/or simplify code that's operating on specified memory layouts
  • and maybe other things

If anyone ends up looking this over now though I'd appreciate input on it.
I can run whatever tests and explain anything ambiguous if need be.

Otherwise I'm just going to push updates as they happen and flag someone for review eventually (when I'm more confident everything is working as expected).

Edit:
Footage of it alive for what it's worth: https://www.youtube.com/watch?v=Ye18nFYPyFc

@willscott
Copy link
Contributor

cool!

The other implementations have been able to limit complexity by loading the full routing table in the New calls so they don't need to maintain a connection for querying routes on each Route call. That means we haven't had to think about needing to Close the object so far.

I need to think a little bit if the semantics here are going to trip people up downstream: other implementations expect the object returned from New works like a static snapshot, and they need to make a new one to learn if the routing table has changed. An object in this implementation will give the current routes on each request.
In practice, that's probably fine. I think it's probably setting the expectation that 'Routes returned MAY be cached from as early as the creation of the object'.

For routeWithSource you could try and look up gateway for the interface / source IP and then do the route process pinned to that gateway when found. The semantics of that are 'best effort', so your current approach of ignoring it completely is also valid.

@djdv
Copy link
Author

djdv commented Oct 20, 2020

Thanks for the quick response!

other implementations expect the object returned from New works like a static snapshot, and they need to make a new one to learn if the routing table has changed.

Ahh.
Looking at the netroute_bsd.go implementation it seems like I might be able to mimic that pattern with the lif pkg.
This should allow me to take advantage of the things already provided by common.go
Converting the lif Addr and Link structures into something to fit in the common router struct (referencing this:

rtr := &router{}
rtr.ifaces = make(map[int]net.Interface)
rtr.addrs = make(map[int]ipAddrs)

)

I'm going to look into this more later since I imagine that will provide the most consistent behavior for the expectations of this package.

For routeWithSource you could try and look up gateway for the interface / source IP and then do the route process pinned to that gateway when found. The semantics of that are 'best effort', so your current approach of ignoring it completely is also valid.

Neat. That makes sense.
I can imagine what the implementation would look like, so if I decide to stick with route I'll write something up.

That means we haven't had to think about needing to Close the object so far.

The runtime should be able to set a finalizer that handles this implicitly, so that the socket is closed and the message loop is canceled when all reference to the pkg router fall out of scope, but I didn't implement that correctly here/yet.
(I got tripped up because a pointer to the pkg variable never "finalizes" since it never falls out of scope. There's easy ways around this but they seem kind of gross. Either storing the reference count in the pkg scope too or embedding the pkg variable in a unique struct created inside of / returned from New.)

Another alternative is to give each instance returned from New its own socket and message loop/channel but this seemed unnecessary. If I stick with this method I'll have to think about this a little bit more.

@djdv
Copy link
Author

djdv commented Feb 21, 2021

TODO:
pkg unix needs patches:

  • syscall.RtMsghdr and unix.RtMsghdr are not the same size (76, 74) on illumos.
    • hack: 7b930c8
    • Pkg unix reports SizeofRtMsghdr with value 0x4c(76), but is of size 74.
      (I think this is causing the first 2 bytes of the addr to be treated as part of the header, and the last 2 bytes to be discarded -
      Or something equally as bad)
  • libp2p/go-reuseport needs SO_REUSEPORT, and SO_REUSEADDR as well. I've been patching in 0x2004 and 0x0004 during build for years now. Needs to be generated in unix pkg instead. See if the tests work on platform too.

switch from using a single persistent connection to using a socket
per-request. Mainly to prevent inspecting messages that have no
relevance to our process when there are no active requests. And to
prevent wasted memory/processing maintaining our own message queue.

*hasty re-write, but it seems much better
no real benchmarks, but running on 2 systems for 48 hours shows
no errors (excluding "no-route" errors), ~2% cpu reduction, and a
faster ipfs bootstrap.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants