Handle unparsable multiaddrs #70
Comments
A few things I'd like to explore:
|
@jacobheun it might be useful to have your opinion on this? |
The requirement that a multiaddr can be represented as a buffer is tricky. Every user of multiaddr has a stale copy of the buffer-to-protocol mapping. It's trivial to syntactically validate a multiaddr string, but hard to semantically validate that its protocols are valid if the definition of valid is their existence in the mapping table. We will keep running into situations where peers, apps, and API clients are running different versions of multiaddr, with different mapping tables, and older ones are unable to parse newer ones. Having a string-only format for multiaddr that doesn't guarantee it can be converted to a buffer would at least tick points 1 to 4 of the problem statement https://github.com/multiformats/multiaddr#introduction |
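To make the syntactic/semantic distinction above concrete, here is a minimal Python sketch. It is not taken from any multiaddr implementation: the function names and the tiny `KNOWN_PROTOCOLS` table are assumptions made for illustration, and argument values (for example, whether `1.2.3.4` is a well-formed IPv4 address) are deliberately not checked.

```python
# Illustrative, deliberately incomplete table: protocol name -> takes an argument?
# A real mapping table is much larger; this one exists only for the example.
KNOWN_PROTOCOLS = {"ip4": True, "tcp": True, "udp": True, "quic": False, "utp": False}


def is_syntactically_valid(addr: str) -> bool:
    """Shape check only: a leading '/' and no empty components."""
    if not addr.startswith("/") or addr == "/":
        return False
    return all(part != "" for part in addr[1:].split("/"))


def is_semantically_valid(addr: str, table=KNOWN_PROTOCOLS) -> bool:
    """Walk the components and require every protocol name to be in `table`."""
    if not is_syntactically_valid(addr):
        return False
    parts = addr[1:].split("/")
    i = 0
    while i < len(parts):
        name = parts[i]
        if name not in table:
            return False  # unknown to *this* copy of the table
        i += 1
        if table[name]:
            if i >= len(parts):
                return False  # protocol needs an argument but none is left
            i += 1  # skip the argument; its value is not validated here
    return True
```

With this sketch, `/ip4/1.2.3.4/tcp/123/special_protocol` passes the syntactic check but fails the semantic one, and starts passing as soon as the caller's table learns about `special_protocol`. The same string is valid or invalid depending only on which table version the reader happens to ship.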
I think we're talking past each other a bit here. My primary concern is relay dialing (which should have been in the issue description...). That is, I need to be able to use a multiaddr prefix (encoded) without the rest. For example: QmRelay and QmTarget may understand special_protocol while I may not. Basically, I'm worried about creating unnecessary network partitions because we're being too strict in parsing our inputs. We still need to solve the UX issues.
We usually don't and can't know the original string (e.g., binary peer address records from the DHT).
This proposal tries to cover the use-case where I don't really care about the rest of the multiaddr. Ideally, I'd be able to validate it, but even if I could, I'm just going to pass it off to someone else anyways.
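As a rough illustration of "use the prefix, pass the rest along", the Python sketch below splits a binary multiaddr into the longest prefix the local table understands and an opaque remainder. The helper names are hypothetical, the table is a tiny subset, and only fixed-size protocols are handled (length-prefixed ones are left out to keep the example short).

```python
# Hypothetical subset of the mapping table: code -> (name, argument size in bytes).
TABLE = {4: ("ip4", 4), 6: ("tcp", 2), 290: ("p2p-circuit", 0)}


def read_varint(buf: bytes, pos: int):
    """Decode an unsigned varint starting at `pos`; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def split_known_prefix(buf: bytes, table=TABLE):
    """Return (prefix, rest): `prefix` is fully understood, `rest` is opaque."""
    pos = 0
    while pos < len(buf):
        code, next_pos = read_varint(buf, pos)
        if code not in table:
            break  # first unknown protocol: stop, keep the remaining bytes as-is
        size = table[code][1]
        if next_pos + size > len(buf):
            raise ValueError("truncated argument")
        pos = next_pos + size
    return buf[:pos], buf[pos:]
```

A dialer built along these lines could connect using `prefix` and forward `rest` untouched to a peer that does understand it. The cost is that nothing in `rest` has been validated locally.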
By string-only, I meant that Now, there is the case where a user gives me
That is, if we ran across However, really, "upgrade your client" may also be a valid solution here. |
Thanks for your thoughts!
Do we need a That aside, I'm interested to see if/how this would work in code - I'll see if I can work up a PR to js-multiaddr and report back :D |
So, there wouldn't really be an unknown "codec". Basically, we're trying to handle two inverse cases:
Basically, we need a StringInBinary protocol (
The second half (binary in strings) has a PR here: multiformats/go-multiaddr#74. However, I'm still worried about multiple binary representations of multiaddrs. That's why I'm less sure we want StringInBinary. |
That way, we can always tell if something is a path or something else. We may also be able to take advantage of this later to combine a few concepts and get rid of the "multiaddrs look like paths but are totally not" problem. However, we can think about that later. This PR just reserves the code so we don't run into problems later.
* Remove the distinction between string/binary multiaddrs. Instead, the "string" will *also* be a valid binary multiaddr.
* Define a new multipath spec to combine multiaddrs and other paths.
Related to: multiformats/multiaddr#70
Could the problem of unknown/new address protocols be solved by the following steps?
That'd seem like a simple way forward to me. |
@lgierth: How do we know whether a protocol even requires an argument at all? /utp, /quic and about 10 others do not, for instance. |
Yeah, actually, point 2 above should include most or all existing protocols, not just the length-prefixed ones.
Backward compatibility. It'd be a huuuuge hassle to fundamentally change existing multiaddr protocols, while it's easy to freeze them as-is and introduce new rules for new protocols. We must avoid breaking changes at all costs. |
Note: We can also transition. That is, we can, say, introduce a new "multiaddr-2" protocol and parse everything after that using the new system. Eventually, we can remove support for multiaddr-1 (the network will forget old multiaddrs pretty quickly). This would give us a chance to revisit this from scratch. |
Has there been any more progress made on this? This would make multiaddr much more useful for projects that need to add complexity to what is already defined by multiaddr without requiring per-application changes to the table. |
Unfortunately, no. If you're interested, I'd start by forking go-multiaddr and adding an implementation there. I say "fork" because adding support for this will require a non-trivial refactor of everything depending on multiaddrs, so I expect we'll end up with two "flavors" of multiaddrs: partially validated ones and fully validated ones. |
Due to the fact that binary multiaddrs don't include protocol definitions, we can't parse them unless we know all the relevant protocols.
However, we can parse a prefix. Therefore, I'd like to propose a special, string-only "unknown" protocol that takes a single multibase-encoded argument. That is: `/ip4/1.2.3.4/tcp/123/unknown/bxyz`. This would only exist in the string format and would allow us to keep and use multiaddrs we don't fully understand.
This should fix some of the problems described in #6.
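A minimal Python sketch of how the proposed string-only `unknown` component might round-trip is shown below. Everything here is an assumption for illustration: the function names, the two-entry table, and the choice of multibase `b` (lowercase base32 without padding) as the encoding. It is not how go-multiaddr or js-multiaddr behave.

```python
import base64

# Hypothetical subset of the mapping table: code -> (name, argument size in bytes).
TABLE = {4: ("ip4", 4), 6: ("tcp", 2)}
CODE_BY_NAME = {name: code for code, (name, _) in TABLE.items()}


def read_varint(buf, pos):
    """Decode an unsigned varint starting at `pos`; return (value, next_pos)."""
    value = shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def multibase_b32(data):
    """Multibase 'b' prefix + lowercase, unpadded base32."""
    return "b" + base64.b32encode(data).decode("ascii").lower().rstrip("=")


def multibase_b32_decode(text):
    if not text.startswith("b"):
        raise ValueError("only multibase 'b' is handled in this sketch")
    body = text[1:].upper()
    return base64.b32decode(body + "=" * (-len(body) % 8))


def to_string(buf):
    """Binary -> string; bytes after the first unknown code become /unknown/<multibase>."""
    parts, pos = [], 0
    while pos < len(buf):
        code, arg_pos = read_varint(buf, pos)
        if code not in TABLE:
            parts += ["unknown", multibase_b32(buf[pos:])]
            break
        name, size = TABLE[code]
        raw = buf[arg_pos:arg_pos + size]
        if len(raw) != size:
            raise ValueError("truncated argument")
        text = ".".join(map(str, raw)) if name == "ip4" else str(int.from_bytes(raw, "big"))
        parts += [name, text]
        pos = arg_pos + size
    return "/" + "/".join(parts)


def from_string(text):
    """String -> binary; /unknown/<multibase> is spliced back in as raw bytes."""
    parts = text.lstrip("/").split("/")
    if len(parts) % 2:
        raise ValueError("every protocol in this sketch takes exactly one argument")
    out = bytearray()
    for name, arg in zip(parts[0::2], parts[1::2]):
        if name == "unknown":
            out += multibase_b32_decode(arg)  # no 'unknown' code exists in binary
            break
        out.append(CODE_BY_NAME[name])  # both codes here fit in a single byte
        if name == "ip4":
            out += bytes(int(p) for p in arg.split("."))
        else:
            out += int(arg).to_bytes(2, "big")
    return bytes(out)
```

In this sketch `from_string(to_string(buf)) == buf` holds even when the tail of `buf` uses a protocol code the local table has never seen, which is the property the proposal is after: the address can be stored, displayed, and handed on without being fully understood.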