Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add IPv4/6 conversions from/to Cstructs #36

Closed
wants to merge 5 commits into from
Closed

Add IPv4/6 conversions from/to Cstructs #36

wants to merge 5 commits into from

Conversation

nojb
Copy link
Contributor

@nojb nojb commented Oct 2, 2014

Re #12

@nojb
Copy link
Contributor Author

nojb commented Oct 2, 2014

Should these functions go in a module Ipaddr_cstruct so that they can be included conditionally and not add to the dependencies of the core Ipaddr ?

Cstruct.set_uint8 cs (2 + off) ((|~) (i >! 8));
Cstruct.set_uint8 cs (3 + off) ((|~) (i >! 0))

let to_cstruct i =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would encourage not allocating cstructs in these functions, but passing in a cstruct to write into. This is because they must be allocated via Io_page in Mirage (so that the underlying page is aligned to a 4K block).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then shall I remove the function to_cstruct altogether ? The function to_cstruct_raw works on passed-in cstruct. The only difference is that to_cstruct_raw can raise an exception.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, removing the allocator is fine, or having an alloc optional argument that can override it. So you could have

val set_cstruct : ?off:int -> t -> Cstruct.t -> unit
val to_cstruct : ?(alloc:unit -> Cstruct.t) -> i -> Cstruct.t

The latter would let you pass in Io_page.(to_cstruct (make 1)) as the allocator function, and use Cstruct.create by default if not specified.

@dsheets
Copy link
Member

dsheets commented Oct 2, 2014

Looks pretty good. The V6 functions are scattered around a bit (maybe due to function dependencies?) and there are no cstruct/bigstring implementations for network prefixes. Also, nothing to read ASCII representations out of cstructs.

I'm a bit hesitant to put cstruct support in the main module interface because it will be another dependency of ipaddr that requires camlp4 making removing the camlp4 dependency even harder in the future. Also, ocplib-endian, ounit, and optcomp would become implicit dependencies of ipaddr. This is kind of annoying when it's not clear to me why this library needs to require ounit or optcomp.

I support the inclusion of bigarray/bigstring functions in the primary signature but I think perhaps an Ipaddr_cstruct module which is built optionally if cstruct is available may be preferable. I've not looked at the difficulty or inefficiency of implementing such a module, though. I think including cstruct support in the main interface wouldn't be a problem if cstruct had ounit and camlp4 made into optional dependencies.

Out of curiosity, is your application packet parsing? @avsm, how do your packet parsers use cstruct+ipaddr, today (or do they)?

@dsheets
Copy link
Member

dsheets commented Oct 2, 2014

Oh, also, if you could add cstruct to https://github.com/mirage/ocaml-ipaddr/blob/master/.travis-ci.sh that would placate Travis.

@avsm
Copy link
Member

avsm commented Oct 2, 2014

On 2 Oct 2014, at 18:51, David Sheets [email protected] wrote:

Looks pretty good. The V6 functions are scattered around a bit (maybe due to function dependencies?) and there are no cstruct/bigstring implementations for network prefixes. Also, nothing to read ASCII representations out of cstructs.

I'm a bit hesitant to put cstruct support in the main module interface because it will be another dependency of ipaddr that requires camlp4 making removing the camlp4 dependency even harder in the future. Also, ocplib-endian, ounit, and optcomp would become implicit dependencies of ipaddr. This is kind of annoying when it's not clear to me why this library needs to require ounit or optcomp.

Those are indeed build-time dependencies -- should be helped when we move to opam 1.2's finer grained build deps.

I support the inclusion of bigarray/bigstring functions in the primary signature but I think perhaps an Ipaddr_cstruct module which is built optionally if cstruct is available may be preferable. I've not looked at the difficulty or inefficiency of implementing such a module, though. I think including cstruct support in the main interface wouldn't be a problem if cstruct had ounit and camlp4 made into optional dependencies.

See mirage/ocaml-cstruct#35

Out of curiosity, is your application packet parsing? @avsm, how do your packet parsers use cstruct+ipaddr, today (or do they)

Nicolas is looking into adding ipv6 support to Mirage. Right now it's done manually, but could be handled via ipaddr with this pull for IPv4 as well.

-anil

@nojb
Copy link
Contributor Author

nojb commented Oct 2, 2014

Thanks @dsheets and @avsm for the comments. I have split the cstruct stuff into a different module Ipaddr_cstruct.

  • I needed to call a call a couple of no-ops: Ipaddr.V{4,6}.{of,to}_int32 because Ipaddr.V{4,6}.t is abstract (a good thing).
  • I introduced an allocator argument to V{4,6}.to_cstruct as per @avsm's suggestion. I didn't change the name to_cstruct_raw to set_cstruct to keep it in the same style as Ipaddr.to_bytes_raw, etc.
  • I duplicated the functions too_much, need_more inside Ipaddr_cstruct for error reporting. They could be put into a separate module (but I don't think it is worth the trouble). Maybe it would be better if the functions that deal with binary conversions (bytes and cstructs) just raised Invalid_argument instead of Parse_error ?

One slight downside of putting the cstruct stuff in its own module is that doing

open Ipaddr
open Ipaddr_cstruct

results in Ipaddr_cstruct.V{4,6} shadowing Ipaddr.V{4,6}.

Yes, @dsheets, application is packet parsing. Which is why I did not have much use for converting between ASCII and cstructs.

@dsheets dsheets mentioned this pull request Oct 3, 2014
@dsheets
Copy link
Member

dsheets commented Oct 3, 2014

It looks like @avsm is in the process of removing/optionalizing unnecessary cstruct dependencies so inclusion in the primary module is much more palatable (only ocplib-endian and optcomp are new with both also relying on camlp4 for build). As these dependencies are encapsulated by cstruct and small, I'm comfortable with first-class cstruct support in the primary module. I'm sorry to ask you to do more work, @nojb, could you please put the cstruct support back into the Ipaddr module? The shadowing in particular is very annoying.

I've been thinking about your agree with your suggestion to use Invalid_argument instead of Parse_error for raw buffer length errors. The intent of the checked function was to provide a ready-made way to try to parse a buffer as a binary address. If the user already knows they have an address in a buffer, they should use the _raw version without the length check. In this case, they don't suffer any length checks. As it's generally advised against to catch Invalid_argument because it represents a programming error, this checked function is then no more useful than an assertion where it functions as an assertion and a condition right now. I'm not certain about its utility at all and perhaps it should be removed or changed to parse ASCII strings so that _raw means bit-string representation and _exn means ASCII digits...

Finally, could Cstruct.BE.get_uint32 be used for IPv4 parsing rather than reading individual bytes?

Sorry for the vacillation and thanks very much for your contribution. :-)

@mor1
Copy link
Member

mor1 commented Oct 3, 2014

On 3 October 2014 09:49, David Sheets [email protected] wrote:

I've been thinking about your agree with your suggestion to use
Invalid_argument instead of Parse_error for raw buffer length errors. The
intent of the checked function was to provide a ready-made way to try to
parse a buffer as a binary address. If the user already knows they have an
address in a buffer, they should use the _raw version without the length
check. In this case, they don't suffer any length checks. As it's generally
advised against to catch Invalid_argument because it represents a
programming error, this checked function is then no more useful than an
assertion where it functions as an assertion and a condition right now. I'm
not certain about its utility at all and perhaps it should be removed
or changed to parse ASCII strings so that _raw means bit-string
representation and _exn means ASCII digits...

parsing ascii string representations is certainly a useful thing to do;
though _exn seems a non-obvious (indeed, orthogonal) modifier on the
function name to represent that. perhaps _str or _asc or something
instead?

Richard Mortier
[email protected]

@dsheets
Copy link
Member

dsheets commented Oct 3, 2014

I agree. I think my complete interface proposal for a 3.0 would be as follows:

  1. _raw are direct byte-representation functions into pre-allocated buffers
  2. _new are direct byte-representation functions into callee-allocated buffers
  3. _exn are ASCII string-representation functions that can throw
  4. `` (nothing) are optional ASCII string-representation functions

This is a significant breaking change but I think something like this would fix the string/bytes/bigstring/cstruct representation confusion. _str is also a good suffix but I wonder if we can get away with treating the human-readable representation as 'default' because it also has an _exn case for parse errors.

I'd like to fix this problem once, correctly so we never have to think about this again.

@dsheets
Copy link
Member

dsheets commented Oct 3, 2014

Oh, I forgot a fifth case for exceptional functions on ASCII reps into a specific buffer position. Perhaps this is _asc or _str.

@dsheets
Copy link
Member

dsheets commented Oct 3, 2014

The prefixes for these suffixes will some of: of_string, of_bytes, of_buffer, of_cstruct, of_bigstring. The to_ versions would probably leave off the _exn case and have the (nothing) and_str(_hum?_asc?) cases throw anInvalid_argument if supplied with a buffer that is too small.

@mor1
Copy link
Member

mor1 commented Oct 3, 2014

gah. why doesn't github format as markdown when things come in from mail. now out of order...

ok-- generally agree. (and now seeing your followup.)

just to be sure i'm clear, the prefix for all these is of_cstruct? or
both of_cstruct and to_cstruct?

and then we have the following suffixes:

  • Ipaddr.V[46].t to/from byte-representations with caller or callee
    allocated buffers
  • Ipaddr.V[46].t to/from ascii-representations that return t option or
    throw
  • Ipaddr.V[46].t to/from ascii-representations into a specific caller
    allocated buffer offset, that might throw

?

_raw / _new / nothing (or _opt?) / _exn as suggested seem good for
the first four.

could go with _asc (preferable to _str i think) for the fifth; or
_off?

@mor1
Copy link
Member

mor1 commented Oct 3, 2014

right. responding to @dsheets last comment:

not _hum. no. please no. ugh.

prefer _asc to _str but either ok.

proposal for to_ vs of_ versions sounds sensible. (at least close enough for it to be worth seeing an interface laid out i guess :))

@nojb
Copy link
Contributor Author

nojb commented Oct 3, 2014

I put the cstruct stuff back into Ipaddr. I am waiting to change the exception behaviour until there is a clear decision on the issue of error handling/naming/etc.

@dsheets
Copy link
Member

dsheets commented Oct 3, 2014

I think we should release the cstruct support at 2.6 and design a nice, consistent interface for a 3.0. The present interface is internally consistent if not universally consistent. The task of deciding the error behaviors and which functions operate on bytes vs strings and byte-reps vs ASCII-reps is beyond the scope of a minor release. I've opened #37 to track the 3.0 redesign.

@nojb
Copy link
Contributor Author

nojb commented Oct 3, 2014

Another comment on this: the available conversion functions in Macaddr do not match those in Ipaddr. For example, there is to_bytes, but no to_bytes_raw, etc.

I wanted to add the cstruct functions to Macaddr, but I am not sure which convention I should follow: for cstructs, the *_raw functions are the most commonly used ones, but then we should add {to,of}_bytes_raw as well ?

@mor1
Copy link
Member

mor1 commented Oct 3, 2014

i guess follow the convention just discussed for ipaddr.3.0?

(or do i misunderstand the question?)

On 3 October 2014 17:06, Nicolas Ojeda Bar [email protected] wrote:

Another comment on this: the available conversion functions in Macaddr do
not match those in Ipaddr. For example, there is to_bytes, but no
to_bytes_raw, etc.

I wanted to add the cstruct functions to Macaddr, but I am not sure which
convention I should follow: for cstructs, the *_raw functions are the
most commonly used ones, but then we should add {to,of}_bytes_raw as well
?

Reply to this email directly or view it on GitHub
#36 (comment).

Richard Mortier
[email protected]

@nojb
Copy link
Contributor Author

nojb commented Oct 3, 2014

Sorry for the confusing comment. I just wanted to point out that even in the current state, the interfaces of Ipaddr and Macaddr were not consistent. In any case, I just added a couple of functions to Macaddr to convert from/to Cstructs following the current convention.

@mor1
Copy link
Member

mor1 commented Oct 3, 2014

ah right. i just meant, i guess it would make sense at some point to make
them both follow the convention just discussed :)

On 3 October 2014 21:14, Nicolas Ojeda Bar [email protected] wrote:

Sorry for the confusing comment. I just wanted to point out that even in
the current state, the interfaces of Ipaddr and Macaddr were not
consistent. In any case, I just added a couple of functions to Macaddr to
convert from/to Cstructs following the current convention.

Reply to this email directly or view it on GitHub
#36 (comment).

Richard Mortier
[email protected]

nojb added 2 commits October 6, 2014 16:55
The Macaddr.t is now passed in the first position, just like in
Ipaddr.V{4,6}.to_cstruct_raw.
This way it is consistent with the `of_bytes_exn' function.
@avsm
Copy link
Member

avsm commented Nov 17, 2014

@nojb @dsheets is this ready for merging?

@dsheets
Copy link
Member

dsheets commented Nov 17, 2014

No. We want to clean up the variations first with some more modules/pre-applied functors. Do you need this feature imminently?

@avsm
Copy link
Member

avsm commented Nov 21, 2014

I'm merging mirage/mirage-tcpip#70 which could use this (But has temporary overrides in place to work without it)

@yomimono
Copy link

Any chance this will go in anytime soon? I'm working on something which will need either this or some overrides like those required for @nojb 's ipv6 patch (which are still in place).

@nojb
Copy link
Contributor Author

nojb commented Jan 27, 2015

I seem to remember that a little refactoring was necessary to make all the possible conversions fit more seamlesssly, with less code duplication and with an uniform interface (maybe in the style of ocplib-endian).

Until this is done you may be better off including the necessary functions in your code I guess :(

@samoht
Copy link
Member

samoht commented May 16, 2016

@dsheets any (more) opinion on that PR?

@dsheets
Copy link
Member

dsheets commented May 17, 2016

This needs refactoring/renaming work that will likely break backward compat of the interface. Analysis/suggestions/PRs very welcome for that.

@avsm
Copy link
Member

avsm commented May 11, 2017

It only took 2.5 years, but mirage/ocaml-cstruct#133 has significantly reduced the dependency cone of the cstruct library. So depending on it directly from ipaddr is now feasible

avsm added a commit to avsm/ocaml-ipaddr that referenced this pull request Jul 11, 2019
for conversion to/from cstructs

based on mirage#36

Co-authored-by: Nicolás Ojeda Bär <[email protected]>
@avsm
Copy link
Member

avsm commented Jul 11, 2019

It took half a decade, but fixed in #90 based on this PR

@avsm avsm closed this Jul 11, 2019
avsm added a commit to avsm/opam-repository that referenced this pull request Jul 12, 2019
…exp and macaddr-cstruct (4.0.0)

CHANGES:

* Rename the `to/from_bytes` functions to refer to `octets`
  instead.  This distinguishes the meaning of human-readable
  addresses (`string`s in this library) and byte-packed
  representations(`octet`s in this library) from the OCaml
  `bytes` type that represents mutable strings.

  Porting code should just be a matter of renaming functions
  such as:
   - `Ipaddr.of_bytes` becomes `Ipaddr.of_octets`
   - `Macaddr.to_bytes` becomes `Macaddr.to_octets`

* Introduce new `write_octets` functions that can write
  octet representations of IPv4/v6 into an existing bytestring.

* Use the `domain-name` library to produce domain names
  from IP addresses.

* Remove the `ipaddr.sexp` and `macaddr.sexp` ocamlfind
  subpackages and instead have `ipaddr-sexp` and `macaddr-sexp`
  to match the opam package names.

* Add new `Ipaddr_cstruct` and `Macaddr_cstruct` libraries
  for conversion to/from cstructs (mirage/ocaml-ipaddr#36 @nojb @avsm)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants