Endianness #4
Comments
I would personally vote for always writing the data in little-endian (it is by far the most common, I believe); it would be one less degree of freedom to worry about. I could add that information next to the memory layout in the general section.
Big-endian systems are indeed quite rare these days. But it's still requisite explicit information for knowledge of binary data storage. E.g. FreeSurfer's
But that could be a software implementation detail. The file format could be little-endian 100% of the time, and it would be the task of the library to write it as such (no matter your native endianness), rather than adding a layer of complexity by allowing both and leaving it to the library to read both endiannesses. I am not an expert, but what is the downside of forcing little-endian? Does it limit compatibility? Is byte-swapping rare or hard in some languages? I think this is an implementation problem: a library implementation with a writer/reader could handle this "specification". I can state clearly in the specification that it is always little-endian, no exceptions. That way it will be the role of the implementation to handle it. (I really don't like adding an extra extension/degree of freedom for something that could be handled internally by the library.) Right now, it's only the two of us. I have never encountered a problem with that, so my intuition is likely limited. Maybe input from other people, or a real-life limitation you have experienced, could help me see why both could be allowed. Obviously, whichever is chosen, it will be written clearly at the top of the specification.
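To make the "always little-endian, handled by the library" option concrete, here is a minimal sketch in Python using only the standard `struct` module. The function names `pack_float32_le` and `unpack_float32_le` are hypothetical and not part of any existing library for this format; the point is only that an explicit `<` format prefix fixes the on-disk byte order without any check of the host's endianness.

```python
import struct
import sys


def pack_float32_le(values):
    """Serialize floats as little-endian float32, whatever the host byte order.

    Hypothetical helper: the "<" prefix forces little-endian with standard
    sizes, so the same bytes are produced on little- and big-endian hosts.
    """
    return struct.pack(f"<{len(values)}f", *values)


def unpack_float32_le(buffer):
    """Read little-endian float32 values back into a tuple of Python floats."""
    count = len(buffer) // 4  # 4 bytes per float32
    return struct.unpack(f"<{count}f", buffer[: count * 4])


payload = pack_float32_le([1.0, 2.5, -3.0])
print(sys.byteorder, payload.hex())
```

On a big-endian host the same call would swap bytes internally; the caller never sees the difference, which is the "implementation detail" argument made above.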
Yep: I'm not actually arguing to the contrary. I'm only arguing that:
@jdtournier has PTSD from having to deal with this issue back in the early 2000s, due to having regular access to systems of both endiannesses. It's less of an issue now, but for a specification it needs to be 100% clear what the expectations are. For us, we'd have no issue with endianness being specified on a per-file basis, because MRtrix3 already has the requisite back-end capabilities: we already support images stored with either endianness, with conversion functors inserted at compile-time, and indeed already support such for streamlines data. So we might advocate slightly in that direction, since it's the more general case. But I can see that others would have a preference for forcing little-endian, since it would make their lives easier to just ignore the issue and have their software not be capable of running on a big-endian system. I'd also imagine that for any language interpreted at run-time, having an endianness code branch on read/write of every vertex has the potential to severely slow execution.
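On the per-vertex branching concern: one way an interpreted language could avoid it is to decide once per buffer and swap in bulk. The following is only a sketch under that assumption, using Python's standard `array` module; `read_uint32` is a hypothetical name, and the code assumes the platform's `"I"` type code is 4 bytes wide (it raises otherwise).

```python
import array
import sys


def read_uint32(buffer, data_byteorder):
    """Decode a bytes buffer of uint32 values stored as 'little' or 'big'.

    Hypothetical helper: the endianness test runs once per buffer, and the
    swap (if any) is a single bulk call rather than a branch per element.
    """
    values = array.array("I")
    if values.itemsize != 4:
        # Assumption: "I" is 32 bits, true on common platforms but not guaranteed.
        raise RuntimeError("array type code 'I' is not 4 bytes on this platform")
    values.frombytes(buffer)  # interprets the bytes in host order
    if data_byteorder != sys.byteorder:
        values.byteswap()  # in-place swap of every element
    return values


print(read_uint32(bytes([1, 0, 0, 0, 2, 0, 0, 0]), "little").tolist())
```

Whether this is fast enough in practice would depend on the language and the size of the data; it only shows that the check does not have to sit inside the inner loop.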
+1 for managing both little- and big-endian. Indeed, it is rare, but, to complete @Lestropie's argument, it is required for some unusual Linux distributions, Solaris systems, and really old macOS (and you never know when they will decide to change...). I think it is important.
@skoudoro like we said, support would be an implementation detail rather than part of the specification. If the file format is LE but the future hypothetical library handles both systems to respect the LE standard, is that fine? Or do you mean we should allow both endiannesses in the file format?
The principal data are stored in pure binary, but there is no indication in either the specification or the file naming convention as to the endianness of the data. Either:
The specification needs to state unambiguously that all data are stored as little-endian, and hence software would need to self-detect if it is running on a big-endian system and always reorder bytes;
The file names need to additionally specify the endianness for data types > 1 byte (e.g. in MRtrix3 we would say something like "UInt32LE"), and software would need to detect a mismatch between the endianness of the data and the endianness of the system and decide whether or not byte-swapping is necessary.
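As an illustration of the second option, the mismatch check could look roughly like the sketch below. Only the "UInt32LE" example comes from the text above; the tag pattern (letters, bit width, optional LE/BE suffix) and the function name `needs_byteswap` are assumptions made for this example, not an agreed naming convention.

```python
import re
import sys

# Assumed tag shape: type name, bit width, optional endianness suffix.
_TAG = re.compile(r"^([A-Za-z]+)(\d+)(LE|BE)?$")


def needs_byteswap(dtype_tag, host_byteorder=sys.byteorder):
    """Return True if data with this tag must be byte-swapped on the host.

    Hypothetical helper: single-byte types never need swapping; multi-byte
    types without an explicit suffix are rejected as ambiguous.
    """
    match = _TAG.match(dtype_tag)
    if match is None:
        raise ValueError(f"unrecognised data type tag: {dtype_tag!r}")
    bits, suffix = int(match.group(2)), match.group(3)
    if bits <= 8:
        return False  # one byte per element: byte order is irrelevant
    if suffix is None:
        raise ValueError(f"multi-byte type without endianness: {dtype_tag!r}")
    data_byteorder = "little" if suffix == "LE" else "big"
    return data_byteorder != host_byteorder


print(needs_byteswap("UInt32LE"))
```

Under the first option (always little-endian), the same check reduces to comparing `sys.byteorder` against `"little"` once at start-up.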