Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

There's not actually any practical reason for the length field of a StaticVec to be of type usize #28

Open
slightlyoutofphase opened this issue Dec 3, 2019 · 7 comments

Comments

@slightlyoutofphase
Copy link
Owner

slightlyoutofphase commented Dec 3, 2019

It's just kinda unavoidable due to nothing else being accepted as an array index in Rust (despite the fact that I'm quite sure it's not even possible to instantiate a static array with a length anywhere close to where you'd need usize indices, because you'd overflow the stack way before then.)

u32 would make significantly more sense IMO (at least on x64) and probably improve performance in places due to the layout optimizations I believe it would allow for. It's unclear what an ergonomic way to do it is, though.

Any suggestions / opinions would be very welcome!

@slightlyoutofphase
Copy link
Owner Author

Turns out this probably doesn't matter optimization-wise as much as I thought it did, if at all.

@Evrey
Copy link

Evrey commented Mar 15, 2020

I propose reöpening this issue, as dependent typing can be used to pick the most fitting type. This would e.g. allow for 255 bytes strings to take up exactly 256 bytes, no padding. There are cases where such compact layouts are desirable. In addition to that, in most ABIs and calling conventions, structs that are up to 2 registers in size get passed via registers. So cutting down the size of the length field benefits tiny containers a lot.

#![feature(const_generics)]

trait T { type I; }
struct X<const N: usize>(std::marker::PhantomData<[u8; N]>);

impl T for X<1> { type I = u8;  }
impl T for X<2> { type I = u16; }

fn main() {
    println!("{}B, {}B",
        std::mem::size_of::<<X::<1> as T>::I>(),
        std::mem::size_of::<<X::<2> as T>::I>(),
    );
}

@slightlyoutofphase
Copy link
Owner Author

I'll probably try to revisit making this work ergonomically somehow once const generics functionality gets a little bit more powerful than it currently is.

@slightlyoutofphase
Copy link
Owner Author

slightlyoutofphase commented Mar 11, 2021

Figured I might as well go ahead and re-open this as previously suggested just for the sake of making sure I don't forget about it, as it is for sure something I want to address whenever a reasonable way of doing so becomes apparent.

It's definitely one use case / situaton where Rust's lack of something resembling C++ decltype functionality (and lack of support for functions that can generically return types in the compile-time context of actually specifying what type a concrete struct field should be, based on constant input) is a bit of a "papercut" IMO.

For example, if staticvec was a C++ library, I certainly would have written the basic skeleton of it very roughly as follows from day one:

#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

template <typename T>
constexpr auto max = std::numeric_limits<T>::max;

template <const size_t N>
consteval auto get_type() noexcept {
  if constexpr (N >= 0 && N <= max<uint8_t>()) {
    return static_cast<uint8_t>(N);
  } else if constexpr (N > max<uint8_t>() && N <= max<uint16_t>()) {
    return static_cast<uint16_t>(N);
  } else if constexpr (N > max<uint16_t>() && N <= max<uint32_t>()) {
    return static_cast<uint32_t>(N);
  } else if constexpr (N > max<uint32_t>() && N <= max<uint64_t>()) {
    return static_cast<uint64_t>(N);
  } else {
    // not really possible to get here, but it seemed like the best way to write the `else`.
    return static_cast<uint64_t>(N);
  }
}

template <typename T, const size_t N>
struct StaticVec {
private:
  using IndexT = decltype(get_type<N>());
  using ArrayT = std::array<T, N>;
  // this would be for `filled_with`
  using Initializer = auto (*)() -> T;
  // this would be for `filled_with_by_index`
  using IndexedInitializer = auto (*)(const IndexT) -> T;

  static constexpr IndexT CAPACITY = get_type<N>();

  ArrayT data{};
  IndexT length = 0;
};

@slightlyoutofphase
Copy link
Owner Author

@Evrey

It occurred to me that perhaps I might not have fully grasped what you were getting at with your original "dependent typing" snippet. Is there a form of what you gave an example of that doesn't rely on E.G. macro impls for a hardcoded range of numbers, but rather "just works" without breaking anything the crate currently does? I'd be totally fine with something that required me to do casts to usize from whatever type internally, but anything that makes the user-facing aspect of things any different from how it is now is a "deal breaker", I'd say.

@Evrey
Copy link

Evrey commented Mar 28, 2021

@slightlyoutofphase

It's easy, really. I ask for no user-visible API change, except for the return types of capacity and length functions. That code snippet of mine just demonstrated that it is possible right now to get a different integer type depending on some integer constant. So in summary for users the only change is fn len(&self) -> usize to fn len(&self) -> Self::Len or whatever the names may be. And that type is no wrapper, but just an alias for u8 to u128 or whatever.

However, fiddling around a bit more with it, it turns out that the current const generics are still not powerful enough to make that Self::Len or X<N>::I or whatever easy to implement. In theory you'd just do a match or some next_power_of_two magic or so to turn the capacity of a StaticVec into a byte size of the smallest possible integer type needed to store the length.

Sooo… this whole thing may have to wait for full const generics. Unless we want a macro expansion depending on, say, 4.3 billion numbers.

@slightlyoutofphase
Copy link
Owner Author

slightlyoutofphase commented Mar 28, 2021

However, fiddling around a bit more with it, it turns out that the current const generics are still not powerful enough to make that Self::Len or X<N>::I or whatever easy to implement. In theory you'd just do a match or some next_power_of_two magic or so to turn the capacity of a StaticVec into a byte size of the smallest possible integer type needed to store the length.

Sooo… this whole thing may have to wait for full const generics. Unless we want a macro expansion depending on, say, 4.3 billion numbers.

Yeah, this is basically what I also thought was (unfortunately) currently the case. I just wanted to make sure I hadn't missed anything about what you were actually trying to say. Thanks for the input, again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants