Override uint64_t generated type for bit types #47

Open
cmilhaupt opened this issue Oct 24, 2021 · 2 comments

@cmilhaupt

cmilhaupt commented Oct 24, 2021

Given the following KSY definition:

meta:
    id: bits_example
    endian: be
    bit-endian: be
seq:
- id: first_bit
  type: b2
- id: second_bit
  type: b2
- id: third_bit
  type: b2
- id: fourth_bit
  type: b2

The following C++ header is generated:

# kaitai-struct-compiler -t cpp_stl --cpp-standard 11 bit_example.ksy
# cat bits_example.h | grep uint64
    uint64_t m_first_bit;
    uint64_t m_second_bit;
    uint64_t m_third_bit;
    uint64_t m_fourth_bit;
    uint64_t first_bit() const { return m_first_bit; }
    uint64_t second_bit() const { return m_second_bit; }
    uint64_t third_bit() const { return m_third_bit; }
    uint64_t fourth_bit() const { return m_fourth_bit; }

Is there any way in the KSY file definition to make these generate as a uint8_t to save some space? I tried type: b2.as<u1> after seeing something similar for arrays, but this gives me the following error:

bit_example.ksy: /seq/0: error: parsing expression 'b2.as<[]u1>' failed on 1:3, expected "::" | CharsWhile(Set( , n)) | "\\\n" | End

Any help or clarifications would be appreciated.
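
To put a rough number on the space in question, here is a minimal sketch comparing the two member layouts. The struct names are hypothetical stand-ins, not the generated class (which carries other members as well), and the sizes assume a typical platform where the compiler adds no padding between same-sized members.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the four members as currently generated.
struct fields_as_u64 {
    std::uint64_t first_bit, second_bit, third_bit, fourth_bit;
};

// Hypothetical stand-in for the same members narrowed to uint8_t.
struct fields_as_u8 {
    std::uint8_t first_bit, second_bit, third_bit, fourth_bit;
};

// Size in bytes of each layout.
constexpr std::size_t wide_size() { return sizeof(fields_as_u64); }
constexpr std::size_t narrow_size() { return sizeof(fields_as_u8); }
```

On such a platform the four fields take 32 bytes as uint64_t and 4 bytes as uint8_t, which adds up quickly when many parsed objects are kept in memory.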

@generalmimon
Member

@cmilhaupt:

Is there any way in the KSY file definition to make these generate as a uint8_t to save some space?

You have roughly the following options:

  1. Instead of using bX types, parse a u1 integer and unpack it manually with value instances. You'll have to cast each value with .as<u1>, though, because by default the compiler assigns the type int32_t to any instance with a non-trivial (i.e. other than identity) integer expression:

    seq:
      - id: packed
        type: u1
    instances:
      a:
        value: ((packed & 0b1100_0000) >> 6).as<u1>
      b:
        value: ((packed & 0b0011_0000) >> 4).as<u1>
      c:
        value: ((packed & 0b0000_1100) >> 2).as<u1>
      d:
        value: ((packed & 0b0000_0011) >> 0).as<u1>
  2. Use the bX types, fork the compiler and adapt its behavior for your needs.

    I understand that it may sound intimidating at first, but it is by far the most elegant option, and it should be easy in this case. First, change the target type (CppCompiler.scala:1108): choose the smallest type from uint{8,16,32,64}_t that the value fits into according to the width attribute (see the definition of BitsType). Second, cast the read_bits_int_*() call (which returns uint64_t; see kaitai_struct_cpp_stl_runtime / kaitai/kaitaistream.h:151) to the target type, i.e. change the line CppCompiler.scala:773; here is an example line doing just that for inspiration. Changing these two lines should be enough, I think. Then build the modified compiler from source, which the sbt tool makes really straightforward: you simply download it and run these commands to build it for the JVM or JavaScript environment.

  3. Compile the spec using the bX types normally and then patch the generated code manually, or write a script that loops over regex matches and does the modifications for you. This may work, but it is quite error-prone and chances are you'll end up with invalid C++ code. The previous option is definitely more reliable and almost certainly easier if you ask me.
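
For reference, the masks and shifts in option 1 correspond to the plain C++ below. This is a hand-written sketch of the arithmetic, not compiler output; the struct and function names are hypothetical, and only the field names a to d come from the value instances above.

```cpp
#include <cstdint>

// Hypothetical holder for the four 2-bit fields.
struct unpacked_bits {
    std::uint8_t a, b, c, d;
};

// Split one byte into four 2-bit fields, most significant pair first
// (matching bit-endian: be in the spec).
inline unpacked_bits unpack(std::uint8_t packed) {
    unpacked_bits out;
    out.a = static_cast<std::uint8_t>((packed & 0xC0) >> 6);  // 0b1100_0000
    out.b = static_cast<std::uint8_t>((packed & 0x30) >> 4);  // 0b0011_0000
    out.c = static_cast<std::uint8_t>((packed & 0x0C) >> 2);  // 0b0000_1100
    out.d = static_cast<std::uint8_t>((packed & 0x03) >> 0);  // 0b0000_0011
    return out;
}
```

The explicit casts play the same role as .as<u1> in the KSY: in C++, the operands of & and >> are promoted to int, so the results have to be narrowed back to a byte.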

@cmilhaupt
Author

@generalmimon thanks for the quick reply! Option 2 is certainly most elegant. I've never worked with Scala before, but I'll take a stab at it when I find some time. Thanks for linking to the lines that would need to change and for the example as well.

Follow-up question: would that approach adapt to arrays as well? I.e. if I have

- id: ex_arr
  type: b2
  repeat: expr
  repeat-expr: 4

would a std::vector<uint8_t> be generated automatically, or would I need to adapt CppCompiler.scala elsewhere? Also, could this fix be merged into the mainline? Thanks again.
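
To make the array question concrete, here is a sketch of the result being asked about: the "smallest uint type by width" rule from option 2 written as a C++11 type trait, plus a repeated 2-bit field collected into a vector of that type. Both uint_for_bits and read_ex_arr are hypothetical names for illustration, not part of the Kaitai Struct compiler or runtime, and nothing in this thread confirms that a patched compiler would produce this for repeated fields without further changes.

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical helper: smallest unsigned type that holds Width bits (1..64).
template <unsigned Width>
struct uint_for_bits {
    static_assert(Width >= 1 && Width <= 64, "bit width must be in 1..64");
    using type = typename std::conditional<
        Width <= 8, std::uint8_t,
        typename std::conditional<
            Width <= 16, std::uint16_t,
            typename std::conditional<Width <= 32, std::uint32_t,
                                      std::uint64_t>::type
        >::type
    >::type;
};

// Hypothetical stand-in for a b2 field with repeat-expr: 4, taking the four
// 2-bit values from a single byte, most significant pair first.
inline std::vector<uint_for_bits<2>::type> read_ex_arr(std::uint8_t packed) {
    std::vector<uint_for_bits<2>::type> ex_arr;
    ex_arr.reserve(4);
    for (int shift = 6; shift >= 0; shift -= 2) {
        ex_arr.push_back(static_cast<std::uint8_t>((packed >> shift) & 0x3));
    }
    return ex_arr;
}
```

With this rule a b2 element maps to uint8_t, so the element type of ex_arr here is the std::vector<uint8_t> mentioned in the question.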
