Berry bytecode compression #195
Replies: 10 comments 4 replies
-
If we use variable-length bytecodes, the VM changes would be very large, and performance is also a risk. In addition, register-based VM instructions generally require more operands than stack-based VM instructions, so the benefit of variable-length instructions may be limited. As an example:

```
; addition on a register-based VM
ADD R0 R1 R2

; addition on a stack-based VM
PUSH value1
PUSH value2
ADD          ; pop two operand values, push one result value
```

So I think Huffman compression is a better choice.
-
Thanks. When I have time, I will gather metrics from my bytecode base to get typical value frequencies to feed into a Huffman compressor.
-
You mean we could design an efficient compression algorithm tailored to the bytecode format? If that is feasible, I think bytecode files should also be considered.
-
We could even save RAM through compression. On some ultra-low-end devices a little performance may be sacrificed, but at least it would be possible to support Berry at all.
-
I was thinking of a just-in-time decompressor of bytecode. Before running a function, its bytecode would be decompressed into RAM, and this "normal" copy of the code would be used by the VM. To avoid too high a performance impact, we could keep a cache of recently or frequently used code in memory. This means a trade-off: less Flash but more RAM.
-
I have analyzed the distributions and here are the optimized patterns for RegA, RegB and RegC: Each code has
-
Here is the code I propose for opcodes, the escape sequence is
Running this over 4300 instructions gives a 25% size reduction on the opcode field alone. I will run tests for A/KB/KC later.
-
Here is the final code for KC:
The compression ratio is -41%.
-
Putting together all the numbers above, I finally get
Counted in bytes, that makes 10034 bytes vs 17404 bytes, or -42%. Of course the real figure will be slightly worse because of rounding up to the next byte boundary within each function. Now I need to check the size of the decompressor code, so that the benefit still holds.
-
This has been on my mind for some time...
Berry's bytecode density is quite low, with each instruction encoded in 4 bytes. I would like to explore ways to reduce solidified code size. I tried to define a shorter instruction set using 2 or 3 bytes, but that would be too complex for the VM to handle.
The next option would be to apply Huffman-like compression to the bytecode and decompress it into RAM before running. We could keep a small cache of uncompressed bytecode for frequently run code.
Obviously I'm still missing metrics about potential savings.
Any thoughts?