-
Notifications
You must be signed in to change notification settings - Fork 8
Instruction Semantics: Related Work
Sandeep Dasgupta edited this page Jul 8, 2018
·
9 revisions
Tool | Instruction Support Status |
---|---|
Total Att instructions | 3868 |
Total Att/Intel Opcodes | 1281/982 |
Comparision A Vs B | A | B | A-B | B-A | A & B |
---|---|---|---|---|---|
A: Strata B: McSema | 393 | 498 | 173 | 278 | 220 |
A: Strata Vector Imms B: McSema | 76 | 498 | 43 | 465 | 33 |
A: ACL2 B: Strata | 191 | 393 | 67 | 269 | 124 |
A: ACL2 B: McSema | 191 | 498 | 29 | 336 | 162 |
A: Strata/Stoke Unsupported B: McSema | 107 | 498 | 61 | 452 | 46 |
-
- They defined two languages, MinX and MinC, roughly corresponding to simplified subsets of x86 and C. He then defined a semantics-preserving compilation relation between MinC and MinX, and uses this as a way to determine correctness of a certain machine-code type inference algorithm.
-
- This paper is more about correct decompilation from binary to C. The idea is to do byte equivalence of binary and the recompiled binary from candidate C codes. The cndidates are chozen from a big corpus and applied with guided slource to source mutation before recompilation.
-
Angr (using VEX as the IR for analysis)
-
- https://x86proved.codeplex.com/
- Extend the model to cover more features of x86 e.g. instructions, floating point, 64-bit, SIMD, paging, multi-core
- Improve the automation of proofs
- Languages
-
https://alastairreid.github.io/specification_languages/: Last paragraph
-
Testing Intermediate Representations for Binary Analysis
- Differentially test Valgrind, BAP and BINSEC.
- Convert the semantics in individual IR to UIR and then to Z3 formulas to run Z3 on them.
- Does not condifer floating pount instructions.
- Test one instruction at a time.
- SKip testing an instruction whenever the convertion to UIR is difficult.
-
ANGR using VEX
- IR details: https://docs.angr.io/docs/ir.html
- The speed of angr as an analysis tool or emulator is greatly handicapped by the fact that it is written in python (https://docs.angr.io/docs/speed.html)
- Angr (using VEX) has an excellent instruction Support. We had the vex representation of all the register variants from angr project at here
- Useful slides
- python implementation of libraries: https://github.com/angr/angr/tree/master/angr/procedures/libc
-
BAP
-
BINSEC
-
QEMU
-
MCSEMA
-
- The IR is called ESIL.
- Not easy to read. (https://monosource.gitbooks.io/radare2-explorations/content/tut3/tut3_-_esil.html)
- Can interpret ESIL (https://radare.gitbooks.io/radare2book/content/disassembling/esil.html)
- Lacks FP support.
-
Completeness Check: Out of 1083 register instructions, 664 are not supported!
cd binary-decompilation/x86-semantics/docs/instruction_manuals/instruction_instances/Registers ~/x86-semantics/scripts/process_spec.pl --radare2_support --file ~/x86-semantics/docs/relatedwork/strata/Registers/register_instructions.txt |& tee ../../../relatedwork/radare2/register_instr_support.txt
-
- A specification language for instructions decoders and semantic translators a-l & m-z.
-
Support stat
- 440 instructions
-
- They developed a language (Transformer specification Language) for describing the semantics of an instruction set, along with a run-time system to support the static analysis of executable written in that instruction set.
- A subset of I32.
-
Btscope
- http ://bitblaze.cs.berkeley.edu/papers/bitscope_tr_2007.pdf
The translation from an x86 instruction to our IR is designed to model the semantics of the original x86 instruction, including all implicit side effects, register addressing modes, and other issues. We perform all symbolic execution on IR statements