Machine-readable logs #75

Open
evverx opened this issue May 7, 2022 · 7 comments

Comments

@evverx
Member

evverx commented May 7, 2022

As far as I understand, logs are currently supposed to look like https://github.com/matusmarhefka/dfuzzer/pull/4 so that reprogen.py works, but it would probably make sense to revisit the format to make logs easier to parse in general. Those logs could help to look for example for timeouts that are ignored by dfuzzer by default.

@mrc0mmand
Member

One of the major flaws of the current (CSV) format is that the separator (;) can appear in the randomly generated strings, making machine-parsing of the log file harder or sometimes almost impossible.
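
To illustrate the problem, here is a minimal Python sketch. The three-field record layout (interface;method;argument) and the helper names are made up for the example and are not dfuzzer's actual log format; the only point is that a naive split on ; returns the wrong number of fields as soon as the fuzzed string contains the separator.

```python
# Hypothetical record layout, for illustration only: interface;method;argument
SEPARATOR = ";"


def format_record(interface, method, argument):
    """Join the fields the way a naive CSV-style logger would (no quoting)."""
    return SEPARATOR.join([interface, method, argument])


def parse_record(line):
    """Split a log line back into fields, as a log consumer would."""
    return line.split(SEPARATOR)


benign = format_record("org.example.Iface", "Method", "hello")
fuzzed = format_record("org.example.Iface", "Method", "he;llo;")

print(len(parse_record(benign)))  # 3 fields, as written
print(len(parse_record(fuzzed)))  # 5 fields: the argument was split apart
```

Once the field count is off, the consumer has no reliable way to tell which separators belonged to the data.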

evverx referenced this issue in evverx/dfuzzer May 7, 2022
to make scripts like that work https://github.com/matusmarhefka/dfuzzer/issues/75
should be addressed first. Until then it doesn't seem to
make much sense to keep the script in the repository.

dfuzzer can always be pinned to v1.5 to bring
the script back.
@evverx
Member Author

evverx commented May 7, 2022

> Those logs could help to look for example for timeouts

Looks like timeouts have never been logged by dfuzzer :-(

evverx referenced this issue May 7, 2022
to make scripts like that work https://github.com/matusmarhefka/dfuzzer/issues/75
should be addressed first. Until then it doesn't seem to
make much sense to keep the script in the repository.

dfuzzer can always be pinned to v1.5 to bring
the script back.
@mrc0mmand
Member

mrc0mmand commented May 8, 2022

As for the random strings, I guess one possible fix would be to pass them through g_strescape() (https://docs.gtk.org/glib/func.strescape.html) before printing them out. This might also help with #80, since strings could then be wrapped in double quotes (") and identified that way. As the documentation suggests, the operation could easily be reversed with g_strcompress() (https://docs.gtk.org/glib/func.strcompress.html), and the escape sequences should be compatible with bash as well:

Escapes the special characters '\b', '\f', '\n', '\r', '\t', '\v', '\' and '"' in the string source by inserting a '\' before them. Additionally all characters in the range 0x01-0x1F (everything below SPACE) and in the range 0x7F-0xFF (all non-ASCII chars) are replaced with a '\' followed by their octal representation. Characters supplied in exceptions are not escaped.
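
For anyone consuming such logs from a script, the rules quoted above can be approximated in a few lines of Python. This is only a rough sketch of the documented behaviour, not GLib's implementation: the function names escape and unescape are made up, the exceptions parameter is not modelled, and a stray trailing backslash is simply kept.

```python
# Rough approximation of the escaping rules quoted above (NOT GLib's code).
_SHORT_ESCAPES = {
    0x08: b"\\b", 0x0C: b"\\f", 0x0A: b"\\n", 0x0D: b"\\r",
    0x09: b"\\t", 0x0B: b"\\v", 0x5C: b"\\\\", 0x22: b'\\"',
}
# Reverse map: byte following the backslash -> original byte
_SHORT_UNESCAPES = {esc[1]: raw for raw, esc in _SHORT_ESCAPES.items()}


def escape(raw):
    """Escape special, control and non-ASCII bytes in `raw` (a bytes object)."""
    out = bytearray()
    for byte in raw:
        if byte in _SHORT_ESCAPES:
            out += _SHORT_ESCAPES[byte]
        elif byte < 0x20 or byte >= 0x7F:
            out += b"\\%03o" % byte  # backslash + three octal digits
        else:
            out.append(byte)
    return bytes(out)


def unescape(escaped):
    """Reverse escape(); unknown escapes yield the escaped byte itself."""
    out = bytearray()
    i = 0
    while i < len(escaped):
        byte = escaped[i]
        i += 1
        if byte != 0x5C or i == len(escaped):
            out.append(byte)  # ordinary byte, or a stray trailing backslash
            continue
        nxt = escaped[i]
        if 0x30 <= nxt <= 0x37:
            # Consume up to three octal digits
            j, value = i, 0
            while j < len(escaped) and j < i + 3 and 0x30 <= escaped[j] <= 0x37:
                value = value * 8 + (escaped[j] - 0x30)
                j += 1
            out.append(value & 0xFF)
            i = j
        else:
            out.append(_SHORT_UNESCAPES.get(nxt, nxt))
            i += 1
    return bytes(out)


sample = b'a;b"c\\d\n\x01\xff'
logged = b'"' + escape(sample) + b'"'  # wrap in quotes, as suggested above
print(logged.decode("ascii"))
print(unescape(logged[1:-1]) == sample)
```

Note that the separator itself is not escaped. What makes the field recoverable is that any double quote inside the string is escaped, so a parser can find the closing quote unambiguously.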

@evverx
Member Author

evverx commented May 8, 2022

I'd pick JSON (or any other format where escaping is no longer an issue) because, for example, busctl dumps stuff like

{
        "type" : "method_call",
        "endian" : "l",
        "flags" : 0,
        "version" : 1,
        "cookie" : 2,
        "timestamp-realtime" : 1652039190518701,
        "sender" : ":1.147",
        "destination" : "org.freedesktop.resolve1",
        "path" : "/org/freedesktop/resolve1",
        "interface" : "org.freedesktop.resolve1.Manager",
        "member" : "ResolveHostname",
        "payload" : {
                "type" : "isit",
                "data" : [
                        0,
                        "google.com",
                        0,
                        0
                ]
        }
}

and it can be put into "advanced" dictionaries: https://github.com/matusmarhefka/dfuzzer/issues/81. The idea is to monitor the system bus, pick "valid" messages and stuff them into those dictionaries (semi-automatically, hopefully).
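
A minimal sketch of what the "pick valid messages" step could look like, using a trimmed copy of the busctl message above. The dictionary entry layout (bus_name, object_path, and so on) and the function name to_dictionary_entry are assumptions for the example; the actual dictionary format is still to be decided in the linked issue.

```python
import json

# Trimmed copy of the busctl message shown above.
CAPTURED = """
{
    "type": "method_call",
    "destination": "org.freedesktop.resolve1",
    "path": "/org/freedesktop/resolve1",
    "interface": "org.freedesktop.resolve1.Manager",
    "member": "ResolveHostname",
    "payload": {"type": "isit", "data": [0, "google.com", 0, 0]}
}
"""


def to_dictionary_entry(message):
    """Return a seed entry for a method call, or None for any other message."""
    if message.get("type") != "method_call":
        return None
    payload = message.get("payload", {})
    # Hypothetical entry layout, not an existing dfuzzer format
    return {
        "bus_name": message["destination"],
        "object_path": message["path"],
        "interface": message["interface"],
        "method": message["member"],
        "signature": payload.get("type", ""),
        "arguments": payload.get("data", []),
    }


entry = to_dictionary_entry(json.loads(CAPTURED))
print(json.dumps(entry, indent=2))
```

Keeping the signature next to the arguments matters because the JSON values alone do not say which D-Bus integer type each number had.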

@mrc0mmand
Member

mrc0mmand commented May 8, 2022

That definitely sounds better, and it should be relatively easy to do with json-glib (https://gnome.pages.gitlab.gnome.org/json-glib/), maybe even with its GVariant integration (https://gnome.pages.gitlab.gnome.org/json-glib/json-gvariant.html).

@mrc0mmand
Member

I gave json_gvariant_serialize_data() a quick spin and it seems to work like a charm:

   -- Signature: (isaaai(y(b(n(q(iua{ov})v)o))x(dh))a{t(bov)})
   -- Value: (-2147483648, 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA', [[@ai []]], (byte 0x00, (false, (int16 -32768, (uint16 0, (-2147483648, uint32 0, {objectpath '/': <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>, '/': <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>, '/': <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>}), <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>), objectpath '/')), int64 -9223372036854775808, (1.7976931348623157e+308, handle 0)), {uint64 0: (false, objectpath '/', <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>), 0: (false, '/', <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>), 0: (false, '/', <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>), 0: (false, '/', <'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>)})
Serialized GVariant: [-2147483648,"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",[[[]]],[0,[false,[-32768,[0,[-2147483648,0,{"/":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"}],"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"],"/"]],-9223372036854775808,[1.7976931348623157e+308,0]],{"0":[false,"/","AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"]}]

$ echo '[-2147483648,"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",[[[]]],[0,[false,[-32768,[0,[-2147483648,0,{"/":"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"}],"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"],"/"]],-9223372036854775808,[1.7976931348623157e+308,0]],{"0":[false,"/","AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"]}]' | jq .
[
  -2147483648,
  "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
  [
    [
      []
    ]
  ],
  [
    0,
    [
      false,
      [
        -32768,
        [
          0,
          [
            -2147483648,
            0,
            {
              "/": "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
            }
          ],
          "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
        ],
        "/"
      ]
    ],
    -9223372036854775808,
    [
      1.7976931348623157E+308,
      0
    ]
  ],
  {
    "0": [
      false,
      "/",
      "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
    ]
  }
]

That should, hopefully, be compatible with the format produced by busctl as well.

@mrc0mmand
Member

Also, would it make sense to log only unsuccessful cases? Something like what libFuzzer/AFL do, i.e. log only crashes/timeouts, one such case per file, so they can then be used as 'reproducers' later. Or do we want to log everything into one file, with each entry marked by the type of failure (timeout, crash, ...)?
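
To make the second option more concrete, here is a small Python sketch of a single log with one JSON record per line and a result field. Everything in it is hypothetical (the field names, the result values "ok"/"timeout"/"crash", the helper names); it only shows that with such a format, pulling out the reproducer candidates afterwards is a simple filter.

```python
import io
import json


def log_call(stream, method, signature, args, result):
    """Append one call as a single JSON line; `result` is e.g. "ok", "timeout" or "crash"."""
    record = {"method": method, "signature": signature, "args": args, "result": result}
    stream.write(json.dumps(record) + "\n")


def failures(stream):
    """Yield only the records that did not succeed, i.e. the reproducer candidates."""
    for line in stream:
        record = json.loads(line)
        if record["result"] != "ok":
            yield record


log = io.StringIO()  # stands in for the log file
log_call(log, "ResolveHostname", "isit", [0, "google.com", 0, 0], "ok")
log_call(log, "ResolveHostname", "isit", [0, 'a;b\n"c', 0, 0], "timeout")
log.seek(0)

for record in failures(log):
    print(record["result"], record["args"])
```

The one-case-per-file variant could be derived from the same records by writing each yielded record to its own file.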
