Investigate ways to reduce the analysis message matcher overhead #135

Open
1 of 3 tasks
trink opened this issue Nov 17, 2017 · 6 comments

Comments

@trink
Contributor

trink commented Nov 17, 2017

Some Approaches to Test

  • In-line all the matchers before performing any analysis
    • Result: added complexity and required lua_sandbox API changes without demonstrating a general benefit for most of the current use cases.
  • Hash router: analyze the matchers and create a hash table lookup for all matchers keying off a particular header/field, e.g. Logger ==
  • Tree router: analyze the matchers and create a hierarchy of matchers so entire groups of matchers can be eliminated by a single match
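
To make the hash router idea concrete, here is a minimal Python sketch. It is not the lua_sandbox implementation; the `HashRouter` class, the tuple layout, and the example matcher names are all hypothetical. It assumes each matcher either requires an exact `Logger` value or has no such condition at all.

```python
from collections import defaultdict


class HashRouter:
    """Hypothetical sketch: index matchers by the Logger value they require."""

    def __init__(self, matchers):
        # matchers: iterable of (name, required_logger_or_None, predicate)
        self.by_logger = defaultdict(list)
        self.unkeyed = []  # matchers without a Logger == '...' condition
        for name, logger, predicate in matchers:
            if logger is None:
                self.unkeyed.append((name, predicate))
            else:
                self.by_logger[logger].append((name, predicate))

    def route(self, msg):
        # One hash lookup selects the candidate matchers; everything keyed
        # on a different Logger value is never evaluated.
        candidates = self.by_logger.get(msg.get("Logger"), []) + self.unkeyed
        return [name for name, predicate in candidates if predicate(msg)]


# Illustrative matchers only; the names and fields are assumptions.
matchers = [
    ("nginx_errors", "nginx", lambda m: m.get("Severity", 7) <= 3),
    ("nginx_all", "nginx", lambda m: True),
    ("syslog_auth", "syslog", lambda m: m.get("Type") == "auth"),
    ("catch_all", None, lambda m: True),
]
router = HashRouter(matchers)
matched = router.route({"Logger": "nginx", "Severity": 2})
```

Matchers without a keyable condition still have to be evaluated for every message, so the benefit depends on how many matchers share the keyed header.
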
@trink
Contributor Author

trink commented Dec 6, 2017

This is a work in progress; experimentation will continue as the schedule allows.

@cvuillemez

+1 :)
I have very long message matchers which are bottlenecks.

@trink
Contributor Author

trink commented May 2, 2018

There may be some things you can do to optimize specific matchers (order of expressions and types of comparisons). If you can share some problem matchers I will take a look.

The goal of the remaining items above is to handle many matchers faster (by clustering), so the large matchers would have to share some conditional expressions that could fail the entire set fast (i.e., if they are relatively unique, this experimentation will not help).
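
As a rough illustration of clustering (a Python sketch, not the actual matcher code), matchers can be grouped by a shared first condition so that one failed test skips the whole group. The `(field, value)` condition format and all names below are assumptions made for this example, and every matcher is assumed to have at least one condition.

```python
from collections import defaultdict


def cluster_by_shared_condition(matchers):
    """Group (name, conditions) pairs by their first (field, value) condition."""
    groups = defaultdict(list)
    for name, conditions in matchers:
        groups[conditions[0]].append((name, conditions[1:]))
    return groups


def holds(condition, msg):
    field, expected = condition
    return msg.get(field) == expected


def match_clustered(groups, msg):
    """Return matching names and the number of conditions evaluated."""
    matched, evaluations = [], 0
    for shared, members in groups.items():
        evaluations += 1
        if not holds(shared, msg):
            continue  # a single failed test eliminates the whole group
        for name, rest in members:
            ok = True
            for cond in rest:
                evaluations += 1
                if not holds(cond, msg):
                    ok = False
                    break
            if ok:
                matched.append(name)
    return matched, evaluations


# Illustrative matchers only.
matchers = [
    ("a", [("Logger", "nginx"), ("Type", "access")]),
    ("b", [("Logger", "nginx"), ("Type", "error")]),
    ("c", [("Logger", "syslog"), ("Type", "auth")]),
]
groups = cluster_by_shared_condition(matchers)
names, cost = match_clustered(groups, {"Logger": "syslog", "Type": "auth"})
```

If every matcher has a different first condition, each group has one member and nothing is saved, which is the "relatively unique" case described above.
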

@cvuillemez

As I understand it, this optimization only applies to analysis plugins (with all threads sharing the same message_matcher)?
In my case I have an output plugin with a long message matcher string:
message_matcher = "(Type =~ '/AAAAA$' || Type =~ '/BBBBB$' || [ ... ] )"
For now I solved the bottleneck by splitting it into multiple plugin instances.
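
One way to sketch that split is a small Python helper that distributes the OR-ed sub-expressions over several matcher strings, one per plugin instance. The helper name `split_matcher` is hypothetical, and the `CCCCC` and `DDDDD` patterns are placeholders invented for the example; they are not taken from the real matcher.

```python
def split_matcher(alternatives, instances):
    """Distribute OR-ed sub-expressions round-robin over matcher strings."""
    if instances < 1:
        raise ValueError("instances must be at least 1")
    buckets = [alternatives[i::instances] for i in range(instances)]
    # Skip empty buckets when there are fewer alternatives than instances.
    return ["(" + " || ".join(bucket) + ")" for bucket in buckets if bucket]


# The first two expressions come from the comment above; the rest are placeholders.
alternatives = [
    "Type =~ '/AAAAA$'",
    "Type =~ '/BBBBB$'",
    "Type =~ '/CCCCC$'",
    "Type =~ '/DDDDD$'",
]
parts = split_matcher(alternatives, 2)
```

Each resulting string would go into the `message_matcher` setting of one plugin instance, so every instance evaluates only its share of the expressions.
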

@trink
Contributor Author

trink commented May 3, 2018

Yeah mozilla-services/lua_sandbox#213 is about all I can squeeze out of a single matcher.

mozilla-services/lua_sandbox#208 may be relevant if the string at the end is unique so you don't need to actually anchor it. Type =~ '/unique' is multiple times faster than Type =~ '/unique$'
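
Dropping the anchor is only safe when the suffix never appears earlier in a value. A quick way to check that against a sample of real Type values could look like the sketch below. It uses plain Python string operations as a stand-in for the matcher's pattern engine, and the function name and sample values are hypothetical.

```python
def anchor_is_redundant(suffix, sample_values):
    """True if 'contains suffix' and 'ends with suffix' agree on every sample."""
    return all((suffix in value) == value.endswith(suffix)
               for value in sample_values)


# Illustrative Type values only.
safe_samples = ["app/unique", "svc/unique", "app/other"]
unsafe_samples = ["app/unique", "app/unique/extra"]

safe = anchor_is_redundant("/unique", safe_samples)
unsafe = anchor_is_redundant("/unique", unsafe_samples)
```

If the check fails on representative data, the unanchored form would match extra messages, so the anchored version should stay.
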

@cvuillemez

Yeah, I tested and it's faster without the trailing "$".
That's amazing; one would usually expect the "$" version to be faster, because only the end of the string has to be checked, not the whole string!
