8. The order of different SequenceModels is unpredictable #10
Comments
Could you please provide an example? I assume the issue is about SequenceModelElements that follow a FirstMatchModelElement. The order should always be deterministic, so running the parsergenerator multiple times should always produce the same parser. The sorting should be based on the lexicographic ordering of the elements of the models inside the SequenceModelElements.
Maybe the comment was not exactly right. SequenceModels are always the same for the same set and the same ordering of the set. However, the ordering of the SequenceModels places the first-seen elements first. It does not consider the frequency of occurrences. Rare sequences should be placed last to get better performance. This is an advanced problem, and implementing it should be considered, since the set of logs is fixed and limited. Decisions can be based on the whole list of logs instead of FIFO. Test 3 must set a fixed ordering of the first elements to produce consistent models.
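To make the frequency idea concrete, here is a minimal sketch of ordering branches by how often they occur in a fixed set of logs. It is not code from the parsergenerator: the function name `order_by_frequency` and the sample data are hypothetical, and each branch is represented only by the string of its first element. Ties are broken by reverse lexicographic order so the result stays deterministic.

```python
from collections import Counter


def order_by_frequency(first_elements):
    """Return the distinct first elements, most frequent first.

    Hypothetical helper for illustration only. Ties are broken by
    reverse lexicographic order, so the result does not depend on
    the order in which the log lines were seen.
    """
    counts = Counter(first_elements)
    # Sort reverse-lexicographically first, then rely on the stable
    # sort to keep that order among entries with equal counts.
    tie_broken = sorted(counts, reverse=True)
    return sorted(tie_broken, key=lambda element: -counts[element])


# Hypothetical first elements taken from a fixed list of log lines.
observed = ["b", "a", "b", "c", "b", "a"]
print(order_by_frequency(observed))
```

Because the whole list of logs is counted before ordering, the result is the same regardless of which line happened to be seen first.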
This should not be the case. SequenceModels are always ordered in reverse lexicographic order. Example: Input (note that I disabled the aggregation of fixed elements when creating the SequenceModels): Parser: The parser shows that the SequenceModels are ordered by the content of the first element: aaa-aa-a. This is necessary to ensure that the AMiner attempts to enter more specific paths first. I agree that it would be good for performance to have the most frequent paths first, but this could cause the AMiner to enter incorrect paths and lead to unparsed logs. Let us leave this issue open to find a solution that combines the advantages of both strategies in the future.
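The effect of the aaa-aa-a ordering can be sketched in a few lines. This is an illustration under simplifying assumptions, not the AMiner's matching logic: branches are plain prefix strings, and the hypothetical `first_match` helper returns the first branch whose prefix matches, loosely mimicking how a first-match element picks a path.

```python
def order_branches(first_elements):
    """Sort branch prefixes in reverse lexicographic order.

    A string sorts after each of its own prefixes, so reversing the
    order puts the longer, more specific prefix first.
    """
    return sorted(first_elements, reverse=True)


def first_match(ordered_branches, line):
    """Return the first branch prefix that matches the line, or None.

    Hypothetical stand-in for first-match behaviour: the earliest
    matching branch wins, so branch order decides which path is taken.
    """
    for prefix in ordered_branches:
        if line.startswith(prefix):
            return prefix
    return None


branches = ["a", "aaa", "aa"]
ordered = order_branches(branches)
print(ordered)                                      # ['aaa', 'aa', 'a']
print(first_match(ordered, "aaa rest"))             # aaa
print(first_match(["a", "aa", "aaa"], "aaa rest"))  # a
```

With the ascending order in the last call, the general branch `a` captures a line that belongs to `aaa`, which is the kind of incorrect path the comment above warns about.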
I have added another unit test for the reverse lexicographic ordering. Given this rule, the other generated models are probably working fine as well. This should also be tested with subtrees. Please leave this issue open for testing the subtrees.
The order of different SequenceModels is unpredictable. There is potential to create a better model by queueing the more frequent SequenceModels first.