[eem] _count api #203605
Pinging @elastic/obs-entities (Team:obs-entities)
Did you consider moving these helpers to a test service (e.g. like this one)? Then you could use the `es` and `supertest` services directly and would not have to pass the client instances around.
I did not, and that's a great suggestion. I'll likely address it in a follow-up that also covers the few utilities/helpers spread across our tests.
`x-pack/test/tsconfig.json` changes LGTM
I know we don't care too much about performance right now, but how does this perform for a smaller query with no other aggs?
x-pack/platform/plugins/shared/entity_manager/server/routes/v2/count.ts
x-pack/platform/plugins/shared/entity_manager/server/lib/v2/queries/entity_count.ts
sourceFilters.push(
  source.index_patterns
    .map((pattern) => `_index: "${pattern}*" OR _index: ".ds-${last(pattern.split(':'))}*"`)
    .join(' OR ')
);
Why is this needed?
Contrived example, but take these two sources sharing the same type:
{ patterns: "foo*", identity: "service.name", filters: "service.name: foo" }
{ patterns: "bar*", identity: "service.name", filters: "service.name: bar" }
Now say foo* is empty and bar* contains { service.name: "foo" } and { service.name: "bar" }. Without the index filter we'd count a total of 2 entities, when according to the sources it should be 1: only the "bar" document in bar* satisfies both the pattern and the filter of one source.
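The double count described above can be reproduced with a toy in-memory model. This is only a sketch of the reasoning, not the query from the PR: the `Doc` and `Source` shapes, the `bar-1` index name, and the `countEntities` helper here are hypothetical, and prefix matching stands in for the real index pattern filter.

```typescript
// Toy model of the scenario; all names below are illustrative assumptions.
interface Doc {
  index: string;
  serviceName: string;
}

interface Source {
  indexPrefix: string; // stands in for the source's index pattern, e.g. "foo*"
  serviceName: string; // stands in for the filter "service.name: <value>"
}

const sources: Source[] = [
  { indexPrefix: 'foo', serviceName: 'foo' },
  { indexPrefix: 'bar', serviceName: 'bar' },
];

// foo* is empty; bar* holds both documents.
const docs: Doc[] = [
  { index: 'bar-1', serviceName: 'foo' },
  { index: 'bar-1', serviceName: 'bar' },
];

// Counts distinct identities among documents that satisfy at least one source.
// When withIndexFilter is false, a source's filter is applied to every index.
function countEntities(withIndexFilter: boolean): number {
  const ids = new Set<string>();
  for (const doc of docs) {
    const matches = sources.some(
      (source) =>
        doc.serviceName === source.serviceName &&
        (!withIndexFilter || doc.index.startsWith(source.indexPrefix))
    );
    if (matches) ids.add(doc.serviceName);
  }
  return ids.size;
}

console.log(countEntities(false)); // 2: the "foo" document in bar* leaks in
console.log(countEntities(true)); // 1: only the "bar" document in bar* counts
```

Tying each source's filter to its own indices is what keeps the `service.name: foo` document stored under `bar*` out of the count.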
Not good. I've tried it against the rally datasets: with one source set against 5m documents and the other against 1m documents, both datasets reporting the same 100 entities, it takes ~8s; with one source set against 1m documents and the other against 100k, we get ~2s. It's not too concerning for now since all our definitions are single-source, but it's definitely something to improve. I'll include the queries in the rally track to get a better view.
LGTM! Thanks for applying the suggestions we talked about 🚀
💚 Build Succeeded
cc @klacabane
Starting backport for target branches: 8.x
implements `countEntities` API

the query to count a single-source definition is straightforward but gets tricky when sources > 1 because we have to resolve entity ids to avoid counting duplicates. I've reused the entity.id/source eval logic implemented here elastic/elastic-entity-model#202 (comment)

---------

Co-authored-by: kibanamachine <[email protected]>

(cherry picked from commit 0350618)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI.
Questions? Please refer to the Backport tool documentation
# Backport

This will backport the following commits from `main` to `8.x`:
- [[eem] _count api (#203605)](#203605)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Kevin Lacabane","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-12-12T14:25:20Z","message":"[eem] _count api (#203605)\n\nimplements `countEntities` API\r\n\r\nthe query to count a single-source definition is straightforward but\r\ngets tricky when sources > 1 because we have to resolve entity ids to\r\navoid counting duplicates. I've reused the entity.id/source eval logic\r\nimplemented here\r\nhttps://github.com/elastic/elastic-entity-model/issues/202#issuecomment-2500608664\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <[email protected]>","sha":"03506185f9acac329fdfec13540c02a3d82c1636","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","backport:prev-minor","Team:obs-entities"],"title":"[eem] _count api","number":203605,"url":"https://github.com/elastic/kibana/pull/203605","mergeCommit":{"message":"[eem] _count api (#203605)\n\nimplements `countEntities` API\r\n\r\nthe query to count a single-source definition is straightforward but\r\ngets tricky when sources > 1 because we have to resolve entity ids to\r\navoid counting duplicates. I've reused the entity.id/source eval logic\r\nimplemented here\r\nhttps://github.com/elastic/elastic-entity-model/issues/202#issuecomment-2500608664\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <[email protected]>","sha":"03506185f9acac329fdfec13540c02a3d82c1636"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/203605","number":203605,"mergeCommit":{"message":"[eem] _count api (#203605)\n\nimplements `countEntities` API\r\n\r\nthe query to count a single-source definition is straightforward but\r\ngets tricky when sources > 1 because we have to resolve entity ids to\r\navoid counting duplicates. I've reused the entity.id/source eval logic\r\nimplemented here\r\nhttps://github.com/elastic/elastic-entity-model/issues/202#issuecomment-2500608664\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine <[email protected]>","sha":"03506185f9acac329fdfec13540c02a3d82c1636"}}]}] BACKPORT-->

Co-authored-by: Kevin Lacabane <[email protected]>
implements `countEntities` API

the query to count a single-source definition is straightforward but gets tricky when sources > 1 because we have to resolve entity ids to avoid counting duplicates. I've reused the entity.id/source eval logic implemented here https://github.com/elastic/elastic-entity-model/issues/202#issuecomment-2500608664
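To illustrate why the multi-source case needs entity ids resolved before counting, here is a minimal in-memory sketch. It is not the query used by the PR: the `SourceResult` shape, the source names, and the entity ids are hypothetical, and it assumes each source has already been reduced to the list of entity ids it reports.

```typescript
// Hypothetical shape: the entity ids one source reports after id resolution.
interface SourceResult {
  source: string;
  entityIds: string[];
}

// Naive approach: add up per-source counts. Entities seen by several
// sources are counted once per source.
function naiveCount(results: SourceResult[]): number {
  return results.reduce((total, result) => total + result.entityIds.length, 0);
}

// Deduplicated approach: count the union of resolved entity ids.
function distinctCount(results: SourceResult[]): number {
  const seen = new Set<string>();
  for (const result of results) {
    for (const id of result.entityIds) seen.add(id);
  }
  return seen.size;
}

const results: SourceResult[] = [
  { source: 'logs', entityIds: ['service-a', 'service-b'] },
  { source: 'metrics', entityIds: ['service-b', 'service-c'] },
];

console.log(naiveCount(results)); // 4: service-b is counted twice
console.log(distinctCount(results)); // 3
```

With a single source the two counts coincide, which is why that case is the straightforward one.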