Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[eem] _count api #203605

Merged
merged 20 commits into from
Dec 12, 2024
Merged

[eem] _count api #203605

merged 20 commits into from
Dec 12, 2024

Conversation

klacabane
Copy link
Contributor

@klacabane klacabane commented Dec 10, 2024

implements countEntities API

the query to count a single-source definition is straightforward but gets tricky when sources > 1 because we have to resolve entity ids to avoid counting duplicates. I've reused the entity.id/source eval logic implemented here https://github.com/elastic/elastic-entity-model/issues/202#issuecomment-2500608664

@klacabane klacabane self-assigned this Dec 10, 2024
@klacabane klacabane added release_note:skip Skip the PR/issue when compiling release notes backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) Team:obs-entities Observability Entities Team labels Dec 10, 2024
@klacabane klacabane marked this pull request as ready for review December 10, 2024 15:39
@klacabane klacabane requested a review from a team as a code owner December 10, 2024 15:39
@elasticmachine
Copy link
Contributor

Pinging @elastic/obs-entities (Team:obs-entities)

@elastic-vault-github-plugin-prod elastic-vault-github-plugin-prod bot requested a review from a team as a code owner December 10, 2024 15:52
@@ -0,0 +1,17 @@
/*
Copy link
Member

@pheyos pheyos Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider moving these helpers to a test service (e.g. like this one)? Then then you could use the es and supertest services directly and would not have to pass the client instances around.

Copy link
Contributor Author

@klacabane klacabane Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not and that's a great suggestion, I'll likely address that in a follow up to include few utilities/helpers spread in our tests

Copy link
Member

@pheyos pheyos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x-pack/test/tsconfig.json changes LGTM

@miltonhultgren
Copy link
Contributor

I've reused the entity.id/source eval logic implemented here https://github.com/elastic/elastic-entity-model/issues/202#issuecomment-2500608664

I know we're not caring about performance too much right now, but how does this perform for a smaller query with no other aggs?

Comment on lines +45 to +49
sourceFilters.push(
source.index_patterns
.map((pattern) => `_index: "${pattern}*" OR _index: ".ds-${last(pattern.split(':'))}*"`)
.join(' OR ')
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

Copy link
Contributor Author

@klacabane klacabane Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

contrived example but take these two sources sharing the same type

  • { patterns: "foo*", identity: "service.name", filters: "service.name: foo" }
  • { patterns: "bar*", identity: "service.name", filters: "service.name: bar" }

now say foo* is empty and bar* contains { service.name: "foo" } and { service.name: "bar" }. without the index filter we'd count a total of 2 entities when according to the sources it should be 1

x-pack/test/api_integration/apis/entity_manager/count.ts Outdated Show resolved Hide resolved
@klacabane
Copy link
Contributor Author

klacabane commented Dec 11, 2024

I know we're not caring about performance too much right now, but how does this perform for a smaller query with no other aggs?

not good. I've tried against the rally datasets (one source set against 5m documents, the other on 1m documents) both datasets report the same 100 entities and it takes ~8sec, for one source set on 1m documents the other one on 100k we get ~2s. It's not too concerning for now since all our definitions are single source but definitely something to improve. I'll include the queries in the rally track to have a better view

Copy link
Contributor

@rmyz rmyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for applying the suggestions we talked about 🚀

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

cc @klacabane

@klacabane klacabane merged commit 0350618 into elastic:main Dec 12, 2024
8 checks passed
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/12298399966

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Dec 12, 2024
implements `countEntities` API

the query to count a single-source definition is straightforward but
gets tricky when sources > 1 because we have to resolve entity ids to
avoid counting duplicates. I've reused the entity.id/source eval logic
implemented here
elastic/elastic-entity-model#202 (comment)

---------

Co-authored-by: kibanamachine <[email protected]>
(cherry picked from commit 0350618)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Dec 12, 2024
# Backport

This will backport the following commits from `main` to `8.x`:
- [[eem] _count api
(#203605)](#203605)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Kevin
Lacabane","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-12-12T14:25:20Z","message":"[eem]
_count api (#203605)\n\nimplements `countEntities` API\r\n\r\nthe query
to count a single-source definition is straightforward but\r\ngets
tricky when sources > 1 because we have to resolve entity ids
to\r\navoid counting duplicates. I've reused the entity.id/source eval
logic\r\nimplemented
here\r\nhttps://github.com/elastic/elastic-entity-model/issues/202#issuecomment-2500608664\r\n\r\n---------\r\n\r\nCo-authored-by:
kibanamachine
<[email protected]>","sha":"03506185f9acac329fdfec13540c02a3d82c1636","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","v9.0.0","backport:prev-minor","Team:obs-entities"],"title":"[eem]
_count
api","number":203605,"url":"https://github.com/elastic/kibana/pull/203605","mergeCommit":{"message":"[eem]
_count api (#203605)\n\nimplements `countEntities` API\r\n\r\nthe query
to count a single-source definition is straightforward but\r\ngets
tricky when sources > 1 because we have to resolve entity ids
to\r\navoid counting duplicates. I've reused the entity.id/source eval
logic\r\nimplemented
here\r\nhttps://github.com/elastic/elastic-entity-model/issues/202#issuecomment-2500608664\r\n\r\n---------\r\n\r\nCo-authored-by:
kibanamachine
<[email protected]>","sha":"03506185f9acac329fdfec13540c02a3d82c1636"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/203605","number":203605,"mergeCommit":{"message":"[eem]
_count api (#203605)\n\nimplements `countEntities` API\r\n\r\nthe query
to count a single-source definition is straightforward but\r\ngets
tricky when sources > 1 because we have to resolve entity ids
to\r\navoid counting duplicates. I've reused the entity.id/source eval
logic\r\nimplemented
here\r\nhttps://github.com/elastic/elastic-entity-model/issues/202#issuecomment-2500608664\r\n\r\n---------\r\n\r\nCo-authored-by:
kibanamachine
<[email protected]>","sha":"03506185f9acac329fdfec13540c02a3d82c1636"}}]}]
BACKPORT-->

Co-authored-by: Kevin Lacabane <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) release_note:skip Skip the PR/issue when compiling release notes Team:obs-entities Observability Entities Team v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants