Endpoint data volume reduction mechanisms #5771

Open
1 of 3 tasks
ferullo opened this issue Sep 3, 2024 · 15 comments
Assignees
Labels
blocked An issue that's currently blocked because it’s pending info or action from stakeholders. Docset: ESS Issues that apply to docs in the Stack release Docset: Serverless Issues for Serverless Security documentation Improvements or additions to documentation Effort: Large Issues that require significant planning, research, writing, and testing Feature: Elastic Defend Team: EDR Workflows Formerly Defend Workflows, Onboarding and Lifecycle Management Team: Endpoint Endpoint related issues v8.15.0 v8.16.0 v8.17.0

Comments

@ferullo
Collaborator

ferullo commented Sep 3, 2024

We've recently been reducing Endpoint's data volume, and four approaches are worth documenting so users understand how to turn them off (i.e. restore the old behavior) if they want. Can this please be documented on a new page, using the info below?

We need to link to this documentation from Kibana 8.16, so the documentation needs to be finished before 8.16 ships. We also need to confirm that all Endpoint work is completed in 8.16 and, if not, update the documentation.

1. Deduplicate network events.
Starting in 8.15, when repeated network connections are detected from the same process, Endpoint will not produce network events for the subsequent connections. There are two advanced options to disable this and restore the behavior of 8.14 and earlier.

[linux|mac|windows].advanced.events.deduplicate_network_events: This will completely disable deduplication.
[linux|mac|windows].advanced.events.deduplicate_network_events_below_bytes: This will enable deduplication for connections below X bytes but disable it for connections above X bytes. (In other words, suppress repeated connections for small data transfers but always emit events for large transfers.)
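The interaction of the two options can be illustrated with a toy model. This is only a sketch, not Endpoint's implementation: the `Connection` type, the `(process_id, destination)` repeat key, the function name, and the choice to treat a transfer of exactly X bytes as "large" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    process_id: str         # hypothetical identifier for the originating process
    destination: str        # hypothetical "ip:port" string
    bytes_transferred: int


def filter_network_events(connections, deduplicate=True, below_bytes=None):
    """Return the connections that would still produce events in this toy model."""
    seen = set()
    emitted = []
    for conn in connections:
        key = (conn.process_id, conn.destination)  # assumed notion of "repeated"
        is_repeat = key in seen
        seen.add(key)
        # With a byte threshold set, large transfers are always reported.
        is_large = below_bytes is not None and conn.bytes_transferred >= below_bytes
        if deduplicate and is_repeat and not is_large:
            continue  # suppress the repeated small connection
        emitted.append(conn)
    return emitted


events = [
    Connection("proc-1", "10.0.0.5:443", 200),
    Connection("proc-1", "10.0.0.5:443", 300),
    Connection("proc-1", "10.0.0.5:443", 50_000),
    Connection("proc-2", "10.0.0.5:443", 200),
]
```

In this model, calling `filter_network_events(events)` keeps only the first connection per process, passing `below_bytes=1000` additionally keeps the 50,000-byte transfer, and `deduplicate=False` keeps everything.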

2. Minimize host.* fieldset in event documents
Starting in 8.16, Endpoint will only include a small subset of the data in the host.* fieldset in event documents. Full host.* information will still be included in documents written to the metrics-* index pattern and in Endpoint alerts.

Users should take note of how a lack of some host.* information may affect their event filters.

@brian-mckinney can you comment on this issue with the advanced option name to turn this off and restore the 8.15 and earlier behavior?

3. Merge process and network events
Starting in 8.16, Endpoint will merge process create/terminate (Windows) and fork/exec/end (macOS/Linux) events when possible. Effectively, for short-lived processes only a single event will be emitted, containing the process details from when the process terminated.

Starting in 8.16, Endpoint will merge network connection/termination events (Windows/macOS/Linux) when possible for short-lived connections.

Users should take note of how this merging might affect their event filters. Notably, for merged events, event.action will be an array containing all actions merged into the single event (e.g. event.action=[fork, exec, end]). For instance, if a user has an event filter to drop all fork events (event.action : fork), the filter will need to be modified or it'll also drop all merged events.
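A small sketch of why a filter on a single action also catches merged events. The matching rule here (a filter value matches an array field when any element equals it) is an assumption consistent with the behavior described above, and both helper names are hypothetical.

```python
def action_matches(event, value):
    """True if event.action equals value or, for an array, contains it."""
    action = event.get("event", {}).get("action")
    if isinstance(action, list):
        return value in action
    return action == value


def is_only_action(event, value):
    """True only when value is the single action recorded on the event."""
    action = event.get("event", {}).get("action")
    return action == value or action == [value]


fork_event = {"event": {"action": "fork"}}
merged_event = {"event": {"action": ["fork", "exec", "end"]}}

# A filter equivalent to "event.action : fork" matches both events,
# so it would drop the merged event as well.
matches_both = action_matches(fork_event, "fork") and action_matches(merged_event, "fork")

# A stricter check leaves merged events alone.
keeps_merged = not is_only_action(merged_event, "fork")
```

How the stricter condition is best expressed in an actual event filter depends on the operators the filter UI offers, which this sketch does not cover.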

@nicholasberlin can you comment on this issue with the advanced options needed to turn this off and restore the 8.15 and earlier behavior?

4. Not report MD5 and SHA1 hashes by default
As outlined in this Kibana PR, elastic/kibana#193912 (it's been closed, but only because it'll be merged via a different PR), Endpoint will stop reporting MD5 and SHA1 hashes by default. These will still be reported if any Trusted Apps, Blocklist, Event Filters, or Alert Exceptions require them. In addition to lowering data volume, this will reduce Endpoint's CPU usage.

The advanced options to restore the old behavior are described in the aforementioned PR.
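The "still reported if required" rule can be sketched as follows. This is a hypothetical model only: it assumes SHA256 continues to be reported by default, and that an artifact "requires" a hash when one of its entries references a field ending in `md5` or `sha1` (for example the ECS fields `file.hash.md5` or `process.hash.sha1`). The function and constant names are invented for illustration, and the real advanced option names are in the PR, not here.

```python
DEFAULT_HASHES = {"sha256"}        # assumption: still reported by default
OPTIONAL_HASHES = ("md5", "sha1")  # reported only when something needs them


def required_hashes(artifact_fields):
    """Return the hash algorithms needed for the given artifact entry fields."""
    needed = set(DEFAULT_HASHES)
    for field in artifact_fields:
        suffix = field.rsplit(".", 1)[-1].lower()
        if suffix in OPTIONAL_HASHES:
            needed.add(suffix)
    return needed


no_hash_rules = required_hashes(["process.name", "process.executable"])
md5_rule = required_hashes(["file.hash.md5", "process.executable"])
```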

cc @nfritts @caitlinbetz @dasansol92 @joe-desimone @gabriellandau @intxgo


Tasks

@ferullo ferullo added documentation Improvements or additions to documentation Feature: Elastic Defend Team: EDR Workflows Formerly Defend Workflows, Onboarding and Lifecycle Management Team: Endpoint Endpoint related issues labels Sep 3, 2024
@joepeeples joepeeples self-assigned this Sep 3, 2024
@nastasha-solomon
Contributor

Leaving this note as a reminder that the Endpoint functionality that deduplicates network events might need to be added to the 8.15 release notes. I can take care of release-noting the other features mentioned in the description for 8.16, if needed.
cc: @joepeeples @natasha-moore-elastic

@nicpenning

👀

@joepeeples joepeeples changed the title from "Document Endpoint data volume reduction mechanisms" to "Endpoint data volume reduction mechanisms" Sep 11, 2024
@joepeeples
Contributor

This'll become available in serverless whenever it's done, so we'll need to publish content in the ESS docs since the doc link in serverless will point to that. Consider adding a tag "Available in serverless, coming to 8.16" etc.

@nicholasberlin
Contributor

nicholasberlin commented Sep 17, 2024

Setting the following to false will restore 8.15 behavior, or in other words, disable aggregation.

  • linux.advanced.events.aggregate_process
  • mac.advanced.events.aggregate_process
  • windows.advanced.events.aggregate_process
  • linux.advanced.events.aggregate_network
  • mac.advanced.events.aggregate_network
  • windows.advanced.events.aggregate_network
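For convenience, the six settings above can be generated rather than typed out. This is just a sketch: the function name is hypothetical, and how the resulting key/value pairs are applied to a policy is not shown.

```python
OS_PREFIXES = ("linux", "mac", "windows")
EVENT_KINDS = ("process", "network")


def aggregation_overrides(enabled=False):
    """Build the advanced-setting keys listed above, all set to the same value."""
    return {
        f"{os_name}.advanced.events.aggregate_{kind}": enabled
        for kind in EVENT_KINDS
        for os_name in OS_PREFIXES
    }


overrides = aggregation_overrides()
for key, value in overrides.items():
    # Print in a "key: false" form for copying; the format is illustrative.
    print(f"{key}: {str(value).lower()}")
```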

@joepeeples joepeeples added the Effort: Medium Issues that take moderate but not substantial time to complete label Sep 20, 2024
@ferullo
Collaborator Author

ferullo commented Sep 26, 2024

FYI @joepeeples @gabriellandau I added a section Not report MD5 and SHA1 hashes by default to this issue's description.

@joepeeples joepeeples added Effort: Large Issues that require significant planning, research, writing, and testing Docset: Serverless Issues for Serverless Security Docset: ESS Issues that apply to docs in the Stack release and removed Effort: Medium Issues that take moderate but not substantial time to complete labels Sep 30, 2024
@joepeeples
Contributor

I’ve been drafting docs for these settings, and still need a bit more info on the “Minimize host.* fieldset” section:

  • Name of advanced setting
  • “Users should take note of how a lack of some host.* information may affect their event filters” — How exactly does this affect event filters, and what exactly should users take note of?

@brian-mckinney

Hi 👋 I just merged https://github.com/elastic/endpoint-dev/pull/14677, which reduces the default amount of information in the host.* field for events.

Events will have the host field reduced to host.id, host.name, and host.os.type.

Example of event host field prior to this PR

        "host": {
            "architecture": "x86_64",
            "hostname": "ubuntu-server",
            "id": "dabadaba-0000-0000-0000-000000000000",
            "ip": [
                "172.18.0.1",
                "172.19.0.1",
                "172.20.0.1",
                "172.22.0.1",
                "127.0.0.1",
                "::1",
                "192.168.224.128",
                "fe80::20c:29ff:fe27:b344",
                "192.168.177.100",
                "fe80::20c:29ff:fe27:b34e",
                "172.17.0.1",
                "fe80::42:cff:fef2:a076",
                "172.21.0.1",
                "fe80::42:27ff:fe5a:9a6b",
                "fe80::2899:a2ff:fef2:4a8f"
            ],
            "mac": [
                "02-42-df-b5-dc-3b",
                "02-42-29-a0-cc-05",
                "02-42-1e-17-81-7c",
                "02-42-a7-74-83-0a",
                "00-0c-29-27-b3-44",
                "00-0c-29-27-b3-4e",
                "02-42-0c-f2-a0-76",
                "02-42-27-5a-9a-6b",
                "2a-99-a2-f2-4a-8f"
            ],
            "name": "ubuntu-server",
            "os": {
                "Ext": {
                    "variant": "Ubuntu"
                },
                "family": "ubuntu",
                "full": "Ubuntu 22.04.3",
                "kernel": "5.15.0-122-generic #132-Ubuntu SMP Thu Aug 29 13:45:52 UTC 2024",
                "name": "Linux",
                "platform": "ubuntu",
                "type": "linux",
                "version": "22.04.3"
            }
        }

Example of same event host field post this PR

        "host": {
            "id": "dabadaba-0000-0000-0000-000000000000",
            "name": "ubuntu-server",
            "os": {
                "type": "linux"
            }
        }

There is an advanced configuration field that can be set, which will keep the original extended host.* field.
[linux|mac|windows].advanced.set_extended_host_information
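The reduction, and its effect on anything that reads the dropped fields, can be sketched like this. The helper names are hypothetical and the input is a trimmed version of the example above; this is not the Endpoint code, only an illustration of why a filter on a field such as host.os.family would stop matching reduced events.

```python
def reduce_host(host):
    """Keep only host.id, host.name, and host.os.type."""
    return {
        "id": host.get("id"),
        "name": host.get("name"),
        "os": {"type": host.get("os", {}).get("type")},
    }


def get_field(doc, dotted_path):
    """Look up a dotted field path such as 'host.os.family'; None if absent."""
    current = doc
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


full_event = {
    "host": {
        "architecture": "x86_64",
        "hostname": "ubuntu-server",
        "id": "dabadaba-0000-0000-0000-000000000000",
        "name": "ubuntu-server",
        "os": {"family": "ubuntu", "name": "Linux", "type": "linux"},
    }
}
reduced_event = {"host": reduce_host(full_event["host"])}

# A condition on host.os.family has a value to match before the reduction
# and nothing to match afterwards; host.os.type is still available.
family_before = get_field(full_event, "host.os.family")
family_after = get_field(reduced_event, "host.os.family")
```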

@ferullo
Collaborator Author

ferullo commented Oct 3, 2024

@joepeeples since this documentation will be linked to from Kibana, is there a plan in place for making sure that link will take users to something that clearly explains the change? What I mean is, as we evolve the documentation we'll want to keep it clear what changed in 8.16 even if users are coming from a newer stack. Is that best handled by having a special 8.16 changes page, a clear 8.16 section on a "data reduction approaches" page, or having future Kibanas just always link to the 8.16 documentation so we don't have to worry about future changes?

@intxgo

intxgo commented Oct 4, 2024

@ferullo that last comment made me think. You're right: anyone upgrading from < 8.16 to >= 8.16 should ideally be prompted about the changes. When upgrading from 8.16+ to a later version, that information is irrelevant as it's nothing new. Just my 2 cents.
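The upgrade rule described here amounts to a simple version check. A minimal sketch, assuming plain "major.minor.patch" version strings and a hypothetical `should_prompt` helper; the 8.16 boundary is the one discussed in this comment and would move if the default change ships in a different release.

```python
CHANGE_VERSION = (8, 16)  # release in which the defaults were expected to change


def parse_version(text):
    """Turn '8.15.3' into (8, 15); the patch level is ignored."""
    return tuple(int(part) for part in text.split(".")[:2])


def should_prompt(old_version, new_version):
    """Prompt only when the upgrade crosses the change boundary."""
    return parse_version(old_version) < CHANGE_VERSION <= parse_version(new_version)


crossing = should_prompt("8.15.3", "8.16.0")
already_past = should_prompt("8.16.0", "8.17.1")
```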

@joe-desimone
Contributor

I started a really rough draft of a page where we could hopefully capture all of the nuance of our event capture system. It could replace this existing page. Should we collab there?

@nicholasberlin
Contributor

nicholasberlin commented Oct 7, 2024

@joepeeples network event aggregation is being pushed to 8.17; I added strike-throughs to this comment: #5771 (comment). Process event aggregation is still 8.16.

@ferullo
Collaborator Author

ferullo commented Oct 16, 2024

@joepeeples we've decided we need to delay changing Endpoint's default behavior in 8.16 until a later release. The advanced options will still exist and be tunable by users, but the defaults will be hard-coded to match 8.15 and prior behavior for now.

@joepeeples
Contributor

@joepeeples since this documentation will be linked to from Kibana is there a plan in place for making sure that link will take users to something that clearly explains the change. What I mean is, as we evolve documentation we'll want to keep it clear what changed in 8.16 even if users are coming from a newer stack. Is that best handled by having a special 8.16 changes page, a clear 8.16 section on a "data reduction approaches" page, or having future Kibanas just always link to the 8.16 documentation so we don't have to worry about future changes?

@ferullo Sorry I missed this earlier (too many notifications!). I think we'll be OK if we keep linking to the current or even master version of the docs page from any future version of Kibana, because we can make it clear in the docs when specific features were added to the Stack. So even if the user is coming from Kibana 9.0 and looking at docs 9.0, they'll still see that feature.name.etc was added in 8.16 or that its default was changed in 8.17, or whatever we want to tell readers.

This might be clearer once I have a full draft to share (WIP PR). I'm planning to use version badges for each feature like this:

We can add whatever text we want into the tooltips for more explanation, and also add a callout on the page about upgrade situations like < 8.16 to >= 8.16. Agree with @intxgo that it'd also be helpful for the UI to prompt users making such upgrades about default behavior changes, just in case they don't happen to read the docs.

@joepeeples
Contributor

We'll pause on docs for the advanced settings, and document everything once the default behavior is changed in a later release.

@natasha-moore-elastic natasha-moore-elastic added the blocked An issue that's currently blocked because it’s pending info or action from stakeholders. label Nov 14, 2024
@natasha-moore-elastic
Contributor

Per Slack convo with the EDR team:

the fleet part is not ready so we need to bump from 8.17
