[pkg/ottl] enablement for an unroll function/array expansion #36507

Open
schmikei opened this issue Nov 22, 2024 · 1 comment
Labels
enhancement New feature or request pkg/ottl processor/transform Transform processor

Comments

@schmikei
Contributor

Component(s)

pkg/ottl, processor/transform

Is your feature request related to a problem? Please describe.

I receive log data in which several log entries arrive packed into a single string, separated by a delimiter (in my case \n), and I'd like to break them apart on that separator.

The transformprocessor lets me split the log body on newlines, but the result is still a single log record whose body is one slice:

receivers:
  filelog:
    include: [ ./test.json ]
    start_at: beginning
processors:
  transform:
    log_statements:
      - context: log
        statements:
          - set(body, Split(body, "\\n"))

What I'd like to do, once I've split the body, is unroll the resulting array into new log entries.

Example Log Line
<20>Oct 24 15:16:15 schmeler2853 inventore[8729]: We need to reboot the 1080p IB firewall!\n<162>Oct 24 15:16:16 ruecker1023 optio[97]: Navigating the microchip won't do anything, we need to program the multi-byte XML card!

After Split
{
    "resourceLogs": [
        {
            "resource": {},
            "scopeLogs": [
                {
                    "scope": {},
                    "logRecords": [
                        {
                            "observedTimeUnixNano": "1732299779758008000",
                            "body": {
                                "arrayValue": {
                                    "values": [
                                        {
                                            "stringValue": "\u003c20\u003eOct 24 15:16:15 schmeler2853 inventore[8729]: We need to reboot the 1080p IB firewall!"
                                        },
                                        {
                                            "stringValue": "\u003c162\u003eOct 24 15:16:16 ruecker1023 optio[97]: Navigating the microchip won't do anything, we need to program the multi-byte XML card!"
                                        }
                                    ]
                                }
                            },
                            "attributes": [
                                {
                                    "key": "log.file.name",
                                    "value": {
                                        "stringValue": "test.json"
                                    }
                                }
                            ],
                            "traceId": "",
                            "spanId": ""
                        }
                    ]
                }
            ]
        }
    ]
}

What I'd like to do next is implement some kind of function that creates new log records from that array, e.g.:

- unroll(body)

Result
{
    "resourceLogs": [
        {
            "resource": {},
            "scopeLogs": [
                {
                    "scope": {},
                    "logRecords": [
                        {
                            "body": {
                                "arrayValue": {
                                    "values": [
                                        {
                                            "stringValue": "\u003c20\u003eOct 24 15:16:15 schmeler2853 inventore[8729]: We need to reboot the 1080p IB firewall!"
                                        }
                                    ]
                                }
                            },
                            "attributes": [
                                {
                                    "key": "log.file.name",
                                    "value": {
                                        "stringValue": "test.json"
                                    }
                                }
                            ],
                            "traceId": "",
                            "spanId": ""
                        },
                        {
                            "body": {
                                "arrayValue": {
                                    "values": [
                                        {
                                            "stringValue": "\u003c162\u003eOct 24 15:16:16 ruecker1023 optio[97]: Navigating the microchip won't do anything, we need to program the multi-byte XML card!"
                                        }
                                    ]
                                }
                            },
                            "attributes": [
                                {
                                    "key": "log.file.name",
                                    "value": {
                                        "stringValue": "test.json"
                                    }
                                }
                            ],
                            "traceId": "",
                            "spanId": ""
                        }
                    ]
                }
            ]
        }
    ]
}
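
The intended behaviour can be sketched in plain Go. This is only an illustration of the semantics shown in the Result above, not an OTTL or pdata implementation: `LogRecord` and `Unroll` are hypothetical names invented for this sketch, and the body is modelled as a plain string slice. Each output record keeps a single-element body and a copy of the original attributes, mirroring the Result JSON.

```go
package main

import "fmt"

// LogRecord is a hypothetical, simplified stand-in for a collector log record.
// It is NOT the pdata type; it exists only to illustrate the semantics.
type LogRecord struct {
	Body       []string          // body after Split: a slice of strings
	Attributes map[string]string // e.g. "log.file.name"
}

// Unroll returns one record per element of rec.Body. Every new record gets
// its own copy of the attributes so later edits do not leak between records.
func Unroll(rec LogRecord) []LogRecord {
	out := make([]LogRecord, 0, len(rec.Body))
	for _, v := range rec.Body {
		attrs := make(map[string]string, len(rec.Attributes))
		for k, val := range rec.Attributes {
			attrs[k] = val
		}
		out = append(out, LogRecord{Body: []string{v}, Attributes: attrs})
	}
	return out
}

func main() {
	rec := LogRecord{
		Body:       []string{"first entry", "second entry"},
		Attributes: map[string]string{"log.file.name": "test.json"},
	}
	for _, r := range Unroll(rec) {
		fmt.Println(r.Body, r.Attributes)
	}
}
```

Note that in this sketch a record with an empty body produces no output records; whether an actual unroll function should drop or keep such a record is an open design question.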

Describe the solution you'd like

What I'm looking for is some kind of editor function that takes a log record and expands the log slice into one record per value of an array specified within the LogsContext. I imagine this could be useful in any of the telemetry contexts.

  • unroll(attributes["foo"])

This is sort of the inverse of what the aggregate_on_attributes function is doing in the metrics context, but for log slices.

Describe alternatives you've considered

I've glanced briefly at the transformprocessor and think we could maybe solve it there directly; however, some reprocessing of log entries still makes me hesitant about whether that's the correct place (#36506). I'm not entirely sure where the best place to implement such a feature is: I've looked briefly at implementing it generically in OTTL and could not think of a good way to avoid re-iterating over the expanded logs with our current OTTL implementation. Ideally I'm looking for some guidance on whether this is something we can/should do with OTTL, or what alternative solution we could use to handle this potential processor problem!
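
To make the re-iteration concern concrete, here is a minimal plain-Go sketch of one way to expand records in place without visiting the newly created ones. It is an assumption-laden illustration, not how OTTL or the transformprocessor iterates: `Record` and `ExpandOnce` are hypothetical names, and the sketch says nothing about how later OTTL statements would treat the new records.

```go
package main

import "fmt"

// Record is a hypothetical simplified log record whose body is a list of strings.
type Record struct {
	Body []string
}

// ExpandOnce unrolls every record that was present when the call started.
// The loop bound is captured before any append, so records produced by the
// expansion are appended at the end and are never visited again.
func ExpandOnce(records []Record) []Record {
	n := len(records) // snapshot: only the original records are visited
	for i := 0; i < n; i++ {
		body := records[i].Body
		if len(body) <= 1 {
			continue // nothing to unroll
		}
		// Keep the first element in place, append the rest as new records.
		records[i].Body = []string{body[0]}
		for _, v := range body[1:] {
			records = append(records, Record{Body: []string{v}})
		}
	}
	return records
}

func main() {
	in := []Record{{Body: []string{"a", "b", "c"}}, {Body: []string{"d"}}}
	for _, r := range ExpandOnce(in) {
		fmt.Println(r.Body)
	}
}
```

The trade-off in this sketch is ordering: new records land at the end of the slice rather than next to the record they came from, which differs from the Result shown earlier.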

Additional context

Important

I'm not saying there's anything inherently wrong with the implementation of OTTL; I'm just creating this issue to seek guidance on the recommended way of solving this processing scenario. If we want to solve it generically using the OTTL framework, I was hoping to start identifying next steps toward an OTTL solution, if that's the correct place to add the desired functionality.

@schmikei schmikei added enhancement New feature or request needs triage New item requiring triage labels Nov 22, 2024
@github-actions github-actions bot added pkg/ottl processor/transform Transform processor labels Nov 22, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.
