From 5a873ab6acc664b5e741977465219ce9d4e822d8 Mon Sep 17 00:00:00 2001 From: Srikanth Govindarajan Date: Wed, 27 Mar 2024 10:16:46 -0700 Subject: [PATCH] Add split event processor for dataprepper (#6682) * Add split event processor for dataprepper Signed-off-by: srigovs * Adding usage example * Update split-event.md Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --------- Signed-off-by: srigovs Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Co-authored-by: Nathan Bower --- .../configuration/processors/split-event.md | 52 +++++++++++++++++++ 1 file changed, 52 insertions(+) create mode 100644 _data-prepper/pipelines/configuration/processors/split-event.md diff --git a/_data-prepper/pipelines/configuration/processors/split-event.md b/_data-prepper/pipelines/configuration/processors/split-event.md new file mode 100644 index 0000000000..2dbdaf0cc0 --- /dev/null +++ b/_data-prepper/pipelines/configuration/processors/split-event.md @@ -0,0 +1,52 @@ +--- +layout: default +title: split-event +parent: Processors +grand_parent: Pipelines +nav_order: 56 +--- + +# split-event + +The `split-event` processor is used to split events based on a delimiter and generates multiple events from a user-specified field. + +## Configuration + +The following table describes the configuration options for the `split-event` processor. + +| Option | Type | Description | +|------------------|---------|-----------------------------------------------------------------------------------------------| +| `field` | String | The event field to be split. | +| `delimiter_regex`| String | The regular expression used as the delimiter for splitting the field. | +| `delimiter` | String | The delimiter used for splitting the field. If not specified, the default delimiter is used. | + +# Usage + +To use the `split-event` processor, add the following to your `pipelines.yaml` file: + +``` +split-event-pipeline: + source: + http: + processor: + - split_event: + field: query + delimiter: ' ' + sink: + - stdout: +``` +{% include copy.html %} + +When an event contains the following example input: + +``` +{"query" : "open source", "some_other_field" : "abc" } +``` + +The input will be split into multiple events based on the `query` field, with the delimiter set as white space, as shown in the following example: + +``` +{"query" : "open", "some_other_field" : "abc" } +{"query" : "source", "some_other_field" : "abc" } +``` +