Commit
Update documentation and Blog
znwar committed Jan 23, 2023
1 parent ca567a0 commit f2d807f
Showing 3 changed files with 246 additions and 52 deletions.
67 changes: 67 additions & 0 deletions blog/2022-12-11-LFX-Blog-Rebecca.md
@@ -0,0 +1,67 @@
---
title: Support for Pluggable Logging in Tremor
author: Rebecca ABLI
author_title: Tremor 2022 Summer Mentee
tags: ["cncf", "mentorship", "pluggable-logging"]
draft: false
description: "Rebecca's mentorship experience"
---


# Introduction

Hello, I am Rebecca, a computer science student from the south of France. I contributed to the Tremor project during the summer of 2022, working on the [pluggable logging][pd] feature proposed through the [LFX Mentorship Program][lfx]. I was mentored by Darach Ennis and Ramona Łuczkiewicz, and I also received help from other Tremor team members such as Heinz Gies and Matthias Wahl.
This blog summarizes my journey as a mentee contributing to the Tremor project.

[pd]: https://mentorship.lfx.linuxfoundation.org/project/1218d516-45af-46a3-977b-e5a9de818cec
[lfx]: https://lfx.linuxfoundation.org/tools/mentorship/


# About Tremor

Tremor is a stream processing system that uses a set of connectors and pipelines to process the data it receives (from itself or from other systems).
Initially, Tremor used Log4rs to manage its logs, but this was costly in several ways. The pluggable-logging solution was intended to give Tremor the ability to process its own logs through its own pipelines, and to allow the management of different log sources.
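
To make this concrete, here is a minimal sketch of what "processing Tremor's own logs through a Tremor pipeline" looks like from the user's side. It only uses pieces that appear later in this post (the `logs` connector, plus `pipelines::passthrough` and `connectors::console` from the standard library), and it is my illustration rather than a tested deployment:

```tremor
define flow log_to_console
flow
  use tremor::{connectors, pipelines};

  # the pluggable-logging connector: emits Tremor's own log events
  define connector logging from logs;
  create connector logging;

  # forward the log events, unchanged, to standard output
  create pipeline passthrough from pipelines::passthrough;
  create connector stdio from connectors::console;

  connect /connector/logging to /pipeline/passthrough;
  connect /pipeline/passthrough to /connector/stdio;
end;

deploy flow log_to_console;
```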


# Problems encountered

First of all, there were the challenges of learning Rust, raised among other things by the specificities of the language (ownership, borrow checking, etc.). Adapting to them pushed me to set aside bad habits from other languages in favor of code written in a more "Rustacean" spirit. Beyond that, it was above all not being able to communicate fluently in English during the weekly meetings that slowed down my understanding of the work and made me feel lost. With my intermediate level, communication between me and my mentors Darach and Ramona was not easy at the very beginning, but they were very patient and understanding, and even started to schedule two meetings a week instead of one so that I could improve my English and progress more easily with more guidance.


# Pluggable-logging

## What do we do first?

Hmm, well: by understanding Tremor and learning the key notions of the different languages that would be used.

I was first introduced to the behavior and features of Tremor (connectors, pipelines, scripts) and, more specifically, to how the "metrics system" worked. I was given the task of writing small *.troy files to learn how to link connectors and pipelines from a user's point of view.
On the Rust side, after studying the "Log4rs" crate, I got familiar with the notions of testing and error handling.
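
One of those first *.troy exercises looked roughly like the following (a reconstruction for this post, not one of my actual files): it wires the console connector to a passthrough pipeline and back, so every line typed on stdin is echoed to stdout.

```tremor
define flow echo
flow
  use tremor::{connectors, pipelines};

  # console reads lines from stdin and writes events to stdout
  create connector stdio from connectors::console;
  create pipeline passthrough from pipelines::passthrough;

  # stdin -> pipeline -> stdout
  connect /connector/stdio to /pipeline/passthrough;
  connect /pipeline/passthrough to /connector/stdio;
end;

deploy flow echo;
```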

## Let's start coding!

After getting more familiar with Rust, Troy, [Log4rs][log] and the metrics system — which is very similar to the system I had to implement — I started by writing a channel for the log data to flow through Tremor. Then came the problem of how that data could be retrieved.
To solve it, I looked into creating a connector specific to logs. This was kind of cool (not too hard, not too easy) because I could rely on the existing code, and I also had the help of team members and other people on the Discord server.
The good thing about Tremor's codebase, in the case of my project, is that there was always a line of code I could reuse for my needs.

It's good to be able to collect data, but for what purpose?

Actually, in addition to emitting log messages in a structured form as "Tremor objects" (with a formatted message, origin, severity (info, warn, etc.), and so on), the logging functions also return those same objects, so the result can be used directly after the invocation of the logging function.

The formatting convention ("named" & "positional") was designed to be as intuitive and permissive as possible, covering several usage scenarios (arguments formatted with or without var-args, by position, or by key rather than by order, etc.).
It delegates the aggregation of the items passed as parameters to the underlying Rust code, which allows a simpler and more intuitive use of the logging functions in a Troy file.
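
As a rough sketch of how this reads in a Troy script — the function names and exact signatures below are illustrative assumptions on my part, not the authoritative API — a positional call fills `{}` placeholders from the var-args in order, while a named call pulls values out of a record by key:

```tremor
define pipeline app
pipeline
  use tremor::logging;

  # positional: placeholders filled from the var-args, in order
  # (the name `logging::info` is assumed here for illustration)
  select logging::info("backend {} answered in {} ms", event.backend, event.millis)
  from in into out;

  # named: placeholders refer to keys of the record argument
  select logging::warn("{host} is close to {limit}% disk usage",
                       {"host": event.host, "limit": event.limit})
  from in into out;
end;
```

Because the call itself evaluates to the structured "Tremor object", the same expression can feed straight into downstream matching or filtering.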

[log]: https://crates.io/crates/log4rs

# Testing

Several integration tests with similar scenarios have been written, and they can be read as concrete examples.
They show the ability to output logs, to filter logs retrieved from the `logging` connector, and to use the structured return value of the functions.
Unit tests are not necessarily useful to end users, but they were very helpful at times when I was lost among barely stable features (those under development, those awaiting review, and those still missing), to prove to me that the functions I wrote worked.
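
For instance, the level-based filtering that those integration tests exercise looks roughly like this (a trimmed-down variant of the pipelines shown in the logging guide):

```tremor
define pipeline errors_only
pipeline
  # keep only the high-severity log events, drop the rest
  select event
  from in where event.level == "ERROR" or event.level == "WARN"
  into out;
end;
```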


# Conclusion

My objectives at the beginning of the project were to learn Rust and to improve my English. Today I can proudly say that my goals have been reached: my English is much better (even if I still cannot keep up with Darach), and I know enough Rust to code in it (even if I still have some Rust-specific features left to learn). I also have a clearer vision of what I should expect from my work and how I should approach it.
And I owe all this to my mentors.

225 changes: 177 additions & 48 deletions docs/guides/logging.md
@@ -173,8 +173,7 @@

The logging connector can also be used with the [`otel connector`](../reference/connectors/otel.md)

define flow logging_flow
flow
  use integration;
  use tremor::{connectors, pipelines};

  define connector write_file from file
  args
@@ -217,66 +216,64 @@

  # Connections
  connect /connector/logging to /pipeline/logging_pipeline;
  connect /pipeline/logging_pipeline to /connector/writer;
end;

define flow server
flow
  use integration;
  use tremor::{connectors, pipelines};

  define connector otel_server from otel_server
  with
    config = {
      "url": "127.0.0.1:4317",
    }
  end;

  # Instances of connectors to run for this flow
  create connector data_out from integration::write_file;
  create connector otels from otel_server;
  create connector exit from integration::exit;
  create connector stdio from connectors::console;

  # create pipeline passthrough;
  create pipeline passthrough from pipelines::passthrough;

  # Connections
  connect /connector/otels to /pipeline/passthrough;
  connect /pipeline/passthrough to /connector/stdio;
  connect /pipeline/passthrough to /connector/data_out;
  connect /pipeline/passthrough to /connector/exit;
end;

define flow client
flow
  use integration;
  use tremor::{connectors, pipelines};

  define connector otel_client from otel_client
  with
    config = {
      "url": "127.0.0.1:4317",
    },
    reconnect = {
      "retry": {
        "interval_ms": 100,
        "growth_rate": 2,
        "max_retries": 3,
      }
    }
  end;

  # Instances of connectors to run for this flow
  create connector data_in from integration::read_file;
  create connector otelc from otel_client;
  create pipeline replay from pipelines::passthrough;

  # Replay recorded events over otel client to server
  connect /connector/data_in to /pipeline/replay;
  connect /pipeline/replay to /connector/otelc;
end;
@@ -290,3 +287,135 @@

deploy flow client;

The logging connector can also be used with the [`elastic connector`](../reference/connectors/elastic.md)

```tremor
define flow main
flow
  use std::time::nanos;
  use integration;
  use tremor::{connectors, pipelines};

  define pipeline main
  pipeline
    define script process_batch_item
    script
      # setting required metadata for elastic
      let $elastic = {
        "_index": "1",
        "action": event.action
      };
      let $correlation = event.snot;
      match event of
        case %{present doc_id} => let $elastic["_id"] = event.doc_id
        case _ => null
      end;
      event
    end;
    create script process_batch_item;

    define operator batch from generic::batch with
      count = 6
    end;
    create operator batch;

    select event from in into process_batch_item;
    select event from process_batch_item into batch;
    select event from batch into out;
    select event from process_batch_item/err into err;
  end;

  define pipeline logging_pipeline
  into
    out, err, exit
  pipeline
    use tremor::logging;

    select match event of
      case %{level == "TRACE"} => {"snot": "badger", "action": "update", "doc_id": "badger"}
      case %{level == "DEBUG"} => {"snot": "badger", "action": "null", "doc_id": "badger"}
      case %{level == "INFO"} => {"snot": "badger", "action": "update", "doc_id": "badger"}
      case %{level == "WARN"} => {"snot": "badger", "action": "null", "doc_id": "badger"}
      case %{level == "ERROR"} => {"snot": "badger", "action": "delete", "doc_id": "badger"}
      case _ => "exit"
    end
    from in into out;

    select event
    from in into exit;
  end;

  define pipeline response_handling
  pipeline
    select {
      "action": $elastic.action,
      "success": $elastic.success,
      "payload": event.payload,
      "index": $elastic["_index"],
      "correlation": $correlation
    }
    from in where $elastic.success into out;

    select {
      "action": $elastic.action,
      "payload": event.payload,
      "success": $elastic.success,
      "index": $elastic["_index"],
      "correlation": $correlation
    }
    from in where not $elastic.success into err;
  end;

  define connector elastic from elastic
  with
    config = {
      "nodes": ["http://127.0.0.1:9200/"],
      "include_payload_in_response": true
    }
  end;

  define connector input from cb
  with
    config = {
      "paths": ["in1.json", "in2.json"],
      "timeout": nanos::from_seconds(5),
      "expect_batched": true,
    }
  end;

  # Instances of connectors to run for this flow
  create connector errfile from integration::write_file
  with
    file = "err.log"
  end;
  create connector okfile from integration::write_file
  with
    file = "ok.log"
  end;
  create connector exit from integration::exit;
  create connector stdio from connectors::console;
  create connector elastic;

  define connector logging from logs;
  create connector logging;

  # Instance of pipeline to run for this flow
  create pipeline main;
  create pipeline response_handling;
  create pipeline logging_pipeline;

  # Connections
  connect /connector/logging to /pipeline/logging_pipeline;
  connect /pipeline/logging_pipeline to /pipeline/main/in;
  connect /pipeline/main/out to /connector/elastic/in;
  connect /connector/elastic/out to /pipeline/response_handling/in;
  connect /pipeline/response_handling/out to /connector/okfile/in;
  connect /pipeline/response_handling/out to /connector/stdio/in;
  connect /pipeline/response_handling/exit to /connector/exit/in;
  connect /connector/elastic/err to /pipeline/response_handling/in;
  connect /pipeline/response_handling/err to /connector/errfile/in;
  connect /pipeline/response_handling/err to /connector/stdio/in;
end;

deploy flow main;
```
6 changes: 2 additions & 4 deletions docs/reference/connectors/logging.md
@@ -12,8 +12,7 @@ The `logging` connector collects and forwards various kinds of logs, which can

We use the standard definition of the logging connector from the standard library, together with the passthrough

```tremor
use tremor::{connectors, pipelines};
create connector logging from connectors::logs;
...
@@ -50,8 +49,7 @@

Capture system logs and redirect them to standard output
```tremor
define flow logging_flow
flow
  use tremor::{connectors, pipelines};

  # define and create logging connectors
  define connector logging from logs;
