Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add loggly hook. #44

Open
wants to merge 2 commits into
base: v2
Choose a base branch
from
Open

Conversation

GeertJohan
Copy link

Fixes #42

@inconshreveable
Copy link
Owner

Sorry for the insanely slow turnaround here.

LGTM. Do we have thoughts on the following:

  1. Do we want to add a third-party dependency (or vendor it)?
  2. Could it go in log15/loggly instead of log15/ext/loggly?
  3. Should it be taking a token in the constructor instead of a *loggly.Client? The way it's currently wired up, it's not possible for a user to specify tags to the loggly.Client constructor.
  4. Should it flush after every message or if it should be configured to flush on an configured interval?

@GeertJohan
Copy link
Author

  1. Good point. How do you propose we vendor it?
  2. I think it could, but do you really want that? Eventually there might be a lot of external logging hooks... (I would be glad to see as many as possible)
  3. I chose this method because a) this reflects the loggly.New(token) constructor and b) the Client is an exported field on LogglyHook anyway, so tags can still be changed after using NewLogglyHook(token), as you would do after using loggly.New(token). But this simple/short constructor gives you the convenience to build a handler tree really fast.
  4. I've been thinking about this too. I notice latency in our applications because of the synchronous HTTP request to loggly. If you're okay with it I would like to send an extra PR later that adds async and bulk logging (both configurable, with async enabled by default)

@GeertJohan
Copy link
Author

Any update on this? Would like to get it merged :)

"message": r.Msg,
"level": r.Lvl.String(),
"timestamp": r.Time,
"data": data,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since loggly.Message is a map[string]interface{} I am curious why the "data" value contains a nested map instead of flattening the structure?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loggly provides excellent searching on nested data. I wanted to add level, message and timestamp because that is expected behavior, but I didn't want it to conflict with any data the user might have set.

@ChrisHines
Copy link
Collaborator

  1. If we vendor go-loggly, are we prepared to maintain it?
  2. I vote for keeping this in log15/ext/loggly.
  3. I prefer the constructor that takes a Client rather than a token.
  4. See my inline comments above.

@GeertJohan
Copy link
Author

  1. See below.
  2. I agree, I think it's cleaner to gather logging hooks to external systems in a subdirectory so it's not polluting the project root.
  3. Still think a simple constructor taking a token is better, it also avoids having to import segmentio/go-loggly in the main project. Also: see below.

One of the key features of log15 is the way Handlers connect and create a handling tree. I think properties such as async, buffering (with max buffersize and flush interval) should be handled by a Handler, and the hook to a logging service shouldn't have to implement buffering itself and should just send logs to the logging service.

Now ofcourse the whole purpose of the buffering in go-loggly is so that can send multiple log entries in a single call to loggly. And this is actually very important, especially when working with huge amounts of logs. But buffering and bulk writes are also important in other handlers. For instance, a colleague of mine found that some of our application spend a huge amount of time in the kernel due to writes to files and os.Stdout. So a generic approach to buffering and handling of bulk log records seems to be an interesting topic. We brainstormed a bit about it this midday and came up with the following idea:

Add an extra interface type:

type BulkHandler interface{
    LogBulk([]*Record) error
}

Some Handlers such as the LogglyHandler and the FileHandler could implement this method, taking a slice of logs and sending/writing them all at once, or at least as much as possible/efficient per call/write. Loggly's bulk endpoint states "You can send a maximum of 5MB per batch and up to 1MB per event", so the LogglyHandler will need to take care of that.

Then a new Handler would be added:

type bufferedBulkHandler struct {
    buffer []*Record
    handler Handler
    bufferSize int // buffer is this size large, and will be flushed when it's full
    flushInterval time.Duration
    forceFlushLevel Level // buffer is flushed directly (while blocking the logging call) after a message with this level (or higher) is added.
}

func BufferedBulkHander(size int, interval time.Duration, forceFlushLevel Level, handler Handler) Handler

The ForceFlushLevel is actually quite important, there's a good chance a program is crashing when a log with LvlCrit is added. And you wouldn't want that message to be stuck in the buffer when the program crashes. Normal flushing takes place asynchronously, not directly blocking a Log call.

Maybe in this case (because of the number of options) the BufferedBulkHandler should be an exported type with exported fields, and not a function.

What I propose is:

  • Pull in the go-loggly Client and merge it with the LogglyHandler but drop all buffering and async features.
  • Define the BulkHandler interface
  • Add the BufferedBulkHandler
  • Implement BulkHandler for the FileHandler and LogglyHandler, and possibly more.

What I'm not sure about is if BufferedBulkHandler should accept a Handler as argument, and check with type assertion if the Handler also implements BulkHandler. Or maybe it should just accept BulkHandler's.

@ChrisHines
Copy link
Collaborator

Thanks for the detailed thoughts, and I like where this is headed! I like how you are breaking this down into nicely composable pieces. I think you have some great ideas here that could add some nice capabilities to log15 as a whole. I would be happy to work with you to polish your ideas and get them into log15. How about we break this into multiple issues. This PR can be for implementing the simple loggly handler. Then let's create a new issue for the bulk logging idea.

@GeertJohan
Copy link
Author

Great! I'll update this PR with the loggly http API asap, probably this evening (CET).

@GeertJohan
Copy link
Author

Some delay: I want to know if the log level (debug, info, crit) should be in a specific format so it is recognized by loggly. Because segmentio/go-loggy uses a different list of level strings than log15. I'm contacting loggly about this.

@GeertJohan
Copy link
Author

Nevermind, it looks like levels are not defined by loggly.

@GeertJohan GeertJohan force-pushed the ext-loggly branch 3 times, most recently from ec268ff to 51b2290 Compare March 4, 2015 14:10
@GeertJohan
Copy link
Author

@ChrisHines I just updated the PR.
I rewrote the complete code, there's no external dependency anymore.
There's no buffering anymore, the logs are being sent to Loggly immediately.
Please let me know what you think.

@GeertJohan
Copy link
Author

Ah well, there is one external dependency: github.com/pquerna/ffjson/fflib/v1. But I trust the maintainers of that project to not make any breaking changes. They even versioned this dependency and breaking changes are made in v2.

@inconshreveable
Copy link
Owner

The bulk logging stuff is interesting, I'll comment on that in the separate issue you opened for it.
I'm on board with log15/ext/loggly and the dependency on ffjson. Also, all of the code looks good. Thanks for this!

I'm concerned about the exported symbols TokenPlaceholder, EndpointSingle and EndpointBulk (especially because endpointbulk is unused at this point).

I'd propose instead that Endpoint be a public exported field of the LogglyHandler object. If the client wants to send logs to custom URIs, they can do it on a per-handler basis by setting Endpoint. This would make NewLogglyHandler a convenience constructor when you don't need any special behaviors.

@GeertJohan
Copy link
Author

@inconshreveable You're right, I'm updating the code right now.

This handler sends log15 records to loggly.
No third party dependencies are required.
@GeertJohan
Copy link
Author

There, removed the const and var's and have Endpoint as exported field.

@inconshreveable
Copy link
Owner

LGTM. Last thing we need is just a test. We could set up an HTTP server in the test and then set up a loggly handler with a custom URL pointing at the local server and validate that we get properly formatted loggly data.

@ChrisHines
Copy link
Collaborator

From Loggly's website:

Any sort of log data can be sent to Loggly, it just needs to be text based. Unstructured or structured logs from any application or device can be forwarded.

Which raises the question: Should the LogglyHandler accept a log15.Format rather than only using JSON?

@GeertJohan
Copy link
Author

Loggly strongly recommends sending JSON for this type of thing. They do indeed accept all kind of logs so that you can pipe apache, nginx, syslog, etc. etc. directly to their API. They then analyse the log and parse it as best as they can to make it search-able, so you can put filters on fields and their values. But since we already have structured data in the form of log15.Ctx, it's best to send it as JSON so we're sure the structure is copied into the loggly the same way. It wouldn't make sense to use any other formatter and then just hope loggly manages to parse the text to it's original structure.

@GeertJohan
Copy link
Author

I've pushed a work in progress test, but it's not working.. Looks like a problem with the channel (might be a Go issue on tip). To be continued later.

@ChrisHines
Copy link
Collaborator

I understand, but as library authors I think we should give the application developer that choice. Plus, we already have log15.JsonFormat. So I believe we should support any log15.Format and document that JsonFormat is the preferred format for use with Loggly. Supporting a Format makes LogglyHandler parallel to other sink handlers (FileHandler, StreamHandler, NetHandler), and also gives LogglyHandler users a hook to taylor the format if they have special requirements.

@GeertJohan
Copy link
Author

It really makes no sense to send anything other than JSON to Loggly, unless you don't have JSON. For the File, Stream and Net handlers this does make sense, because in some cases you want human-readable data, so thats why you use a Formatter. But in this case Loggly already parses the data and presents it through their webinterface. JSON is a very reliable format for data transfer, especially with the standardized encoding of bytes, times, decimals and strings (escaping), and there's no reason to use anything else.

I do like the idea of allowing the user to modify the data sent, but I think the use of that feature will be quite low. And using Formatter has a huge disadvantage: it causes allocations.
Especially with the upcoming bulk handling, I think it's very important to make sure we don't allocate a lot of bytes, and with the Formatter you'd have to allocate the bytes that are returned by Format, and then copy those to the buffer.
To allow users to modify the request before marshaling it onto the buffer we could add a field PreMarhshalHook func(r *log15.Record). Then we wouldn't have an allocation and the json.Encoder can write directly to the pre-allocated buffer.

@GeertJohan
Copy link
Author

If someone does really require to set a custom Formatter for some reason, then they can either fork the LogglyHandler and add it with a few lines of code. And when there is a greater demand (which I would be willing to bet more than a few beers on that it won't happen), a Formatter field can always be added to the LogglyHandler later 👍

What do you think about PreMarshalHook func(r *log15.Record) ?

@inconshreveable
Copy link
Owner

Agreed, I see no reason to send anything over than JSON. The format is irrelevant in this case because we're sending to a machine. The goal is no loss of fidelity, and JSON does that well, and is recommended by loggly. Even if we're wrong on that, it covers the 90% case and we'll learn from someone if we're wrong.

I'm ready to merge when we have a passing test.

@ChrisHines
Copy link
Collaborator

I concede the point on making JSON the default format, but I'm still not convinced that the formatting should be hard coded into the handler. I will make my case below, and then I am fine with the majority opinion.

Suppose someone constructs the following handler tree.

h := FailoverHandler(
    LogglyHandler(),
    FileHandler("path", JsonFormat()),
)

The intention is to save logs to disk if Loggly is unreachable and batch upload the files later when Loggly is again reachable. The problem is that the logs on disk are in a different flavor of JSON than logs sent on the happy path.

So I would like to see one of two outcomes before we merge:

  1. LogglyHandler uses JsonFormat internally.
  2. A new LogglyFormat is factored out to allow for the above scenario to work as intended.

@inconshreveable
Copy link
Owner

Agreed. I'd prefer to use JsonFormat internally, if possible. Are we using the go generated JSON encoding just for speed, or is there another reason? Otherwise I'm fine with exposing a loggly.JSONFormat

@GeertJohan
Copy link
Author

@ChrisHines Good point, I can definitely see that as a use-case, especially in larger cluster systems that should be able to handle connectivity downtime without losing information.

@inconshreveable I would prefer a LogglyFormat because I think the way I structured the JSON makes more sense for use with Loggly. An extra bonus is that the internal loggly formatter can work with an io.Writer under the hood so that it can directly write to a buffer when that is implemented in the LogglyHandler. The LogglyFormat will wrap the internal formatter with a bytes.Buffer and return the byte slice.

Maybe having a FormattingWriter is something to think about.

type FormattingWriter interface {
    WriteFormatted(w io.Writer, r *Record)
}

TerminalFormat and LogfmtFormat already use a bytes.Buffer{} internally so handing it a different io.Writer should work. JsonFormat now uses json.Marshal which in turn wraps a json.Encoder + bytes.Buffer{}. So in the end, every formatter could actually work with an io.Writer without much change, and that makes a lot of sense with the buffering in #49. Anyway, this should probably be a new issue.

@ChrisHines ChrisHines mentioned this pull request Mar 8, 2015
@phylake
Copy link

phylake commented Jun 7, 2015

I'm eagerly awaiting this merge. Any movement on this?

@inconshreveable
Copy link
Owner

This is what we were waiting on:

"I've pushed a work in progress test, but it's not working.. Looks like a problem with the channel (might be a Go issue on tip). To be continued later."

If you'd like to write a test that accurately tests the functionality and works I'm happy to merge this in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants