Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

package log #16

Closed
wants to merge 1 commit into from
Closed

Conversation

ChrisHines
Copy link
Member

First sketch of basic logging features.

@peterbourgon
Copy link
Member

@ChrisHines some high-level questions first, so I can better understand the context...

  • Can you explain the motivation for defining the API with interface{} parameters for label dimensions? Could these keyvals be reified to a more concrete type, e.g. metrics.Field?
  • Why the split between Handler and Encoder? Could these be combined to a single Writer, or even replaced altogether by io.Writer?
  • Is Logger.New semantically equivalent to e.g. Counter.With? That is, it creates and returns a new Logger with some specific context?

@ChrisHines
Copy link
Member Author

Good questions @peterbourgon. The overall design follows most closely to log15, so reading its docs will likely help.

Can you explain the motivation for defining the API with interface{} parameters for label dimensions? Could these keyvals be reified to a more concrete type, e.g. metrics.Field?

I don't know what you mean by a "label dimension", but I will explain my thinking behind interface{} based key-value pairs.

Logrus, lager, and log15 all use string keys and interface{} values. But, at least in log15, string based keys resulted in a kind of hacky approach to overriding default keys, see inconshreveable/log15#4. So with gokit/log I took some inspiration from package context which uses interface{} keys for reasons explained in the blog post. The main benefit is that we can define unambiguously unique keys for things like levels and timestamps or other things we haven't thought of yet.

Having keys that cannot accidentally conflict with keys from other packages allows handlers to confidently operate on the known key and still let apps make the final call on how those keys are emitted to the backend, thus the ReplaceKeys handler.

Why the split between Handler and Encoder? Could these be combined to a single Writer, or even replaced altogether by io.Writer?

Conceptually a Handler filters and routes structured log records in key-value form. Handlers compose with other Handlers to form arbitrarily sophisticated logging pipelines. An Encoder converts a structured log record to unstructured form ([]byte). Encoders will only appear at the end of the logging pipeline.

I have gone back and forth on this question myself. I agree that it feels like all we should need are Handlers and package io interfaces, but my attempts in that direction have resulted in less orthogonal components. So I consider it an open question. I have some thoughts on the topic written down in inconshreveable/log15#38. Also, this comment in a log15 PR discussion talks about a use case where a single encoder (format in log15 lingo) is used by multiple handlers.

Is Logger.New semantically equivalent to e.g. Counter.With? That is, it creates and returns a new Logger with some specific context?

I'm not sure. I haven't found good documentation for the semantics of Counter.With yet, and I'm not familiar enough with the metrics domain to make a good guess. Logger.New in this PR has the same semantics as Logger.New in the package docs for log15.

if err != nil {
return err
}
_, err = w.Write(b)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be protected by a mutex, or some way to serialize the writes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. I left it out in this early API sketch. We could lock a mutex around the call to w.Write.

Another approach is to give options to the app code by providing a SyncHandler and an AsyncHandler. Both would wrap an arbitrary Handler. SyncHandler protects the wrapped handler via a Mutex. AsyncHandler serializes the pipeline via a (maybe buffered) channel and introduces a goroutine to run the downstream handlers asynchronously.

@peterbourgon
Copy link
Member

I don't know what you mean by a "label dimension",

Given log.Log("message", "k1", "v1", "k2", "v2"), I mean k1=v1 and k2=v2 as label-pairs, or label dimensions, on "message".


but I will explain my thinking behind interface{} based key-value pairs.

Logrus, lager, and log15 all use string keys and interface{} values. But, at least in log15, string based keys resulted in a kind of hacky approach to overriding default keys, see inconshreveable/log15#4. So with gokit/log I took some inspiration from package context which uses interface{} keys for reasons explained in the blog post. The main benefit is that we can define unambiguously unique keys for things like levels and timestamps or other things we haven't thought of yet.

Having keys that cannot accidentally conflict with keys from other packages allows handlers to confidently operate on the known key and still let apps make the final call on how those keys are emitted to the backend, thus the ReplaceKeys handler.

Thanks for the context. I have a few thoughts here that I'm struggling to put into a coherent narrative, so please forgive me as I just kind of dump them out :)

  • I'm concerned about the performance cost of N type assertions for every Log invocation. I own and operate services that can become CPU bound based on log output even without this additional penalty.
  • Relatedly, I'm not sure why KeyVal is a sort of blessed interface type, which receives special treatment in the core logger implementation.
    • If we have label-pairs (keyvals) which are mandatory, my intuition is they should be explicit, typed parameters to either the constructor or Log method. But I don't see a use case which requires that.
    • Rather, I would expect package log to treat all label-pairs (keyvals) as user-domain and operate on them blindly, without introspection. But it could be that I'm just not grokking the use case here; please feel free to educate me.
  • If KeyVal and the type assertions go away, then so too disappears the concept of "default keys", indeed explicit keys consts like LvlKey, TimeKey altogether. And the problem is defined out of existence :)

Perhaps I'm thinking too naïvely, but I expected an API like

type Logger interface {
    With(Field) Logger
    Log(format string, args ...interface{})
}

type Field struct {
    Key, Value string
}

Conceptually a Handler filters and routes structured log records in key-value form. Handlers compose with other Handlers to form arbitrarily sophisticated logging pipelines. An Encoder converts a structured log record to unstructured form ([]byte). Encoders will only appear at the end of the logging pipeline.

I see. Do we actually want "arbitrarily sophisticated logging pipelines" within a single process -- especially encoded into the base type definitions? I know people who'd consider that to be an antipattern. Could those semantics rather be expressed as an opt-in abstraction layer? For example, taking my type Logger interface from above, if we wanted to implement a Splitting and Filtering logger, we could do...

func Split(next1, next2 Logger) Logger {
    return split{next1, next2}
}

func (s split) Log(format string, args ...interface[}) {
    s.next1.Log(format, args...)
    s.next2.Log(format, args...)
}

func Filter(matchRegex string, next Logger) Logger {
    return filter{regexp.MustCompile(matchRegex), next}
}

func (f filter) Log(format string, args ...interface{}) {
    if !s.re.Match(fmt.Sprintf(format, args...)) {
        return
    }
    f.next.Log(format, args...)
}

Sorry again for my rambling nature, and thanks in advance for your feedback. I'd also like to hear opinions from @taotetek and @tsenart on this...

@ChrisHines
Copy link
Member Author

I'm concerned about the performance cost of N type assertions for every Log invocation.

I agree that it is important for Log to have minimal overhead. I did some benchmarking, and the type assertions are not a problem, but memory allocations are. I was surprised how many allocations were happening. I now see the impact of this line from the Go 1.4 release notes: "...as of 1.4 interface values always hold a pointer ... programs that store integers (for example) in interfaces will see more allocations."

Relatedly, I'm not sure why KeyVal is a sort of blessed interface type ...

I'm not sure if that is a good idea yet either. That is the biggest experiment in this sketch.

I would expect package log to treat all label-pairs (keyvals) as user-domain and operate on them blindly, without introspection.

Can you explain what you consider the boundaries of the user-domain from gokit's perspective? In the context of the log API I've sketched out, I consider the Handlers and Encoders part of the user-domain because the main program has control over them, injects them into the Loggers via SetHandler(), and in fact may provide custom Handlers from outside package log.

More concretely, Gokit RFC004 says, "Log SHALL include severity as a k/v pair and allow setting it through the same mechanism as any other k/v pair." If an application wants to filter log records based on severity then introspecting the severity value in the k/v set seems the most obvious approach to me (see log15.LvlFilterHandler. Which leads to the problem of finding the severity without having some reserved keywords that callers to Log must avoid. That led me to create the explicit key consts. At that point a typical call to Log might look like this:

l.Log(log.LvlKey, log.Info, "msg", "Hello, world!")

Which seems rather verbose, so the introduction of the (admittedly magical) KeyVal interface allows for:

l.Log(log.Info, "msg", "Hello, world!")

Which isn't so bad, and actually not far from the typical l.Info("msg", "Hello, world!") of other logging packages. But it has the benefit of making severity levels optional when not doing diagnostic logging.

Perhaps I'm thinking too naïvely, but I expected an API like ... [see above for API].

What would a call sequence equivalent to mine above look like with that API?

Do we actually want "arbitrarily sophisticated logging pipelines" within a single process -- especially encoded into the base type definitions? I know people who'd consider that to be an antipattern.

I am having trouble parsing the phrase "encoded into the base type definitions" in this context, please explain.

The logging pipelines are completely under application control and may do anything from discard every log record, to asynchronously write to a network end point, with fail-over to local disk on error. One of my favorite Handlers from log15 for diagnostic logging is the EscalateErrHandler, which elevates the severity of a log record to Error if any of the k/v pairs has a non-nil error value (the link has more details and a code example). It is especially powerful when combined with the LvlFilterHandler mentioned above.

My goal is for a structured log API with composability on par with package io or net/http. I think it would be strange to consider those packages as antipatterns.

For example, taking my type Logger interface from above, if we wanted to implement a Splitting and Filtering logger, we could do...

It seems to me that approach combines the duties of the Logger and the Handler. It could work. It may pose a challenge handling a common requirement in diagnostic logging—although I didn't include it in my PR yet—to collect the source file and line number (or the stack trace) for each log record. Once you start wrapping Loggers around each other it becomes tricky to know how many layers up the stack to go before finding the last application stack frame. A thin Logger layer makes it easy to control the stack depth for the calls to Log so that we can pass a constant integer to runtime.Caller(). The logger then defers the layers of abstraction to the Handler chain in the same fashion that you have done with the Loggers.

@peterbourgon
Copy link
Member

I would expect package log to treat all label-pairs (keyvals) as user-domain and operate on them blindly, without introspection.

Can you explain what you consider the boundaries of the user-domain from gokit's perspective?

I meant only that it seemed like a violation of bounded context to have package log introspect on user-provided keyvals. But, as you rightly point out,

Log SHALL include severity as a k/v pair and allow setting it through the same mechanism as any other k/v pair.

creates an opportunity for confusion. The intention there, I believe, was to have contractual enumerations for severity that can be relied upon within gokit itself. But I anticipated that would take the form of

var (
    SeverityKey   = "severity"
    SeverityDebug = "debug"
    SeverityInfo  = "info"
    SeverityError = "error"
)

and be utilized like

func (t type) internalGokitFunction() {
    t.logger.With(log.SeverityKey, log.SeverityDebug).Log("something's happening") // or...
    t.debug.Log("something's happening") // t.debug being a pre-contextified log.Logger
}

What would a call sequence equivalent to mine above look like with that API?

(I believe I answer that with the above code excerpt. Please let me know if anything is still unclear.)

I consider the Handlers and Encoders part of the user-domain because the main program has control over them, injects them into the Loggers via SetHandler() . . .

My intuition is that handling should be composed from, rather than injected into, Loggers. You get at my meaning here, too:

It seems to me that approach combines the duties of the Logger and the Handler.

Indeed I don't (presently) understand why there should be a distinction between Loggers and Handler. They're both things that implement Log. Maybe you mean to parallel http.ServeMux and http.Handler?

It could work. It may pose a challenge handling a common requirement in diagnostic logging—although I didn't include it in my PR yet—to collect the source file and line number (or the stack trace) for each log record. Once you start wrapping Loggers around each other it becomes tricky to know how many layers up the stack to go before finding the last application stack frame. A thin Logger layer makes it easy to control the stack depth for the calls to Log so that we can pass a constant integer to runtime.Caller(). The logger then defers the layers of abstraction to the Handler chain in the same fashion that you have done with the Loggers.

That's fair. I'm biased toward functional composition whenever possible, and prefer to start there and be forced to concede to other methods when necessary. Of course — e.g. w/ threading context.Context through request handlers — sometimes it's necessary.

@peterbourgon peterbourgon changed the title First sketch of basic logging features. package log Mar 17, 2015
@peterbourgon
Copy link
Member

I've drafted a quick summary of my ideas about package log over in #21

@peterbourgon
Copy link
Member

#21 is merged so we'll close this one—thanks for the comments and feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants