-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Log timestamps in l2met #76
Comments
Yikes. Would love to know what is causing the panic for sure! I would love to accept a PR that improves l2met's own visibility. Maybe there is something in the log pkg that we can use. |
Just an update, runs for a few minutes on production data before crashing. Using the log package I added timestamps to the log and got a second to check router messages within, this resulted in between 4 - 20 candidates. Ran like 40 message strings through the receiver unit test and couldn't get it to die. So I commented out this line https://github.com/ryandotsmith/l2met/blob/master/bucket/parser.go#L72 and it has been running for about 6hrs. Will dig into the log package and see if there is a nice way of capturing the tuple with it's state and dumping this on a panic then do some tests with parallel file of log data. |
So did some research, one obvious issue so far is the string convert of the Float64 only caters 10 digits, when your handling values which are in bytes and milliseconds this can overflow. Don't ask me how but we have clients on very slow links uploading files for a very long time. Either way I have had a shot at rewriting that logic and removed the parse/format/compare/slice with a TrimFunc, once I am happy with the code I will make a pull request for you to review. Code I am talking about is here. https://github.com/ryandotsmith/l2met/blob/master/bucket/log_tuple.go#L56-L65 Note this also marginally faster. |
👏 |
Hello, I wanted to follow up on this. A few things have changed with respect to internally logged metrics. You can read about the metchan in #82, but I thought it would be nice if I pointed out that we can easily accommodate the issue you have raised here with metchan. Currently all logging output happens here. We might be able to use a better logging facility that would support logging timestamps. |
Ah great to hear, so far I haven't experienced this failure with my fork which includes the changes I pushed to you. I need to re-sync up with your tree and catch up on the changes looks like lots of things are happening. I will take a look at how this can be extended to cater for local logging. |
There seems to be an issue with something in the router messages which is triggering a panic.
Because I have no timestamps in my log it is difficult to correlate with the raw syslog data I am collecting.
I understand this is not necessary if deployed on Heroku, would nice to add a flag to enable it for development and where not hosted there.
Happy to submit a pull request if your conducive to the idea.
Also here is the error message logged.
The text was updated successfully, but these errors were encountered: