Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exception handling for feed fetching #106

Closed
jeroenh opened this issue Dec 23, 2016 · 4 comments
Closed

Exception handling for feed fetching #106

jeroenh opened this issue Dec 23, 2016 · 4 comments

Comments

@jeroenh
Copy link

jeroenh commented Dec 23, 2016

The current feed fetching implementation does not catch exceptions. Over the past few days I've had a couple of socket.timeout and OpenSSL.SSL.SysCallError: (54, 'ECONNRESET')exceptions happening. Crontab happily emails me about this fact, but there is absolutely no context other than a stack trace. This means that it is impossible to pin down which feed is causing this. It would be great if there were a way to add this.

Looking at the code, the best place to catch the exception is probably in the update_feed() call at coldsweat/controllers.py:306. But that doesn't seem part of the stack trace. Line coldsweat/controllers.py:176 is an element, but at that point it is does not seem easy to track which feed was causing the exception.

Traceback (most recent call last):
 File "/usr/local/www/coldsweat/sweat.py", line 12, in <module>
   run()
 File "/usr/local/www/coldsweat/coldsweat/commands.py", line 252, in run
   cc.run_command(command_name, command_options, command_args)
 File "/usr/local/www/coldsweat/coldsweat/commands.py", line 57, in run_command
   handler(options, args)        
 File "/usr/local/www/coldsweat/coldsweat/commands.py", line 120, in command_refresh
   self.fetch_all_feeds()
 File "/usr/local/www/coldsweat/coldsweat/controllers.py", line 301, in fetch_all_feeds
   self.fetch_feeds(feeds)
 File "/usr/local/www/coldsweat/coldsweat/controllers.py", line 276, in fetch_feeds
   p.map(feed_worker, feeds)
 File "/usr/local/lib/python2.7/multiprocessing/pool.py", line 251, in map
   return self.map_async(func, iterable, chunksize).get()
 File "/usr/local/lib/python2.7/multiprocessing/pool.py", line 567, in get
   raise self._value
OpenSSL.SSL.SysCallError: (54, 'ECONNRESET')
@passiomatic
Copy link
Owner

passiomatic commented Dec 24, 2016

Thank you for our report. HTTPS feed fetching is a recent addition for me since I didn't have an updated Python version on the production machine. I managed to update Python to version 2.7.12 and now I have a couple of feed served via HTTPS with no issues.

Back to your issue: fetcher code uses Requests library to handle connections which is very robust — it catches a lot of connection issues — and I wrap the fetch request around a RequestException and log it as network errors. You typically see these lines on the log:

[2016-08-19 00:00:14,915] 23346 WARNING a network error occurred while fetching bost.ocks.org, skipped

Looking at your stack trace: the feed worker calls fetcher.update_feed() so we are on the right track (I have now idea why it doesn't show up on the stack trace, though).

So the error originates from the SSL layer and should be wrapped by Requests. Of course googling for that exception gives us a myriad of bug reports, e.g.: DocNow/twarc#72 I'll go thru the linked discussion to figure out what's going on.

The quick fix you could try is to catch everything on the https://github.com/passiomatic/coldsweat/blob/master/coldsweat/fetcher.py#L148 line. Doing this you should get an error log about the feed causing the connection error. When you have figured the offending feed please share it here, so I can do some testing it on my machine.

@jeroenh
Copy link
Author

jeroenh commented Jan 10, 2017

Okay, I added the exception handling in the right place, but since then the error has not occurred anymore.

I have a feeling the error may be caused by osx.iusethis.com for which I had a feed in ancient history and it just migrated along with my other feeds to coldsweat.

The feed entry is stuck in limbo now: it migrated, but has never worked since then. Now it doesn't show up in the feed overview, so I can't remove it.

@passiomatic
Copy link
Owner

passiomatic commented Jan 11, 2017

In fact osx.iusethis.com seems dead (timeout) and that could cause the connection error you have seen.

In the meantime I've read the discussion previously linked to that OpenSSL exception and that boils down to the fact that Requests (or better urllib3) may use that - if installed - to handle SSL connections. Requests doesn't catch OpenSSL.SSL.SysCallError since it expects urllib3 to do that. I will be happy to update the version of Requests dependency once this thing will be fixed.

In Coldsweat makes little sense to catch that, because OpenSSL isn't even listed as project dependency.

The feed entry is stuck in limbo now: it migrated, but has never worked since then. Now it doesn't show up in the feed overview, so I can't remove it.

This is weird because even if the feed is disabled or has zero entries it should be assigned to your user and hence be displayed in the feed list panel. If it has no users assigned it is effectively in limbo waiting to some user to add it to her feeds. This seemed a smart thing to down at the time but it turned out to be counterproductive. There a plan to fix this, see #63.

@passiomatic
Copy link
Owner

passiomatic commented Mar 25, 2017

I'm gonna close this since it was not reproducible by reporter anymore nor on my machine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants