scraper error? #5

Open
tschrist opened this issue May 24, 2021 · 10 comments

Comments

@tschrist

2021-05-24 09:32:27 [scrapy.core.scraper] ERROR: Spider error processing <GET https://www.goodreads.com/giveaway?sort=recently_listed&tab=recently_listed> (referer: https://www.goodreads.com/)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/scrapy/utils/defer.py", line 102, in iter_errback
    yield next(it)
  File "/usr/lib/python3/dist-packages/scrapy/core/spidermw.py", line 84, in evaluate_iterable
    for r in iterable:
  File "/usr/lib/python3/dist-packages/scrapy/spidermiddlewares/offsite.py", line 29, in process_spider_output
    for x in result:
  File "/usr/lib/python3/dist-packages/scrapy/core/spidermw.py", line 84, in evaluate_iterable
    for r in iterable:
  File "/usr/lib/python3/dist-packages/scrapy/spidermiddlewares/referer.py", line 339, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "/usr/lib/python3/dist-packages/scrapy/core/spidermw.py", line 84, in evaluate_iterable
    for r in iterable:
  File "/usr/lib/python3/dist-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/usr/lib/python3/dist-packages/scrapy/core/spidermw.py", line 84, in evaluate_iterable
    for r in iterable:
  File "/usr/lib/python3/dist-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "/root/Goodreads-Giveaway-BOT/goodreads/spiders/giveaway.py", line 106, in giveaway_pages
    pages_list.pop()
IndexError: pop from empty list

@DanielSmith1239

I'm having this issue as well.

@DanielSmith1239

I believe it's caused by Goodreads making some changes to their website, so the bot can no longer parse the pages.
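
For anyone hitting the same traceback: the IndexError comes from calling pop() on an empty list, which is what you'd expect if the page markup changed and nothing was scraped. Here is a minimal sketch of a guard, assuming the spider only needs the last entry taken off the list. `pop_last_or_none` is a hypothetical helper for illustration, not code from this repo or the fork.

```python
def pop_last_or_none(pages_list):
    """Pop and return the last item of pages_list, or None if it is empty.

    Hypothetical helper: it mirrors the failing `pages_list.pop()` call
    from the traceback, but does not raise when nothing was scraped.
    """
    if not pages_list:
        # Nothing was collected (for example, the selector no longer
        # matches the page), so there is nothing to pop.
        return None
    return pages_list.pop()


if __name__ == "__main__":
    pages = []
    if pop_last_or_none(pages) is None:
        print("no page links found; the page layout may have changed")
```

This only stops the crash. If the list is empty because the site changed, the selectors that fill it still need updating.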

@DanielSmith1239

@kaushikthedeveloper I just forked the repo and made some changes that fix the issue: https://github.com/DanielSmith1239/GoodreadsMaster

@kaushikthedeveloper
Owner

Hey folks. I'm happy to see this project is still running. I created it for personal use and for anyone else to try out, but since Goodreads stopped Giveaways across India (basically anywhere outside the USA), I stopped looking into it.

I'll check it out and give it another try to bring it up to date.

@DanielSmith1239, awesome fork, buddy. I'll take a look.

@danielkadosh10

@DanielSmith1239 I know this isn't your fork, but I couldn't find a way to comment on your fork. The bot is not working currently. Could you maybe fix it?

@DanielSmith1239

Sure, I’ll take a look. Thanks for letting me know. I have the bot automated, so I didn’t realize it wasn’t working. Do you have any info that might help me pinpoint the issue?

@danielkadosh10

@DanielSmith1239 Yes. I'm not that good at networking, but from what I could gather, their original sign-in page was https://goodreads.com/user/sign_in, and that's where you made the form request. Now you need to press "log in with email" on that page, and it takes you to another sign-in page.
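
In case it helps with pinpointing, here is a rough sketch of the extra step: given the HTML of the first sign-in page, find the link whose text mentions "email" and resolve it to the URL of the second sign-in page. It uses only the standard library. The class and function names are hypothetical, and the assumption that the button is a plain `<a>` tag with "email" in its text is a guess about the markup, not something I verified.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class EmailSignInLinkFinder(HTMLParser):
    """Records the href of the first <a> whose visible text mentions 'email'."""

    def __init__(self):
        super().__init__()
        self._current_href = None  # href of the <a> currently being read
        self._text_parts = []      # text collected inside that <a>
        self.found_href = None     # first matching href, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._text_parts).lower()
            if self.found_href is None and "email" in text:
                self.found_href = self._current_href
            self._current_href = None


def find_email_sign_in_url(page_url, html):
    """Return the absolute URL of the 'email' sign-in link, or None if absent."""
    finder = EmailSignInLinkFinder()
    finder.feed(html)
    finder.close()
    if finder.found_href is None:
        return None
    # Relative hrefs are resolved against the page they were found on.
    return urljoin(page_url, finder.found_href)
```

The spider would then request the returned URL and submit the login form found there, instead of posting straight to the first page. A None result means the guess about the markup is wrong.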

@DanielSmith1239

Yup, it should be all fixed now! That was the main issue, and they changed how the authentication worked behind the scenes (I think), so I had to add some cookie stuff. I learned a lot more about Scrapy today lol

@danielkadosh10

@DanielSmith1239 Cool, thank you. If you don't mind me asking, where did you learn how to really use Scrapy?

@DanielSmith1239

Uhhh, Google pretty much lol. I wouldn’t really say I know how to use it. I just kept googling “how to do x with Scrapy”, which is how I learn most of my coding stuff.

4 participants