AttributeError: Can't get attribute 'PythonSpider' on <module '__main__' (built-in)> #13

tmancini · 2021-09-12T13:40:44Z

Hey all, this is exactly what I was looking for, but I'm running into a few problems trying to test it out on Windows. Using the following, I get the error above:

import scrapy
from scrapyscript import Job, Processor

processor = Processor(settings=None)


class PythonSpider(scrapy.spiders.Spider):
    name = "myspider"

    def start_requests(self):
        yield scrapy.Request(self.url)

    def parse(self, response):
        data = response.xpath("//title/text()").extract_first()
        return {'title': data}


job = Job(PythonSpider, url="http://www.python.org")
results = processor.run(job)

print(results)

When I move the spider into a separate file and import it, the script seems to run without an error, but the results print as an empty list.

import scrapy
from scrapyscript import Job, Processor

from PythonSpider import PythonSpider

settings = scrapy.settings.Settings(values={'LOG_LEVEL': 'WARNING'})
processor = Processor(settings=settings)


job = Job(PythonSpider, url="http://www.python.org")
results = processor.run(job)

print(results)
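
A possible explanation for the first error, offered as an assumption rather than something confirmed in this thread: on Windows, worker processes are started fresh rather than forked, so a class handed to the worker is pickled by reference (module name plus class name) and has to be found again by import on the other side. A class defined in the launching script may not exist in the worker's `__main__`, whereas a class in its own file can be imported, which would fit the second run no longer raising. The sketch below uses only the standard library; `fake_main` and `try_load` are hypothetical names standing in for the parent script and the worker's unpickling step.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the parent script's __main__ module.
parent_main = types.ModuleType("fake_main")
sys.modules["fake_main"] = parent_main

# Define a class that claims to live in fake_main, as a spider
# defined in the launching script would live in __main__.
PythonSpider = type(
    "PythonSpider", (), {"name": "myspider", "__module__": "fake_main"}
)
parent_main.PythonSpider = PythonSpider

# Pickling a class stores only its module and name, not its body.
payload = pickle.dumps(PythonSpider)

# Simulate the worker: the module exists, but the class was never defined there.
del parent_main.PythonSpider


def try_load(data):
    """Return the AttributeError message raised while unpickling, or None."""
    try:
        pickle.loads(data)
    except AttributeError as exc:
        return str(exc)
    return None


error_message = try_load(payload)
print(error_message)
```

If this is the cause, keeping the spider in an importable module, as in the second snippet, is the usual way around it. It would not by itself explain the empty results.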

bsekiewicz · 2021-09-20T19:57:46Z

It seems that _item_scraped is never triggered, so the dispatcher hookup in Processor.__init__() has no effect. (???)

A temporary workaround is to move dispatcher.connect(self._item_scraped, signals.item_scraped) from __init__ to crawl in the Processor class, then comment out the p.terminate() line in run because of some billiard library (win32) issues.

In general, something seems to be wrong with this library on Windows :(
