WIP: Add Scraper/ScraperRun classes #7
base: master
First stab at factoring out a Scraper class and a ScraperRun class. The IndexToMembers class is probably overkill for now (we can find higher-level abstractions like that after we nail down the basics), but it's probably useful to include in this initial WIP, which is more for feedback than expected to actually merge yet.
407b363 to 598e59b
Part of everypolitician/everypolitician#572.
The |
This gives us the full stack trace and ensures that the program actually exits after an error.
This was trying to call `#keys` on an array, which was raising an error. For now just use the first item from the data array.
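A minimal sketch of the failure described here and the stopgap fix. The shape of `data` (an array of record hashes) and the field names are assumptions for illustration; they are not taken from the scraper itself.

```ruby
# Hypothetical scraped data: an array of record hashes.
data = [
  { name: 'Alice Example', area: 'North' },
  { name: 'Bob Example', area: 'South', start_date: '2015-01-01' },
]

# Array has no #keys method, so this is the kind of call that raises.
begin
  data.keys
rescue NoMethodError => e
  error_class = e.class
end

# Stopgap: take the field names from the first record only.
fields = data.first.keys
```

Note that this only sees fields present in the first record, so a field that appears only in later records (such as `start_date` in this made-up data) is missed.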
Not all records seem to have a start date, so removing this from the list of default index fields for now.
This doesn't need to be a public method as it's only used internally.
Rather than trying to find the correct generic abstraction, this forces each scraper to have a class that encapsulates navigating a website to get the correct content. In the future we can potentially wrap these up in `IndexToMembers`-type classes, but for now it seems easier to write scraper-specific ones until we find the correct abstraction.
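As a rough illustration of the idea, a scraper-specific navigation class might look something like the sketch below. Everything in it is hypothetical: the class name `ExampleSiteMembers`, the `fetcher` argument, the URLs, and the regex-based link extraction are stand-ins, not code from this PR.

```ruby
# Hypothetical scraper-specific class: it alone knows how to get from the
# index page to the member pages of one particular site.
class ExampleSiteMembers
  def initialize(index_url:, fetcher:)
    @index_url = index_url
    @fetcher = fetcher # anything responding to #call(url) and returning a page body
  end

  # Returns the body of every member page linked from the index.
  def member_pages
    member_urls.map { |url| @fetcher.call(url) }
  end

  private

  # Site-specific navigation detail, kept out of the public interface.
  def member_urls
    @fetcher.call(@index_url).scan(/href="([^"]+)"/).flatten
  end
end

# Stand-in for HTTP so the sketch runs without network access.
PAGES = {
  'http://example.test/' => '<a href="http://example.test/m/1">A</a> <a href="http://example.test/m/2">B</a>',
  'http://example.test/m/1' => 'Member one',
  'http://example.test/m/2' => 'Member two',
}.freeze

members = ExampleSiteMembers.new(index_url: 'http://example.test/', fetcher: PAGES.method(:fetch))
pages = members.member_pages
```

Injecting the fetcher keeps the navigation logic testable without hitting the real site, and the private `member_urls` helper shows the kind of method that only needs to be used internally.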
Thinking about this a little more, I fear that we got a key underlying concept back to front here. Here we say that a I now think this should all be the other way around: a |