More of the agencies might have matched if the full state name were replaced with the two-letter state abbreviation ("Massachusetts" → "MA").
We should try to assign a record_type. For the use of force CSV, some of them were Use of Force Reports and some of them were Policies & Contracts. I set everything with "policy" or "policies" in the name to Policies & Contracts and everything else to Use of Force Reports.
Because policies are general and may apply to many record types, my idea is that we check for the presence of "policy" / "policies", then fall back to the searched-for type of record.
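A minimal sketch of that check-then-fall-back rule, in Python. The function name `assign_record_type` and its parameters are hypothetical and not taken from the scraper; the only things drawn from the description above are the "policy" / "policies" match and the two record type labels.

```python
import re

# Matches "policy" or "policies" as whole words, case-insensitively.
# "police" does not match, since it ends in neither "y" nor "ies".
_POLICY_PATTERN = re.compile(r"\bpolic(?:y|ies)\b", re.IGNORECASE)


def assign_record_type(request_title: str, searched_record_type: str) -> str:
    """Return a record_type for one request (hypothetical helper).

    Policies can apply to many record types, so the policy check runs
    first; anything else falls back to the type that was searched for.
    """
    if _POLICY_PATTERN.search(request_title):
        return "Policies & Contracts"
    return searched_record_type
```

For example, a title containing "Use of Force Policy" would be labeled Policies & Contracts, while any other result of a use of force search would keep Use of Force Reports.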
I added a generic description "A completed MuckRock records request." to all of them, but we can probably do slightly better. These can be brief and aim to just sum up.
For speed and efficiency of API usage, we could also save a log of the URLs we've collected in the past in the repo.
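One way the URL log could work, as a rough sketch: keep a JSON file of previously collected URLs in the repo and skip anything already in it. The helper names and the idea of storing the log as a sorted JSON list are assumptions for illustration, not something the scraper does today.

```python
import json
from pathlib import Path
from typing import Iterable, List, Set


def load_seen_urls(log_path: Path) -> Set[str]:
    """Read the set of previously collected URLs; empty if no log exists yet."""
    if not log_path.exists():
        return set()
    return set(json.loads(log_path.read_text(encoding="utf-8")))


def filter_new_urls(urls: Iterable[str], seen: Set[str]) -> List[str]:
    """Keep only URLs that are not in the log, preserving input order."""
    return [url for url in urls if url not in seen]


def save_seen_urls(log_path: Path, seen: Set[str]) -> None:
    """Write the log sorted, so diffs in the repo stay small and readable."""
    log_path.write_text(json.dumps(sorted(seen), indent=2), encoding="utf-8")
```

A run would load the log, filter the search results through `filter_new_urls` before making further API calls, then add the new URLs to the set and save it.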
We could also include an optional property of submitter_contact_info which contains an email; that way people get credit for their work! If we use automation in the future, we can use something like [email protected] so we have that kind of clue.
The CSV could have a simpler schema:
name
agency_described
record_type
description
source_url
readme_url
agency_supplied
supplying_entity
agency_originated
originating_agency
access_type
data_portal_type
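To make the proposed schema concrete, here is a small sketch that writes rows with exactly these columns using Python's standard `csv` module. The `write_rows` helper is hypothetical, and leaving unknown fields blank is an assumption about how missing values would be handled.

```python
import csv
from typing import Dict, Iterable, TextIO

# Column order follows the schema listed above.
FIELDNAMES = [
    "name",
    "agency_described",
    "record_type",
    "description",
    "source_url",
    "readme_url",
    "agency_supplied",
    "supplying_entity",
    "agency_originated",
    "originating_agency",
    "access_type",
    "data_portal_type",
]


def write_rows(rows: Iterable[Dict[str, str]], out: TextIO) -> None:
    """Write a header plus one line per row (hypothetical helper).

    Missing fields are written as empty strings; a key outside the
    schema raises ValueError, which catches typos in column names.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, restval="", extrasaction="raise")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
```

When writing to a real file, open it with `newline=""` as the `csv` documentation recommends, so that line endings are not doubled on Windows.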
My first priority with this is to do some refactoring, both to clean up the scraper and to give myself a better understanding of the ins and outs of the logic.