This is a script that is meant to make the CSV reports generated by ArchivesSpace more readable in some scenarios.
The CSV reports generated by ArchivesSpace in most cases show linked records as a list of JSON objects. This looks something like the following.
[{"name":"Halpern, Elizabeth","function":"creator","role":null},{"name":"Halpern, Elizabeth","function":"subject","role":null}]
This tool creates a separate row for each JSON object in the list and makes a new column for each key in the JSON object. This should allow for increased sorting ability.
The following is an example of how this would look for one row of report data.
accession_number | record_title | deaccessions | names |
---|---|---|---|
*2010M-1 | Annotation by A. L. (Alfred Leslie) Rowse on clipping | [{"description":"It's gone, man.","notification_sent":"Yes","date":"2018-04-06","extent":"0.00 "}] | [{"name":"Halpern, Elizabeth","function":"creator","role":null},{"name":"Halpern, Elizabeth","function":"subject","role":null}] |
accession_number | record_title | deaccessions_date | deaccessions_notification_sent | deaccessions_extent | deaccessions_description | names_name | names_function | names_role |
---|---|---|---|---|---|---|---|---|
*2010M-1 | Annotation by A. L. (Alfred Leslie) Rowse on clipping | 4/6/2018 | Yes | 0 | It's gone, man. | Halpern, Elizabeth | creator | |
*2010M-1 | Annotation by A. L. (Alfred Leslie) Rowse on clipping | 4/6/2018 | Yes | 0 | It's gone, man. | Halpern, Elizabeth | subject |
By default, all columns with JSON data will be expanded. However, you can specify columns not to expand through the option -x
or --skip-column
. The need for this is explained below.
When there are multiple types of linked records on a row, this tool will generate the cross product (i.e. every possible combination of data). This means that you will have a lot of duplicate data if you use this tool for a report that has many linked records of different types and choose to expand all of them. For instance, if you had an accession that is linked to 2 deaccessions, 4 subjects, and 3 names, you would end up with 2x4x3=24 rows in your expanded CSV report. Its easy to see how this could get exessive quickly.
This is why this tool includes an option to specify columns not to expand. If, in the example above, you only really needed to be able to sort by the data for the linked subjects, you could choose not to expand the deaccessions and names columns so that you only ended up with 4 columns for that accession in your report.
With python installed, run python expand.py path/to/my_report.csv
.
This will generate path/to/my_report_expanded.csv
.
A setup file is also provided that can be used to generate an exe with py2exe with python setup.py py2exe
.
This exe can be run like expand path/to/my_report.csv
.