Case Study: basic digital archive

Chris Beer edited this page Mar 25, 2015 · 3 revisions

In this scenario, a group wants to assemble and present photographs, primary documents, newspaper clippings, and oral and written narratives. They have some material already created in collection management software, but expect to add additional material directly into the application.

Getting started

To quickly evaluate Spotlight, they use Docker and the docker-spotlight bundle to spin up a new Spotlight application. The generated application is bare-bones and only supports item uploads with exhibit-specific fields.

Adding exhibit-specific fields

In addition to a file and a title, Spotlight offers three default fields out of the box for exhibit-specific items: description, attribution, and date.

The curator decides two additional fields would be useful for evaluating Spotlight. They create a field for "Narrative type" and make it a "controlled vocabulary" field. This allows its values to show up as facets on the home page and as part of the search experience.

The curator also adds a location field, leaving it as a free-form text field for now in the hope of later processing the data into machine-readable geographies. Finally, they add several Dublin Core-inspired fields to hold data from their existing management database.

Bulk loading as CSV

After creating the exhibit-specific fields, the curator downloads the basic CSV template. The template has columns for every field and a URL field to point at remote images.
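
As a rough illustration of the template's shape, the sketch below builds a header row and one data row with Ruby's standard CSV library. The column keys are the ones used by the conversion script in this case study; the real template may contain more columns, one per configured field. The helper name `rows_to_csv` and the sample values are hypothetical.

```ruby
require 'csv'

# Column keys taken from the conversion script in this case study.
# Assumption: the downloaded template may list more columns than these.
TEMPLATE_COLUMNS = %w[
  url
  full_title_tesim
  spotlight_upload_attribution_tesim
  spotlight_upload_date_tesim
  exhibit_default-exhibit_creator_ssim
  exhibit_default-exhibit_source_ssim
  exhibit_default-exhibit_type_ssim
].freeze

# Hypothetical helper: turn an array of hashes (keyed by column name)
# into CSV text with a header row. Missing keys become empty cells.
def rows_to_csv(rows, columns = TEMPLATE_COLUMNS)
  CSV.generate do |csv|
    csv << columns
    rows.each { |row| csv << columns.map { |c| row[c] } }
  end
end

# Made-up sample row pointing at a remote image.
sample = [{ 'url' => 'http://example.org/image.jpg', 'full_title_tesim' => 'Sample item' }]
puts rows_to_csv(sample)
```

Writing each row by column name, rather than relying on hash ordering, keeps the cells aligned with the header even when a row lacks some fields.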

The existing database provided an Atom feed with content for the objects. A Ruby developer quickly wrote a script to convert the feed into the CSV template format:

require 'csv'
require 'nokogiri'
require 'open-uri'

# Fetch the Atom feed exported by the existing collection database.
doc = URI.open('http://stanislausriver.org/items/browse?output=atom').read
ns = { atom: "http://www.w3.org/2005/Atom" }

# Build one hash per Atom entry, keyed by the CSV template's column names.
h = Nokogiri::XML(doc).xpath('//atom:entry', ns).map do |x|
  # Each entry's content is embedded HTML; map every div's <h3> label to its element text.
  content = Hash[Nokogiri::HTML(x.xpath("atom:content", ns).text).css('div').map do |d|
      [d.css('h3').text, d.css('.element-text').text]
  end]

  {
    full_title_tesim: x.xpath('atom:title', ns).text,
    url: x.xpath('atom:link[contains(@rel, "enclosure")]/@href', ns).text,
    spotlight_upload_attribution_tesim: content['Rights'],
    spotlight_upload_date_tesim: content['Date'],
    "exhibit_default-exhibit_creator_ssim" => content['Creator'],
    "exhibit_default-exhibit_source_ssim" => content['Source'],
    "exhibit_default-exhibit_type_ssim" => content['Type']
  }
end

# Emit a header row followed by one row per entry.
puts CSV.generate { |csv| csv << h.first.keys; h.each { |x| csv << x.values } }

After the CSV spreadsheet was uploaded, Spotlight processed the request in the background and created a new object for each row. Spotlight also downloaded the image at each URL and processed it to create derivative thumbnails (400px long edge, 100x100 square).
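
The two derivative sizes come down to simple arithmetic, sketched below in plain Ruby. This is not Spotlight's implementation, which the case study does not describe. The helper names are hypothetical, and treating the square thumbnail as a centered crop (rather than a padded fit) is an assumption.

```ruby
# Scale (width, height) so the longer edge equals long_edge pixels,
# keeping the aspect ratio. Returns [new_width, new_height].
def fit_long_edge(width, height, long_edge = 400)
  scale = long_edge.to_f / [width, height].max
  [(width * scale).round, (height * scale).round]
end

# Centered square region of the source image as [x, y, side].
# Assumption: the 100x100 square is cut from the center and then resized.
def center_square(width, height)
  side = [width, height].min
  [(width - side) / 2, (height - side) / 2, side]
end

p fit_long_edge(3000, 2000)  # => [400, 267]
p center_square(3000, 2000)  # => [500, 0, 2000]
```

For a 3000x2000 source, the long-edge derivative is 400x267, and the square derivative would be taken from a 2000x2000 region starting 500 pixels from the left edge.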

These items now show up in the search results and are displayed in an OpenSeadragon image viewer.

Single item upload

The curator also has additional items to load and uses the single-item form to add them to Spotlight.

Although the out-of-the-box form is sufficient for testing Spotlight, the curator is interested in customizing this form in the future, or automatically connecting the existing database with the Spotlight exhibit. The Spotlight wiki documents some of these Resource Scenarios.