Critical Perspectives in Cultural Data Analysis

University of Texas at Austin School of Information

Spring 2018, Mondays 3–6 p.m. UTA 1.210A

Instructor: Tanya Clement

Office hours: Mondays 1–3 p.m., UTA 5.558

Course Schedule

Week 1 | Week 2 | Week 3 | Week 4 | Week 5 | Week 6 | Week 7 | Week 8 | Week 9 | Week 10 | Week 11 | Week 12 | Week 13 | Week 14

Course Objectives

Prerequisites: advanced-level undergraduate or graduate coursework in the humanities; little or no programming experience preferred.

In the data, information, knowledge, wisdom (DIKW) hierarchy that circulates through Knowledge Management (KM) and Information Science (IS) discussions, data appears at the base of a pyramid of which wisdom is the pinnacle. In this schematic, data is “raw” and lacking in meaning, while information, the next higher level of the pyramid—just below knowledge and then wisdom—represents the presence of added links and relationships; information is higher up on the wisdom chain because it is data made meaningful. In the humanities, students are taught that data is not found in the “raw” but has rather been cooked all along, taken and constructed and seasoned according to our situated contexts including access issues (Where is the data?); media, format, and technology constraints (How is the data?); and perspectives (What is the data? Who is involved in and impacted by its creation and use?).

Learning to think critically about data as information means rejecting common illusions about data more generally, including its objectivity, impersonality, atemporality, and authorlessness. To teach students to think about information from this more critical perspective means first understanding how a culture tends to understand what is informative.

The aim of this course is to encourage students to generate high-quality scholarship that applies computational and quantitative methods to the study of cultural artifacts (text, image, sound) at significantly larger scales than traditional methods allow. The final research paper is expected to combine critical theory, computational methods, and grounding in a particular humanities field to craft novel, thought-provoking arguments in the humanities.

Towards these ends, this course takes on “data wrangling” in the context of humanist perspectives.

Learning goals:

  • Exploring the cultural implications of large-scale data analysis with cultural materials;

  • Writing using perspectives in critical data studies;

  • Gaining familiarity with scripting-style programming in Python and Unix-like systems, with an emphasis on critical perspectives on the use of freely available data sets in the humanities and on free and open source software; with techniques for collecting, transforming, and analyzing media and metadata available on the Web; with commonly used data models and their standard formats, including CSV, JSON, and XML; with text analysis techniques such as natural language processing (NLP), sentiment analysis, and machine learning classification; and with tools for analyzing cultural data via visualization and statistical tests.
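
As a small illustration of the three formats named in the last goal, the sketch below writes one sample record as CSV, JSON, and XML using only Python's standard library. The record, its field names, and the `record` element name are assumptions chosen for illustration; they are not taken from any course data set or tutorial.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# A sample record; the field names are illustrative assumptions.
record = {"title": "Tender Buttons", "author": "Gertrude Stein", "year": "1914"}


def to_csv(rec):
    """Return the record as CSV text: a header row, then one data row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rec), lineterminator="\n")
    writer.writeheader()
    writer.writerow(rec)
    return buffer.getvalue()


def to_json(rec):
    """Return the record as a JSON object string."""
    return json.dumps(rec)


def to_xml(rec):
    """Return the record as an XML string with one child element per field."""
    root = ET.Element("record")
    for key, value in rec.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


csv_text = to_csv(record)
json_text = to_json(record)
xml_text = to_xml(record)

print(csv_text)
print(json_text)
print(xml_text)
```

The same three fields appear in each output; what changes is how the format expresses structure (rows and columns, key-value pairs, or nested elements).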

Course Principles

  • Writing critically about data requires knowledge of data and data wrangling as much as it requires knowledge of the critical thinking and writing practices learned in cultural studies. While this course does not teach cultural studies per se, an understanding of and experience in humanities theory and research and the principles of cultural studies are essential for success in the course.

  • Imitating and modifying others’ code is essential in learning to program. You can find many examples and explanations on Stack Exchange and similar online forums. Taking one or two lines without attribution is OK; if you use a longer chunk of code found online, add a #comment with the source’s URL.

  • Begin assignments early. If you realize what you had in mind is more difficult than expected, talk to the instructor about choosing an alternative.

  • We’ll be focusing on a scripting approach to programming. This course is not oriented toward developing large, complex programs or writing perfectly optimized code.

  • Learning to code takes trial and error. Work through weekly programming tutorials before class and continue polishing in-class coding assignments at home.
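
The attribution practice described in the second principle above might look like the following sketch. The function is only an illustration, and the URL is a placeholder rather than a real source.

```python
from collections import Counter


# Adapted from an answer found online. The URL below is a placeholder
# used for illustration only; replace it with the actual source you used.
# https://example.com/questions/count-word-frequencies
def word_counts(text):
    """Count how often each lowercase, whitespace-separated word appears."""
    return Counter(text.lower().split())


counts = word_counts("a rose is a rose is a rose")
print(counts.most_common(2))
```

Placing the comment directly above the borrowed code keeps the credit attached to it even if the script is later reorganized.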

Course materials

There is one required text for this course:

Montfort, Nick. Exploratory Programming for the Arts and Humanities. Cambridge, MA: The MIT Press, 2016.

All other readings will either be available online and linked below or posted on Canvas.

Assignments


I. Cultural Data Analysis

Week 1 (1/22): Introduction to Cultural Data Analysis

Readings

  • danah boyd & Kate Crawford (2012) "Critical Questions for Big Data," Information, Communication & Society, 15:5, 662-679.
  • Piper, Andrew. "There will be Numbers." Journal of Cultural Analytics 1, no. 1 (May 23, 2016). http://culturalanalytics.org/2016/05/there-will-be-numbers/
  • Bod, Rens. "Introduction: the Quest for Principles and Patterns." A New History of the Humanities: The Search for Principles and Patterns from Antiquity to the Present. Oxford University Press, 2013, pp. 1 - 12. Note: You must be logged in as a UT student to retrieve this text: http://catalog.lib.utexas.edu/record=b8902003~S29

Week 2 (1/29): Provocations

Readings

Coding, writing, and exercises

For discussion

  1. Manovich, Lev. 2008. “The Next Big Thing in Humanities, Arts, and Social Science Computing: Cultural Analytics”. HPC Wire, July 29.
  2. Manovich, Lev. 2009. “Cultural Analytics: Visualizing Cultural Patterns in the Era of ‘More Media.’” Software Studies Initiative website.
  3. Hall, Gary. “Toward a Postdigital Humanities: Cultural Analytics and the Computational Turn to Data-Driven Scholarship.” American Literature 85, no. 4 (January 1, 2013): 781–809.

Optional

  • Kenner, Hugh. Sentences. Harvard Book Review No. 13/14 (Summer - Fall, 1989), pp. 3-4.
  • Posner, Miriam. “Humanities Data: A Necessary Contradiction.” Miriam Posner’s Blog, June 25, 2015. http://miriamposner.com/blog/humanities-data-a-necessary-contradiction
  • Gallinger, M. and Daniel Chudnov "Library of Congress Lab: Library of Congress Digital Scholars Lab Pilot Project Report."

Assignment

Discussion post


Week 3 (2/5): Programming

Readings

Coding, writing, and exercises

For discussion

Assignment

Discussion post


Week 4 (2/12): Data

Readings

Coding, writing, and exercises

For discussion

  • Borgman chp. 2 "What are Data?"
  • Krumme, Coco. “What Data Doesn’t Do.” In Beautiful Data: The Stories behind Elegant Data Solutions, edited by Toby Segaran and Jeff Hammerbacher, 1st ed. Beijing ; Sebastopol, CA: O’Reilly, 2009.
  • Rosenberg, Daniel. "Data Before the Fact." In "Raw Data" Is an Oxymoron, edited by Lisa Gitelman. Cambridge: MIT Press, 2013.

Optional

Assignment

Discussion post

  • OpenRefine

Week 5 (2/19): Data Scholarship

Readings

Coding, writing, and exercises

For discussion

  1. Winner, Langdon. “Do Artifacts Have Politics?” Daedalus 109, no. 1 (1980): 121–36.
  2. Joerges, B. “Do Politics Have Artefacts?” Social Studies of Science 29, no. 3 (June 1, 1999): 411–31.
  3. Sacasas, Michael. “Do Artifacts Have Ethics?” The Frailest Thing, November 29, 2014. http://thefrailestthing.com/2014/11/29/do-artifacts-have-ethics

Assignment

REQUIRED Discussion post, 4 points

Speaker: Maria Fernandez


Week 6 (2/26): Data Set Reviews

In class presentations of Data Set Reviews

Assignment

Data Set Review

  • Downloading with Wget
  • Fetching and Parsing Data from the Web with OpenRefine

II. Interpretive Framing with Data

Week 7 (3/5): Audience

Readings

Coding, writing, and exercises

For discussion

  • Borgman chp. 4 "Data Diversity"
  • Hitchcock, Tim. “Digital Searching and the Re-formulation of Historical Knowledge.” In The Virtual Representation of the Past, edited by Mark Greengrass and Lorna Hughes, 81–90. Ashgate, 2008.
  • Piper, A. Think Small: On Literary Modeling. PMLA, Volume 132, Number 3, May 2017, pp. 651–658.
  • Pound, Scott. “Kenneth Goldsmith and the Poetics of Information.” PMLA, vol. 130, no. 2, Mar. 2015, pp. 315–30.

Optional

Assignment

Discussion post

  • Getting Files
  • Revisiting the basics

SPRING BREAK (3/12)


Week 8 (3/19): Open Access

Readings

Coding, writing, and exercises

For discussion

  • Christen, Kim. “Does Information Really Want to be Free? Indigenous Knowledge Systems and the Question of Openness.” International Journal of Communication 6 (2012), 2870–2893.
  • Day, Ronald E. “Governing Expression: Social Big Data and Neoliberalism.” In Indexing It All: The Subject in the Age of Documentation, Information, and Data, 123–44. History and Foundations of Information Science. Cambridge, Massachusetts: The MIT Press, 2014.
  • Pomerantz, Jeffrey. “The Future of Metadata.” In Metadata. The MIT Press Essential Knowledge Series. Cambridge, MA ; London, England: The MIT Press, 2015.

Optional

  • Peters, Justin. The Idealist: Aaron Swartz and the Rise of Free Culture on the Internet, Chapters 7 and 8. New York: Scribner, 2016.
  • O’Sullivan, Michael. “Aaron Swartz, New Technologies, and the Myth of Open Access.” In Academic Barbarism, Universities and Inequality. Palgrave Critical University Studies. Houndmills, Basingstoke, Hampshire ; New York, NY: Palgrave Macmillan, 2016.

Assignment

Discussion post

Speaker: Maria Fernandez

  • CSV Input/Output in Python

Week 9 (3/26): Data Modeling

Readings

For discussion

  • Kreiss, D., M. Finn, and F. Turner. “The Limits of Peer Production: Some Reminders from Max Weber for the Network Society.” New Media & Society 13, no. 2 (March 1, 2011): 243–59.
  • Swartz, Aaron. “Building a Platform: Providing APIs.” In Aaron Swartz’s ‘A Programmable Web’: An Unfinished Work, 31–39. San Rafael, CA: Morgan & Claypool Publishers, 2013.
  • van Hooland, Seth, and Ruben Verborgh. “Modelling.” In Linked Data for Libraries, Archives and Museums: How to Clean, Link and Publish Your Metadata, 11–70. Chicago: Neal-Schuman, 2014.
  • Kelly, Chelsea Emelie. “Beyond Digital: Open Collections & Cultural Institutions,” 2014.

Optional

Assignment

Proposal due Friday, March 23 at 11:59pm; Peer reviews due by class March 26 at 3pm

  • Using the Google Books REST API
  • New York Times article scrape
  • Scraping and Parsing XML
  • Fetching and Parsing Data from the Web with OpenRefine, APIs

Week 10 (4/2): Theory

Readings

Coding, writing, and exercises

  • Montfort chp. 10 "Text III"
  • Final project directory: Booth chp. 4 "From Questions to a Problem"

For discussion

  • Conley, Tara L. "Decoding Black Feminist Hashtags as Becoming." The Black Scholar Vol. 47, Iss. 3, 2017.
  • Klein, Lauren. "Distant Reading after Moretti". MLA Conference, January 2018.
  • Ramsay, Stephen. “Chapter 1: An Algorithmic Criticism.” In Reading Machines: Toward an Algorithmic Criticism, 1–17. Topics in the Digital Humanities. Urbana: University of Illinois Press, 2011.
  • Underwood, T. "Theorizing Research Practices We Forgot to Theorize Twenty Years Ago". Representations, Vol. 127 No. 1, Summer 2014; (pp. 64-72).

Optional

Assignment

Discussion post

  • Unsupervised learning: Latent Dirichlet allocation (LDA) topic modeling
  • Supervised learning: Naive Bayes classification
  • Fetching and Parsing Data from the Web with OpenRefine, Advanced APIs

Week 11 (4/9): Methods

Readings

Coding, writing, and exercises

For discussion

Optional

  • Borgman chp. 7
  • Berendt, Bettina, Preibusch, Soren. Toward Accountable Discrimination-Aware Data Mining: The Importance of Keeping the Human in the Loop—and Under the Looking Glass. Big Data. Volume 5, Number 2, 2017.
  • Moretti, F. "Conjectures on World Literature." New Left Review 1, January-February 2000.

Assignment

Discussion post

Speaker: Maria Fernandez

  • Unsupervised learning with K-Means Clustering
  • Supervised learning with multiple classifiers: Naive Bayes, k-nearest neighbor, Logistic Regression, Support Vector Machine (SVM), Multi-layer perceptron classifier

Week 12 (4/16): Statistics and Visualization

Readings

Coding, writing, and exercises

  • Montfort chp. 11 “Statistics and Visualization.”
  • Brew, Chris. “Language Processing: Statistical Methods.” In Encyclopedia of Language & Linguistics, edited by Keith Brown, 2nd ed., 12:597–604. Elsevier, 2006.
  • Gries, Stefan. “Useful statistics for corpus linguistics.”
  • McCandless, David. Information is Beautiful.
  • Norvig, Peter. “Natural Language Corpus Data.” In Beautiful Data: The Stories Behind Elegant Data Solutions, edited by Toby Segaran and Jeff Hammerbacher, 1st ed. Beijing ; Sebastopol, CA: O’Reilly, 2009.

For discussion

Optional

  • Moretti, Franco. “Graphs.” In Graphs, Maps, Trees: Abstract Models for Literary History, 3–33. London ; New York: Verso, 2007.
  • Ramsay, Stephen. “Chapter 3: Potential Readings.” In Reading Machines: Toward an Algorithmic Criticism, 33–57. Topics in the Digital Humanities. Urbana: University of Illinois Press, 2011.

Assignment

Discussion post

  • Matplotlib
    • Simple Viz for Sentiment Analysis
    • Much more Viz
  • Tableau

Week 13 (4/23): Features

Readings

Coding, writing, and exercises

  • Kazil, Jacqueline, and Katharine Jarmul. “PDFs and Problem Solving in Python.” In Data Wrangling with Python: Tips and Tools to Make Your Life Easier, 91–126. O’Reilly, 2016.
  • Albon, Chris. “Parse JSON File.” http://chrisalbon.com/python/json_parse_file.html
  • Lundh, Fredrik. “Elements and Element Trees.” http://effbot.org/zone/element.htm [Python XML tutorial]
  • Beazley, David, and Brian K. Jones. “Chapter 6: Data Encoding and Processing.” In Python Cookbook: Recipes for Mastering Python 3, 3. ed., 175–216. Beijing: O’Reilly, 2013.

For discussion

  • Brown and Mandell, "The Identity Issue: An Introduction." Journal of Cultural Analytics 13 February 2018.
  • Hammond, Adam. "The double bind of validation: distant reading and the digital humanities' 'trough of disillusionment.'" Literature Compass 14, no. 8 (August 1, 2017): no. pg.
  • Witmore, Michael. 2016. “Latour, the Digital Humanities, and the Divided Kingdom of Knowledge.” New Literary History 47 (2): 353–75.

Have some fun!

Optional

Assignment

Discussion post


Week 14 (4/30): Final Presentations

Final Presentation due

5/7: Final Project due


Additional resources:

Programming tutorials

Installation Tutorials

Readings