melk

Corpus Gathering and Analysis Framework

Project Description

Project Melk is a tool that allows digital humanities instructors and students without significant technical backgrounds to easily collect large datasets about specific research topics from social networks and other online media sources. The tool is pedagogical in nature: rather than attempting to provide fine-grained control over every aspect of data collection, it gives students a simple, approachable interface for collecting datasets with which they can explore digital text analysis methods.

The ultimate goal of this project is to make an important research method accessible to students and researchers without backgrounds in computer science. We hope it will enable professors to introduce their students to the novel insights made possible by computational analysis methods without requiring them to spend a prohibitive amount of time wrestling with the mechanics of data collection.

Supported Sources

- The New York Times
- Reddit
- Twitter
- Local datasets, including:
  - Billboard Top 100 Song Lyrics Archive
  - Poetry Foundation Archive
  - State of the Union Archive

Usage

Project Melk is primarily intended to be used via its web interface, which is currently under development.
