Machine-assisted identification of rhyme in Russian verse

David J. Birnbaum

Files and directories

Code

Notebooks

(Most recent first)

Extract and examine distances: Extract distance information from the clustered object and look for a cut-off
Explore linkage and distance: Exploration and evaluation of linkage methods and distance metrics
Samples notebook: dendrogram visualization of sample poems with different properties to explore behavior
Progress report 4 notebook: at last some machine learning (clustering)!
Progress report 3 notebook: decomposition into C(C) ~ V (replaces decomposition into syllable parts)
Syllable decomposition scratchpad: scratch space for decomposition into syllables and into C(C) ~ V
Progress report 2 notebook: decomposition into syllable parts (abandoned)
Progress report 1 notebook: data preparation, syllabification
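
As a rough illustration of what "looking for a cut-off" in the clustering distances can mean, the sketch below picks a threshold in the middle of the largest jump between successive merge distances. This is only one common heuristic and is not taken from the notebooks; the function name `largest_gap_cutoff` and the sample distances are hypothetical. In a SciPy linkage matrix, the merge distances would be the third column.

```python
def largest_gap_cutoff(merge_distances):
    """Return a threshold halfway across the largest jump in merge distances.

    Hypothetical helper for illustration; not part of this repository.
    """
    ordered = sorted(merge_distances)
    if len(ordered) < 2:
        raise ValueError("need at least two merge distances")
    # Pair each distance with its successor and record the size of the jump.
    gaps = [(later - earlier, earlier, later)
            for earlier, later in zip(ordered, ordered[1:])]
    # Keep the pair separated by the widest jump (first one wins on ties).
    _, low, high = max(gaps, key=lambda gap: gap[0])
    return (low + high) / 2


# Made-up merge distances for six lines of verse (five merges).
distances = [0.10, 0.15, 0.20, 0.90, 1.00]
cutoff = largest_gap_cutoff(distances)

# Cutting the tree at the threshold leaves one cluster per merge above it,
# plus one, assuming merge distances never decrease up the tree.
n_clusters = sum(d > cutoff for d in distances) + 1
print(cutoff, n_clusters)
```

A threshold chosen this way separates the tight, low-distance merges (candidate rhyme groups) from the late, high-distance merges that join unrelated lines. Whether that matches the behavior explored in the notebooks is an assumption.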

Library

cyr2phon.py: Cyrillic to phonetic library; exposes cyr2phon.transliterate()
utility.py: Utility functions; exposes utility.syllabify()
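
This README does not show what utility.syllabify() accepts or returns, so the sketch below does not call it. It is a deliberately naive stand-in that illustrates the general idea of syllabification: every syllable contains exactly one vowel, and word-final consonants attach to the last syllable. The name `naive_syllabify` and the rule it applies are assumptions made for illustration, not the library's algorithm.

```python
# Russian vowel letters; each one is treated as a syllable nucleus.
VOWELS = set("аеёиоуыэюя")


def naive_syllabify(word):
    """Split a Cyrillic word into open syllables (hypothetical toy rule)."""
    syllables = []
    current = ""
    for char in word.lower():
        current += char
        if char in VOWELS:
            # Close the syllable immediately after its vowel.
            syllables.append(current)
            current = ""
    if current:
        if syllables:
            # Attach trailing consonants to the final syllable.
            syllables[-1] += current
        else:
            # Vowelless word (for example, the preposition "в").
            syllables.append(current)
    return syllables


print(naive_syllabify("молоко"))
print(naive_syllabify("любовь"))
```

A real syllabifier has to decide how to divide medial consonant clusters, which this toy rule sidesteps by always assigning them to the following syllable.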

Development

test_transliterate: Nose tests for transliteration code
test_utility: Nose tests for utility code (syllabification)
.travis.yml and requirements.txt: Configuration information for Travis CI

Data samples

eo1.xml: First book of Aleksandr Puškin’s Eugene Onegin; see the Progress report for a description of the data source
gippius_neljubov.xml: Zinaida Gippius, Neljubov (1907)
brjusov_voron.xml: Valerij Brjusov’s translation of Edgar Allan Poe’s “The raven” (1924)

Documentation

project_plan.md: Original project proposal
degrees-of-rhyme.md: Examples of types of imperfect rhyme, with discussion
cluster-matching.md: Discussion of challenges in aligning heterosegmental consonant clusters
progress_report.md: Progress report (last updated 2019-02-17)
bibliography.md: Annotated bibliography

Acknowledgements

This project is a contribution to Meter, rhythm, and rhyme: the computationally assisted analysis of formal features in Russian poetry, co-developed by David J. Birnbaum and Elise Thorsen. The content here builds on collaborative work from that larger project.
Stanza markup in the Puškin corpus was implemented by Kyleen Pickering.
Thanks to Na-Rae Han, Jevon Heath, Daniel Zheng, and members of the University of Pittsburgh Spring 2019 Linguistics 1340 course for comments and suggestions.

Folders and files (260 commits)

.idea
dev
docs
images
tolstoj
.gitignore
.travis.yml
LICENSE.md
README.md
requirements.txt
russian_rhyme.xpr

Data-Science-for-Linguists-2019/russian_rhyme
