GitHub - cedeno/rosetta-analysis: how many words does rosetta stone teach you

A simple project that parses the index files from Rosetta Stone's PDF "index" to determine how many words the course teaches.

The src/parse_index.lua script parses the vocab/rosetta_eng_n.txt files (where n is 1-5).

In the vocab directory, the rosetta_eng_n.txt files (where n is 1-5) contain the words as listed in the index on Rosetta Stone's site.

The vocab_eng_n.txt files (where n is 1-5) are the processed versions of the rosetta_eng_n.txt files after running them through src/parse_index.lua.

vocab_eng_full.txt is the concatenation of all the vocab_eng_n.txt files. vocab_eng_full_sort is the sorted, uniq'd version of that file.
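
The concatenate, sort, and uniq steps described above could be reproduced roughly as follows. This is a sketch, not the commands actually used to build the files: `build_vocab` is a hypothetical helper name, and it assumes each vocab_eng_n.txt holds one word per line.

```shell
# build_vocab DIR
# Hypothetical helper: rebuilds vocab_eng_full.txt and vocab_eng_full_sort
# inside DIR (default: vocab) and prints the number of distinct words.
# Assumption: each vocab_eng_n.txt contains one word per line.
build_vocab() {
    dir="${1:-vocab}"

    # Concatenate the per-level lists. The [1-5] glob matches only the
    # numbered files, so vocab_eng_full.txt is never read as input.
    cat "$dir"/vocab_eng_[1-5].txt > "$dir/vocab_eng_full.txt"

    # Sort and drop duplicate lines (same effect as `sort | uniq`).
    # LC_ALL=C makes the comparison byte-wise, hence case-sensitive.
    LC_ALL=C sort -u "$dir/vocab_eng_full.txt" > "$dir/vocab_eng_full_sort"

    # One word per line, so the line count is the distinct-word count.
    wc -l < "$dir/vocab_eng_full_sort"
}
```

Running `build_vocab vocab` from the repository root would print the total. Because the comparison is case-sensitive, "The" and "the" count as two words unless the lists are lowercased first.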
