Cowa/parallel-document-identification

Parallel document identification

A simple hapax-based method to identify parallel documents
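
As a rough illustration of the general idea (not the code from this repository), a hapax is a token that occurs exactly once in a document. Numbers, names and other rare tokens often survive translation unchanged, so two parallel documents tend to share many of them. The sketch below is a minimal version under stated assumptions: the tokenizer, the Jaccard score, and the function names `hapaxes`, `hapax_overlap` and `best_match` are all hypothetical choices made for this example.

```python
import re
from collections import Counter


def hapaxes(text):
    """Return the set of lowercase tokens that occur exactly once in text."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    return {tok for tok, n in counts.items() if n == 1}


def hapax_overlap(doc_a, doc_b):
    """Jaccard overlap between the hapax sets of two documents.

    Assumption: Jaccard is used here only for illustration; the report
    may score candidate pairs differently.
    """
    a, b = hapaxes(doc_a), hapaxes(doc_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(source, candidates):
    """Return the index of the candidate sharing the most hapaxes with source."""
    scores = [hapax_overlap(source, c) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


if __name__ == "__main__":
    fr = "Paris compte 2148000 habitants en 2020 selon Insee"
    en = [
        "Berlin had 3645000 inhabitants in 2019",
        "Paris had 2148000 inhabitants in 2020 according to Insee",
    ]
    print(best_match(fr, en))
```

In this toy example the French sentence pairs with the second English sentence, because they share the hapaxes `paris`, `2148000`, `2020` and `insee`.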

Report

The report (in French) can be found here.

Where is the data?

The data consisted of Wikipedia articles in French and English.
I was not allowed to publish it here, and it was too large anyway.
