A tool to statistically differentiate between plaintexts and random strings.

MainframeKnight/ChiFrequency

ChiFrequency

The goals of this project are:

  • to design and implement a tool for statistical analysis of plaintexts;
  • to give some examples of its usage in cryptography (currently the only supported cipher is the Caesar cipher);
  • to design a client-server model using Go and JavaScript to create a user-friendly web experience.
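
As an illustration of the Caesar cipher use case, the following minimal Go sketch tries all 26 shifts and keeps the candidate with the lowest score. The names `caesarShift` and `bestShift` are hypothetical and are not taken from the project's code, and the `score` parameter stands for any "lower is more plaintext-like" measure, such as a chi-squared statistic. The `toyScore` in `main` is only a stand-in for the demo.

```go
package main

import (
	"fmt"
	"strings"
)

// caesarShift rotates ASCII letters by shift positions (negative shifts
// rotate backwards) and leaves every other byte unchanged.
func caesarShift(text string, shift int) string {
	shift = ((shift % 26) + 26) % 26 // normalise to 0..25
	out := []byte(text)
	for i, c := range out {
		switch {
		case c >= 'a' && c <= 'z':
			out[i] = 'a' + (c-'a'+byte(shift))%26
		case c >= 'A' && c <= 'Z':
			out[i] = 'A' + (c-'A'+byte(shift))%26
		}
	}
	return string(out)
}

// bestShift undoes every possible shift and returns the shift and the
// candidate plaintext with the lowest score. Ties keep the smaller shift.
func bestShift(ciphertext string, score func(string) float64) (int, string) {
	best, bestText := 0, ciphertext
	bestScore := score(ciphertext)
	for s := 1; s < 26; s++ {
		candidate := caesarShift(ciphertext, -s)
		if sc := score(candidate); sc < bestScore {
			best, bestText, bestScore = s, candidate, sc
		}
	}
	return best, bestText
}

func main() {
	// Stand-in score for the demo only: counts characters that are not
	// among a few common English letters. A real tool would plug in a
	// statistical test here instead.
	toyScore := func(s string) float64 {
		n := 0
		for _, c := range strings.ToLower(s) {
			if !strings.ContainsRune("etaoinshr ", c) {
				n++
			}
		}
		return float64(n)
	}
	ciphertext := caesarShift("this is a test", 7)
	shift, plain := bestShift(ciphertext, toyScore)
	fmt.Println(shift, plain)
}
```

Passing the score as a function keeps the brute-force loop independent of the statistical test that ranks the candidates.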

The main statistical test is the chi-squared test on bigrams: each pair of adjacent characters is treated as a group. Spaces in the ciphertext are supported as well. Because an enciphered message rarely preserves its spaces, a small correction is applied when the ciphertext contains none. English text has roughly 1 space per 4.7 letters (the code approximates this as 1 to 5, so that the correction is applied slightly less often), and for each estimated space the largest term of the chi-squared statistic, which corresponds to a rare bigram, is removed from the result. For example, in the string "amhungry" the missing space creates the highly unlikely bigram "mh", which would otherwise distort the analysis.

A better approach would be to reduce the count of the rarest bigram type by one for each modelled space, so this will probably be changed in the future.
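
The bigram test and the missing-space correction described above can be sketched in Go roughly as follows. This is a minimal sketch, not the project's actual code. The function name `chiSquareBigrams`, the `expected` table format (bigram to probability), and the `minProb` floor for bigrams absent from the table are all assumptions made for illustration. The probabilities in `main` are toy values, not real English frequencies.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

const (
	lettersPerSpace = 5    // the 1-to-5 approximation of 1 space per 4.7 letters
	minProb         = 1e-6 // assumed floor so unknown bigrams never divide by zero
)

// chiSquareBigrams sums (observed - expected)^2 / expected over all bigram
// groups. If text has no spaces, one space is assumed per lettersPerSpace
// characters and the largest term is dropped for each assumed space.
// The text is indexed by byte, so ASCII input is assumed.
func chiSquareBigrams(text string, expected map[string]float64) float64 {
	text = strings.ToLower(text)
	n := len(text) - 1 // number of overlapping bigrams
	if n < 1 {
		return 0
	}
	observed := make(map[string]int)
	for i := 0; i < n; i++ {
		observed[text[i:i+2]]++
	}

	// term computes one group's contribution to the statistic.
	term := func(count int, p float64) float64 {
		if p < minProb {
			p = minProb
		}
		e := p * float64(n)
		d := float64(count) - e
		return d * d / e
	}

	terms := make([]float64, 0, len(expected)+len(observed))
	for bg, p := range expected {
		terms = append(terms, term(observed[bg], p))
	}
	for bg, count := range observed {
		if _, known := expected[bg]; !known {
			terms = append(terms, term(count, 0))
		}
	}

	// Missing-space correction: drop the largest term per estimated space.
	if !strings.Contains(text, " ") {
		drop := len(text) / lettersPerSpace
		if drop > len(terms) {
			drop = len(terms)
		}
		sort.Sort(sort.Reverse(sort.Float64Slice(terms)))
		terms = terms[drop:]
	}

	sum := 0.0
	for _, t := range terms {
		sum += t
	}
	return sum
}

func main() {
	// Toy table for illustration only; a real table would be estimated
	// from a large English corpus.
	toy := map[string]float64{
		"am": 1.0 / 6, "hu": 1.0 / 6, "un": 1.0 / 6,
		"ng": 1.0 / 6, "gr": 1.0 / 6, "ry": 1.0 / 6,
	}
	fmt.Println(chiSquareBigrams("amhungry", toy))
}
```

In the "amhungry" example the bigram "mh" is not in the toy table, so its term is by far the largest and is the one removed by the correction.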
