Butter

A tf-idf JavaScript library

Purpose

This is a JavaScript library for finding the most frequently used words on a webpage using tf-idf. It was initially made for recognizing cooking ingredients on recipe websites; modify it as needed for use in other domains.
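
For readers unfamiliar with tf-idf, the idea can be sketched in a few lines. This is a minimal illustration only, not the code in `lib/tfidf.js`: the function names (`tokenize`, `tf`, `idf`, `topTerms`) and the sample documents are assumptions made up for this example, and it uses the plain `log(N / df)` form of idf.

```javascript
// Illustrative only: a minimal tf-idf, not Butter's actual implementation.
// All function names and sample data here are hypothetical.

// Split text into lowercase word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z]+/g) || [];
}

// Term frequency: occurrences of the term divided by the document's token count.
function tf(term, tokens) {
  if (tokens.length === 0) return 0;
  var count = tokens.filter(function (t) { return t === term; }).length;
  return count / tokens.length;
}

// Inverse document frequency: log(N / number of documents containing the term).
function idf(term, docs) {
  var containing = docs.filter(function (tokens) {
    return tokens.indexOf(term) !== -1;
  }).length;
  return containing === 0 ? 0 : Math.log(docs.length / containing);
}

// Rank the distinct terms of docs[index] by tf-idf, highest first.
function topTerms(docs, index, limit) {
  var tokens = docs[index];
  var seen = Object.create(null); // no inherited keys such as "constructor"
  var scored = [];
  tokens.forEach(function (term) {
    if (seen[term]) return;
    seen[term] = true;
    scored.push({ term: term, score: tf(term, tokens) * idf(term, docs) });
  });
  scored.sort(function (a, b) { return b.score - a.score; });
  return scored.slice(0, limit);
}

var docs = [
  "two cups flour and one cup sugar",
  "one cup milk and two eggs",
  "flour flour flour and butter"
].map(tokenize);

console.log(topTerms(docs, 2, 2));
```

In this sample, "flour" ranks first for the third document because it is repeated there, while "and" scores zero because it appears in every document. That is why a word that is merely common everywhere does not dominate the result.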

Requirements

Thanks

To Test this

Add this to the head section of your webpage (changing the library paths accordingly) to see how it works:

  <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>
  <script src="https://cdnjs.cloudflare.com/ajax/libs/underscore.js/1.3.3/underscore-min.js"></script>
  <script src="../lib/stopwords.js"></script>
  <script src="../lib/tfidf.js"></script>
  <script src="../lib/tokenize.js"></script>
  <script src="../lib/corpus_tools.js"></script>
  <script src="../lib/collections_tools.js"></script>
  <script src="../lib/stemmer-min.js"></script>
  <script src="../test_data/test_data.js"></script>
  <script>
    // Run once the DOM is ready.
    $(function() {
      var corpus = "";
      // if ($('li.ingredient.type').length > 0) {
      //   alert(getTextNodesIn('.ingredient.type').text());
      // }
      if ($('li.ingredient').length > 0) { // the page uses the recipes microformat
        var items = getTextNodesIn('li.ingredient').text();
        alert(items);
      } else { // no recipes microformat: scan the whole text
        corpus = getTextNodesIn('div').text();
        alert(analyze_web_text(corpus));
      }
    });
  </script>

TODO

  • Create a Greasemonkey script / Chrome extension

Resources

More info about recipes

This library doesn't need the recipes microformat to work, but if you would like more info, see Recipes Microformat.

Licence

MIT

Changelog

2013-03-01 Joyce Chan
  • initial release