ruby client fix + documentation
scivey committed Oct 13, 2015
1 parent a725237 commit 6270c7b
Showing 3 changed files with 75 additions and 1 deletion.
2 changes: 1 addition & 1 deletion clients/ruby/client/lib/relevanced_client.rb
@@ -52,7 +52,7 @@ def create_document(text)
end

def create_document_with_id(document_id, text)
- @thrift_client.createDocumentWithId(
+ @thrift_client.createDocumentWithID(
document_id, text
)
end
73 changes: 73 additions & 0 deletions docs/examples/ruby-binary-classifier.md
@@ -0,0 +1,73 @@
## Binary Classifier (Ruby)

In this example, a small collection of Wikipedia articles is added to a relevanced server, and each article is assigned to the centroid `math`. Similarity scores against that centroid are then calculated for another math-related article and for an unrelated article on Richard Gere.

This example uses `open-uri` to fetch the Wikipedia pages and [nokogiri](http://www.nokogiri.org/) to parse them and extract their text. Note that `open-uri` is part of the Ruby standard library.

Also note that by far the slowest part of the script is fetching and parsing the Wikipedia pages.
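
Because fetching dominates the run time, repeated runs can be sped up by caching the extracted text on disk. The following is only a minimal sketch: `cached_fetch` and the cache directory are hypothetical names, not part of `relevanced_client`, and the caller supplies the actual fetching logic as a block.

```ruby
require 'digest/sha1'
require 'fileutils'

# Hypothetical helper (not part of relevanced_client).
# Returns the cached text for `url` if a cache file exists; otherwise
# calls the given block to fetch the text and stores the result on disk.
def cached_fetch(url, cache_dir)
  FileUtils.mkdir_p(cache_dir)
  # Hash the URL so it can be used safely as a file name.
  path = File.join(cache_dir, Digest::SHA1.hexdigest(url) + '.txt')
  return File.read(path) if File.exist?(path)

  text = yield(url)
  File.write(path, text)
  text
end
```

With a helper like this, a call such as `cached_fetch(url, 'wiki_cache') { |u| fetch_url(u) }` would hit the network only the first time a given URL is requested. The example script in this document does not cache anything and fetches every page on each run.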

```ruby
require 'relevanced_client'
require 'nokogiri'
require 'open-uri'

def run()
  client = RelevancedClient::Client.new('localhost', 8097)
  client.create_centroid('math')

  MATH_URLS.each do |url|
    text = fetch_url(url)
    client.create_document_with_id(url, text)
    client.add_document_to_centroid('math', url)
  end

  client.join_centroid('math')

  TEST_URLS.each do |url|
    text = fetch_url(url)
    similarity = client.get_text_similarity('math', text)
    puts "#{url} -> #{similarity}"
  end
end

def fetch_url(url)
  page = Nokogiri::HTML(URI.open(url))
  paragraphs = page.css('#mw-content-text > p').map { |x| x.text }
  paragraphs.join("\n")
end

MATH_URLS = [
  'https://en.wikipedia.org/wiki/Linear_programming',
  'https://en.wikipedia.org/wiki/Algorithm',
  'https://en.wikipedia.org/wiki/Scalar_multiplication',
  'https://en.wikipedia.org/wiki/Mathematical_structure',
  'https://en.wikipedia.org/wiki/Dot_product',
  'https://en.wikipedia.org/wiki/Analysis_of_algorithms',
  'https://en.wikipedia.org/wiki/Linear_algebra',
  'https://en.wikipedia.org/wiki/Criss-cross_algorithm',
  'https://en.wikipedia.org/wiki/Support_vector_machine',
  'https://en.wikipedia.org/wiki/Mathematics',
  'https://en.wikipedia.org/wiki/Rational_number',
  'https://en.wikipedia.org/wiki/Fraction_(mathematics)',
  'https://en.wikipedia.org/wiki/Square_root_of_2',
  'https://en.wikipedia.org/wiki/Mathematical_optimization',
  'https://en.wikipedia.org/wiki/Optimization_problem',
  'https://en.wikipedia.org/wiki/Candidate_solution',
  'https://en.wikipedia.org/wiki/Search_algorithm'
]

TEST_URLS = [
  'https://en.wikipedia.org/wiki/Matrix_multiplication',
  'https://en.wikipedia.org/wiki/Richard_Gere'
]

begin
  run()
end
```

As of October 12, 2015, the result was as follows (the article text, and therefore the scores, may have changed since):
```bash
https://en.wikipedia.org/wiki/Matrix_multiplication -> 0.4271200925196943
https://en.wikipedia.org/wiki/Richard_Gere -> 0.07472419631570816
```
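
The script prints raw similarity scores; turning them into a yes/no decision requires a cutoff. The sketch below is one simple way to do that. The helper `math_related?` and the threshold of `0.2` are assumptions made for illustration: the value is not recommended by relevanced and was picked only because it separates the two sample scores above.

```ruby
# Hypothetical cutoff, chosen only because it separates the two sample
# scores; a real threshold should be tuned on labeled data.
MATH_THRESHOLD = 0.2

# Hypothetical helper: true when a similarity score meets the cutoff.
def math_related?(similarity, threshold = MATH_THRESHOLD)
  similarity >= threshold
end

# The two scores reported above, keyed by URL.
scores = {
  'https://en.wikipedia.org/wiki/Matrix_multiplication' => 0.4271200925196943,
  'https://en.wikipedia.org/wiki/Richard_Gere' => 0.07472419631570816
}

scores.each do |url, similarity|
  label = math_related?(similarity) ? 'math' : 'not math'
  puts "#{url} -> #{label}"
end
```

In practice the cutoff would be chosen by scoring a set of documents with known labels and picking the value that best separates them.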
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -12,6 +12,7 @@ pages:
- Binary Classifier (Python): examples/python-binary-classifier.md
- Multiclass Classifier (Python): examples/python-multiclass-classifier.md
- Binary Classifier (Javascript): examples/javascript-binary-classifier.md
+ - Binary Classifier (Ruby): examples/ruby-binary-classifier.md
- Binary Classifier (Scala): examples/scala-binary-classifier.md
- Binary Classifier (Java): examples/java-binary-classifier.md
- Multiclass Classifier (Java): examples/java-multiclass-classifier.md