
Indexer sunspot_type #34

Open
wants to merge 10 commits into base: master
28 changes: 14 additions & 14 deletions README.md
@@ -695,10 +695,10 @@ in the query.
```ruby
# Search for all books with specified title and facets
# on the timestamp of reviews.
# An additional filter on children can be specified inside the on_child operator.
Sunspot.search(Book) do
fulltext 'awesome book title', fields: [:title]

json_facet :review_date, block_join: (on_child(Review) do
with(:review_date).greater_than(DateTime.parse('2015-01-01T00:00:00Z'))
end)
@@ -715,9 +715,9 @@ Faceting is performed on parents of the children found in the query.
# Perform faceting on the book category.
Sunspot.search(Review) do
with :author, 'yonik'

# An empty block means no additional filters: takes all parents
# of the selected children.
json_facet :category, block_join: on_parent(Book) {}
end
```
@@ -1114,12 +1114,12 @@ class Post < ActiveRecord::Base
end
```

As a result, all `Blogs` and `Posts` will be stored on a single shard. But
since other `Blogs` will generate other prefixes Solr will distribute them
evenly across the available shards.

If you have large collections that you want to use joins with and still want to
utilize sharding instead of storing everything on a single shard, it's also
possible to ensure only that a single `Blog` and its associated `Posts` are stored on
a single shard, while the collection as a whole is still distributed across
multiple shards. This works because Solr **can** do distributed joins across
@@ -1150,15 +1150,15 @@ class Post < ActiveRecord::Base
end
```

This way a single `Blog` and its `Posts` have the same ID prefix and will go
to a single shard.
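The routing behaviour described above can be sketched in plain Ruby. The `routed_id` helper below is hypothetical (not part of Sunspot's API); it only illustrates that Solr's compositeId router hashes the text before the `!` separator, so documents sharing a prefix land on the same shard.

```ruby
# Hypothetical helper illustrating Solr composite-ID routing.
# Solr hashes only the part before "!", so a Blog and all of
# its Posts (which reuse the Blog's id as prefix) share a shard.
def routed_id(blog_id, record_type, record_id)
  "#{blog_id}!#{record_type} #{record_id}"
end
```

With this scheme, `routed_id(7, 'Post', 42)` and `routed_id(7, 'Post', 43)` share the prefix `7!` and therefore the same shard, while posts of another blog may be routed elsewhere.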

*NOTE:* Solr developers also recommend adjusting the replication factor so that every
shard node contains replicas of all shards in the cluster. If you have 4 shards on
separate nodes, each of these nodes should have 4 replicas (one replica of each shard).

More information and usage examples can be found here:
https://lucene.apache.org/solr/guide/6_6/shards-and-indexing-data-in-solrcloud.html

### Highlighting

@@ -1268,7 +1268,7 @@ from 1984.
```ruby
search = Sunspot.search(Book) do
with(:pub_year).greater_than(1983)

# The :on parameter is needed here!
# It must match the type specified in :block_join
stats :stars, sort: :avg, on: Review do
@@ -1282,7 +1282,7 @@ end
Solr will execute the query, selecting all `Book`s with `pub_year` from 1984.

Then, it facets on the `author_name` values present in the `Review` documents
that are children of the `Book`s found.
In this case, we'll have just one facet.

Finally, it computes statistics on the generated facet.
@@ -1485,7 +1485,7 @@ contents in Solr.

Stored fields allow data to be retrieved without also hitting the
underlying database (usually an SQL server).
The `stored` option using docValues as stored is not the same as having the value
actually stored in the index: if you want to use highlighting, more-like-this
queries, or atomic updates, remember to change the schema.xml accordingly.
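For example, a field that must support highlighting, more-like-this, or atomic updates would be declared with real storage rather than relying on docValues-as-stored (the field name and type below are illustrative):

```xml
<!-- Illustrative schema.xml declaration: stored="true" keeps the raw
     value in the index, which highlighting, MoreLikeThis, and atomic
     updates rely on; docValues alone is not equivalent. -->
<field name="title_text" type="text_general" indexed="true" stored="true"/>
```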

Stored fields (stored="true" in the schema) come at some performance cost in the Solr index, so use
@@ -1598,7 +1598,7 @@ Sunspot.session = Sunspot::SessionProxy::ThreadLocalSessionProxy.new

Within a Rails app, to ensure your `config/sunspot.yml` settings are properly set up in this session, you can use [Sunspot::Rails.build_session](http://sunspot.github.io/sunspot/docs/Sunspot/Rails.html#build_session-class_method) to mirror the normal Sunspot setup process:
```ruby
session = Sunspot::Rails.build_session Sunspot::Rails::Configuration.new
Sunspot.session = session
```

186 changes: 93 additions & 93 deletions sunspot/lib/sunspot/indexer.rb
@@ -1,7 +1,7 @@
require 'sunspot/batcher'

module Sunspot
#
# This class presents a service for adding, updating, and removing data
# from the Solr index. An Indexer instance is associated with a particular
# setup, and thus is capable of indexing instances of a certain class (and its
@@ -12,7 +12,7 @@ def initialize(connection)
@connection = connection
end

#
# Construct a representation of the model for indexing and send it to the
# connection for indexing
#
@@ -35,7 +35,7 @@ def add(model)
# updates<Hash>:: hash of updates where keys are model ids
# and values are hash with property name/values to be updated
#
def add_atomic_update(clazz, updates = {})
documents = updates.map { |id, m| prepare_atomic_update(clazz, id, m) }
add_batch_documents(documents)
end
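The shape of the documents this method ultimately produces can be sketched without Solr: each changed field is sent using Solr's atomic `set` operation alongside the document id. The helper below is a standalone illustration, not Sunspot code.

```ruby
# Minimal sketch of an atomic-update payload: every changed field
# is wrapped in Solr's {set: value} operation next to the id, so
# Solr updates only those fields instead of replacing the document.
def atomic_update_doc(id, updates = {})
  doc = { id: id }
  updates.each { |field, value| doc[field] = { set: value } }
  doc
end
```

For instance, `atomic_update_doc(1, title: 'New')` yields `{ id: 1, title: { set: 'New' } }`.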
@@ -93,109 +93,109 @@ def flush_batch

private

def batcher
  @batcher ||= Batcher.new
end

#
# Convert documents into hash of indexed properties
#
def prepare_full_update(model)
  document = document_for_full_update(model)
  setup = setup_for_object(model)
  if boost = setup.document_boost_for(model)
    document.attrs[:boost] = boost
  end
  setup.all_field_factories.each do |field_factory|
    field_factory.populate_document(document, model)
  end
  unless setup.child_field_factory.nil?
    setup.child_field_factory.populate_document(
      document,
      model,
      adapter: ->(child_model) { prepare_full_update(child_model) }
    )
  end
  document
end

def prepare_atomic_update(clazz, id, updates = {})
  document = document_for_atomic_update(clazz, id)
  setup = setup_for_class(clazz)
  # Child documents must be re-indexed with parent at each update,
  # otherwise Solr would discard them.
  unless setup.child_field_factory.nil?
    raise 'Objects with child documents can\'t perform atomic updates'
  end
  setup.all_field_factories.each do |field_factory|
    if updates.has_key?(field_factory.name)
      field_factory.populate_document(document, nil, value: updates[field_factory.name], update: :set)
    end
  end
  document
end

def add_documents(documents)
  @connection.add(documents)
end

def add_batch_documents(documents)
  if batcher.batching?
    batcher.concat(documents)
  else
    add_documents(documents)
  end
end

#
# All indexed documents index and store the +id+ and +type+ fields.
# These methods construct the document hash containing those key-value
# pairs.
#
def document_for_full_update(model)
  RSolr::Xml::Document.new(
    id: Adapters::InstanceAdapter.adapt(model).index_id,
    type: Util.superclasses_for(model.class).map(&:name)
  )
end

def document_for_atomic_update(clazz, id)
  if Adapters::InstanceAdapter.for(clazz)
    RSolr::Xml::Document.new(
      id: Adapters::InstanceAdapter.index_id_for(clazz.name, id),
      type: Util.superclasses_for(clazz).map(&:name)
    )
  end
end

#
# Get the Setup object for the given object's class.
#
# ==== Parameters
#
# object<Object>:: The object whose setup is to be retrieved
#
# ==== Returns
#
# Sunspot::Setup:: The setup for the object's class
#
def setup_for_object(object)
  setup_for_class(object.class)
end

#
# Get the Setup object for the given class.
#
# ==== Parameters
#
# clazz<Class>:: The class whose setup is to be retrieved
#
# ==== Returns
#
# Sunspot::Setup:: The setup for the class
#
def setup_for_class(clazz)
  Setup.for(clazz) || raise(NoSetupError, "Sunspot is not configured for #{clazz.inspect}")
end
end
end
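The batch-or-send-now behaviour of `add_batch_documents` can be sketched with a minimal stand-in for `Sunspot::Batcher`. The class below is an illustration of the pattern only, not the real implementation: while a batch is open, documents are collected; when the batch closes, everything buffered is flushed at once.

```ruby
# Minimal stand-in illustrating the batching pattern: nested batches
# are tracked by depth, documents accumulate while batching? is true,
# and end_current flushes and returns the buffered documents.
class TinyBatcher
  def initialize
    @depth = 0
    @buffer = []
  end

  def batching?
    @depth > 0
  end

  def start_new
    @depth += 1
  end

  def concat(docs)
    @buffer.concat(docs)
  end

  def end_current
    @depth -= 1
    @buffer.slice!(0..-1) # remove and return everything buffered
  end
end
```

A caller would check `batching?` and either `concat` into the open batch or send documents immediately, mirroring the branch in `add_batch_documents`.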