Geohash Grid Aggregation

A query may return far too many results to display each geo-point individually on a map. The geohash_grid aggregation buckets nearby geo-points together by calculating the geohash for each point at the level of precision that you define.

The result is a grid of cells, one cell per geohash, that can be displayed on a map. By changing the precision of the geohash, you can summarize information across the whole world, by country, or by city block.
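The bucketing described above can be sketched in a few lines of Python. This is a minimal illustration of the standard geohash encoding (interleaved longitude and latitude bits, five bits per base-32 character), not Elasticsearch's own implementation; the function names geohash_encode and bucket_points are hypothetical.

```python
from collections import Counter

# Geohash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision):
    """Encode a point as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0        # bits collected for the current character
    bits = 0         # how many bits are in `value`
    use_lon = True   # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            chars.append(BASE32[value])
            value = 0
            bits = 0
    return "".join(chars)


def bucket_points(points, precision):
    """Count (lat, lon) points per geohash cell; empty cells never appear."""
    return Counter(geohash_encode(lat, lon, precision) for lat, lon in points)
```

Because a shorter geohash is always a prefix of a longer one for the same point, lowering the precision merges neighboring cells into larger ones, which is what makes the grid zoomable.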

The aggregation is sparse: it returns only cells that contain documents. If your geohashes are too precise and too many buckets are generated, it returns, by default, only the 10,000 most populous cells, that is, those containing the most documents. However, it still needs to generate all the buckets in order to determine which 10,000 are the most populous. To control the number of buckets generated, do the following:

  1. Limit the result with a geo_bounding_box query.

  2. Choose an appropriate precision for the size of your bounding box.

GET /attractions/restaurant/_search
{
  "size" : 0,
  "query": {
    "constant_score": {
      "filter": {
        "geo_bounding_box": {
          "location": { (1)
            "top_left": {
              "lat":  40.8,
              "lon": -74.1
            },
            "bottom_right": {
              "lat":  40.4,
              "lon": -73.7
            }
          }
        }
      }
    }
  },
  "aggs": {
    "new_york": {
      "geohash_grid": { (2)
        "field":     "location",
        "precision": 5
      }
    }
  }
}
  1. The bounding box limits the scope of the search to the greater New York area.

  2. Geohashes of precision 5 are approximately 5 km x 5 km.

Geohashes with precision 5 measure about 25 km² each, so 10,000 cells at this precision would cover 250,000 km². The bounding box that we specified measures approximately 44 km x 33 km, or about 1,452 km², which is only a few dozen cells' worth of area, so we are well within safe limits; we definitely won’t create too many buckets in memory.
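The same check can be done in degrees rather than square kilometers, which avoids the fact that geohash cells get narrower (in kilometers) away from the equator. The sketch below counts how many grid cells a bounding box can touch at a given precision. It assumes the standard geohash bit layout (longitude takes the first bit) and a box that does not cross the antimeridian; the function names are hypothetical.

```python
import math


def geohash_cell_size_degrees(precision):
    """Return (height, width) of one geohash cell in degrees."""
    total_bits = 5 * precision
    lon_bits = (total_bits + 1) // 2  # longitude gets the extra bit
    lat_bits = total_bits // 2
    return 180.0 / (1 << lat_bits), 360.0 / (1 << lon_bits)


def max_cells_in_box(top, left, bottom, right, precision):
    """Count the grid cells that a bounding box overlaps.

    Assumes left <= right, i.e. the box does not cross the antimeridian.
    """
    cell_h, cell_w = geohash_cell_size_degrees(precision)
    first_col = math.floor((left + 180.0) / cell_w)
    last_col = math.floor((right + 180.0) / cell_w)
    first_row = math.floor((bottom + 90.0) / cell_h)
    last_row = math.floor((top + 90.0) / cell_h)
    return (last_col - first_col + 1) * (last_row - first_row + 1)


# Bounding box from the request: top_left (40.8, -74.1), bottom_right (40.4, -73.7).
cells = max_cells_in_box(40.8, -74.1, 40.4, -73.7, precision=5)
print(cells)  # 100
```

The count is somewhat higher than the area-based estimate because cells at this latitude are narrower than 5 km and the box edges cut through partial cells, but the conclusion is the same: far fewer than 10,000 buckets. With this sketch, precision 6 for the same box stays under 10,000 cells, while precision 7 exceeds it.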

The response from the preceding request looks like this:

...
"aggregations": {
  "new_york": {
     "buckets": [ (1)
        {
           "key": "dr5rs",
           "doc_count": 2
        },
        {
           "key": "dr5re",
           "doc_count": 1
        }
     ]
  }
}
...
  1. Each bucket contains the geohash as the key.

Again, we didn’t specify any sub-aggregations, so all we got back was the document count. We could have asked for popular restaurant types, average price, or other details.
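As a sketch of what such a request might look like, the following Python builds the same search body with two sub-aggregations nested inside the grid. The field names type and price are assumptions made for illustration (they do not appear in the original request), as are the function name and the aggregation names popular_types and avg_price; terms and avg are standard Elasticsearch aggregations.

```python
import json


def grid_with_sub_aggs(precision, type_field="type", price_field="price"):
    """Build a geohash_grid request body with per-cell sub-aggregations.

    `type_field` and `price_field` are hypothetical field names.
    """
    return {
        "size": 0,
        "query": {
            "constant_score": {
                "filter": {
                    "geo_bounding_box": {
                        "location": {
                            "top_left": {"lat": 40.8, "lon": -74.1},
                            "bottom_right": {"lat": 40.4, "lon": -73.7},
                        }
                    }
                }
            }
        },
        "aggs": {
            "new_york": {
                "geohash_grid": {"field": "location", "precision": precision},
                # Sub-aggregations are computed once per grid cell.
                "aggs": {
                    "popular_types": {"terms": {"field": type_field}},
                    "avg_price": {"avg": {"field": price_field}},
                },
            }
        },
    }


print(json.dumps(grid_with_sub_aggs(5), indent=2))
```

Each bucket in the response would then carry its own popular_types and avg_price results alongside key and doc_count.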

Tip

To plot these buckets on a map, you need a library that understands how to convert a geohash into the equivalent bounding box or central point. Libraries that perform this conversion exist in JavaScript and other languages, but you can also use information from the geo_bounds aggregation ([geo-bounds-agg]) to perform a similar job.
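For illustration, the conversion itself is short enough to sketch by hand. The Python below decodes a geohash into its bounding box and center point using the standard geohash bit layout; it is not taken from any particular library, and the function names geohash_bounds and geohash_center are hypothetical.

```python
# Geohash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_bounds(geohash):
    """Return the cell's bounding box in the top_left/bottom_right form."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    use_lon = True  # bits alternate, starting with longitude
    for char in geohash:
        value = BASE32.index(char)  # raises ValueError on an invalid character
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            if use_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            use_lon = not use_lon
    return {
        "top_left": {"lat": lat_hi, "lon": lon_lo},
        "bottom_right": {"lat": lat_lo, "lon": lon_hi},
    }


def geohash_center(geohash):
    """Return the center point of the cell as a lat/lon dict."""
    box = geohash_bounds(geohash)
    return {
        "lat": (box["top_left"]["lat"] + box["bottom_right"]["lat"]) / 2,
        "lon": (box["top_left"]["lon"] + box["bottom_right"]["lon"]) / 2,
    }


# One of the bucket keys from the response above.
print(geohash_bounds("dr5rs"))
print(geohash_center("dr5rs"))
```

Drawing one rectangle per bucket from geohash_bounds, shaded by doc_count, gives a simple heat-map style grid; geohash_center is enough if you only want one marker per cell.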