I tried starting a mapper job with the `DatastoreInputReader` on an entity kind that has about 30 million entities. I specified 30,000 shards, which causes the `/mapreduce/kickoffjob_callback` job to fail with a Datastore timeout.
It might be ridiculous to use a shard count that high, but I thought I'd share it anyway. Here's the full stack trace:
```
The datastore operation timed out, or the data was temporarily unavailable.
Traceback (most recent call last):
  File "/base/data/home/apps/e~appid/module:version/lib/mapreduce/base_handler.py", line 135, in post
    self.handle()
  File "/base/data/home/apps/e~appid/module:version/lib/mapreduce/handlers.py", line 1387, in handle
    readers, serialized_readers_entity = self._get_input_readers(state)
  File "/base/data/home/apps/e~appid/module:version/lib/mapreduce/handlers.py", line 1459, in _get_input_readers
    readers = input_reader_class.split_input(split_param)
  File "/base/data/home/apps/e~appid/module:version/lib/mapreduce/input_readers.py", line 722, in split_input
    return super(DatastoreInputReader, cls).split_input(mapper_spec)
  File "/base/data/home/apps/e~appid/module:version/lib/mapreduce/input_readers.py", line 355, in split_input
    query_spec.app, namespaces, shard_count, query_spec)
  File "/base/data/home/apps/e~appid/module:version/lib/mapreduce/input_readers.py", line 395, in _to_key_ranges_by_shard
    app)
  File "/base/data/home/apps/e~appid/module:version/lib/mapreduce/input_readers.py", line 464, in _split_ns_by_scatter
    random_keys = ds_query.Get(shard_count * oversampling_factor)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/datastore.py", line 1722, in Get
    return list(self.Run(limit=limit, offset=offset, **kwargs))
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/datastore/datastore_query.py", line 3321, in next
    next_batch = self.__batcher.next_batch(Batcher.AT_LEAST_OFFSET)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/datastore/datastore_query.py", line 3207, in next_batch
    batch = self.__next_batch.get_result()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 613, in get_result
    return self.__get_result_hook(self)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/datastore/datastore_query.py", line 2906, in __query_result_hook
    self._batch_shared.conn.check_rpc_success(rpc)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1371, in check_rpc_success
    raise _ToDatastoreError(err)
Timeout: The datastore operation timed out, or the data was temporarily unavailable.
```
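As far as I can tell from input_readers.py, the kickoff handler splits each namespace by sampling keys ordered on the reserved `__scatter__` property, and it fetches `shard_count` times an oversampling factor worth of keys in a single keys-only query. At 30,000 shards that is hundreds of thousands of keys in one `Get`, which is presumably what blows past the Datastore deadline. A rough, simplified sketch of that step (the function name and the default factor below are illustrative, not copied from the library):

```python
from google.appengine.api import datastore

def sample_scatter_keys(kind, shard_count, oversampling_factor=32):
    """Roughly what _split_ns_by_scatter does to pick shard boundaries.

    A single keys-only query asks for shard_count * oversampling_factor
    keys at once; with shard_count=30000 that is an enormous fetch, and
    it is the Get() call where the Timeout above is raised.
    """
    ds_query = datastore.Query(kind=kind, keys_only=True)
    # __scatter__ is a reserved property set on a random sample of entities,
    # so ordering by it returns roughly evenly distributed keys.
    ds_query.Order('__scatter__')
    return ds_query.Get(shard_count * oversampling_factor)
```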
Unless you plan on running 30,000 jobs in parallel, you don't need to shard it that much. I start getting timeouts like the one you mention at around 1000 shards.
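For what it's worth, this is roughly how I kick off the same kind of job with a more modest shard count (the job name, handler, and entity kind below are placeholders, not taken from your setup):

```python
from mapreduce import control

control.start_map(
    name="process-my-entities",  # placeholder job name
    handler_spec="myapp.mappers.process_entity",  # your map function
    reader_spec="mapreduce.input_readers.DatastoreInputReader",
    mapper_parameters={"entity_kind": "myapp.models.MyEntity"},
    shard_count=256,  # comfortably below the point where the split query times out
)
```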