Can elasticgeo support sliced scroll-scan? #108

Open
itakyubi opened this issue Sep 17, 2019 · 4 comments

Comments

@itakyubi

No description provided.

@sjudeng
Contributor

sjudeng commented Sep 19, 2019

Hello. No, it's not currently supported.

@halfstein

Hi @sjudeng, I work with @johndeereguy and have some updates to ElasticFeatureReaderScroll and ElasticFeatureSource I'd like to prepare for a pull request. It adds support for a true ES Scroll so seems like it could apply to this issue (although not sliced). I have the changes in our version split from 2.12.2.

Would this be helpful? And if so, which branch should I work to initially add changes to?

Here's javadoc added to ElasticFeatureSource:

For large feature sets, this source can either scroll or page the results. Although the {@link ElasticFeatureReaderScroll} uses an ES scroll either way, scrolling and paging are very different operations.

Scrolled results are automatically returned to a single client request when the Query selects more results than the {@link ElasticDataStore#getScrollSize()}. This allows getting more records from ES than the default 10k limit but is still capped by the layer per-request max feature limit.
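As a rough illustration of the scrolled case described above, the following sketch works out whether a request needs a scroll and how many batches it would read. The class `ScrollPlan` and its parameters are hypothetical and not part of elasticgeo; it assumes the number of matching hits is known up front and that the per-request max feature limit is applied before the scroll size is compared.

```java
/** Hypothetical helper (not an elasticgeo class): sizes a single scrolled request. */
final class ScrollPlan {
    final boolean useScroll;      // true when one ES request cannot hold the result
    final long featuresToReturn;  // hits after applying the layer's max feature cap
    final long batches;           // number of scroll batches that would be read

    private ScrollPlan(boolean useScroll, long featuresToReturn, long batches) {
        this.useScroll = useScroll;
        this.featuresToReturn = featuresToReturn;
        this.batches = batches;
    }

    /**
     * @param matchingHits hits the query selects (assumed known in advance)
     * @param scrollSize   batch size, e.g. the value of getScrollSize()
     * @param maxFeatures  the layer's per-request max feature limit
     */
    static ScrollPlan of(long matchingHits, int scrollSize, long maxFeatures) {
        if (scrollSize <= 0) {
            throw new IllegalArgumentException("scrollSize must be positive");
        }
        // The per-request cap still applies, even when scrolling.
        long total = Math.max(0, Math.min(matchingHits, maxFeatures));
        boolean scroll = total > scrollSize;
        long batches = (total + scrollSize - 1) / scrollSize; // ceiling division
        return new ScrollPlan(scroll, total, batches);
    }
}
```

For example, `ScrollPlan.of(25_000, 10_000, 50_000)` would scroll through three batches, while the same query against a layer capped at 20,000 features would stop after two.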

Paged results are returned to a sequence of client requests, activated by using the WFS 2.0 STARTINDEX and COUNT parameters. This allows getting more results from a layer than the per-request max features limit (in COUNT size chunks). The COUNT provided by the client overrides the scroll size, so it is the client's responsibility to use a size below 10k (or the ES limit). Also, paged results must begin with STARTINDEX=0 and must advance through STARTINDEX values in constant COUNT intervals. That is, the pages cannot be reversed, skipped or randomly accessed, and the page size cannot be changed.
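The paging constraints above (start at STARTINDEX=0, advance by a constant COUNT, no reversing or skipping) can be sketched as a small state check. `PageSequenceTracker` is a hypothetical name used only for illustration; the actual changes may enforce the sequence differently.

```java
/** Hypothetical helper: tracks the only STARTINDEX a forward-only scroll can serve next. */
final class PageSequenceTracker {
    private int expectedStartIndex = 0; // paging must begin at STARTINDEX=0
    private int pageSize = -1;          // fixed by the COUNT of the first accepted request

    /** Returns true and advances if (startIndex, count) is the next page; false otherwise. */
    synchronized boolean accept(int startIndex, int count) {
        if (count <= 0) {
            return false;
        }
        if (pageSize < 0) {
            if (startIndex != 0) {
                return false;           // first page must start at 0
            }
            pageSize = count;           // later pages must reuse this COUNT
        }
        if (count != pageSize || startIndex != expectedStartIndex) {
            return false;               // reversed, skipped, or resized page
        }
        expectedStartIndex += pageSize;
        return true;
    }
}
```

With a tracker like this, a client requesting STARTINDEX=0, 100, 200 with COUNT=100 is accepted at each step, while a jump to STARTINDEX=300 or a change to COUNT=50 is rejected without advancing the expected position.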

NOTE: To support paged results, the {@link ElasticFeatureReaderScroll} is cached in the user's {@link HttpSession}. A {@link CompletableFuture} is also created and asynchronously run to close that scroll if the client does not read subsequent pages.
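One way the timeout described in this note could work is sketched below: a wrapper that closes the cached scroll unless the client reads another page within an idle window. `ExpiringScroll`, `touch()` and `idleMillis` are hypothetical names, the `AutoCloseable` stands in for the cached reader, and the sketch assumes Java 9+ for `CompletableFuture.delayedExecutor`. It is not the code from the proposed pull request.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical wrapper: closes a cached scroll if no page is read within idleMillis. */
final class ExpiringScroll implements AutoCloseable {
    private final AutoCloseable scroll;  // stand-in for the cached scroll reader
    private final long idleMillis;
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private CompletableFuture<Void> pendingClose;

    private ExpiringScroll(AutoCloseable scroll, long idleMillis) {
        this.scroll = scroll;
        this.idleMillis = idleMillis;
    }

    /** Wraps the scroll and arms the first idle timer. */
    static ExpiringScroll start(AutoCloseable scroll, long idleMillis) {
        ExpiringScroll s = new ExpiringScroll(scroll, idleMillis);
        s.touch();
        return s;
    }

    /** Call on every page read: cancels the old timer and arms a new one. */
    synchronized boolean touch() {
        if (closed.get()) {
            return false;                // the scroll already timed out or was closed
        }
        if (pendingClose != null) {
            pendingClose.cancel(false);  // a cancelled, not-yet-started task never runs
        }
        pendingClose = CompletableFuture.runAsync(this::close,
                CompletableFuture.delayedExecutor(idleMillis, TimeUnit.MILLISECONDS));
        return true;
    }

    synchronized CompletableFuture<Void> pendingClose() {
        return pendingClose;
    }

    boolean isClosed() {
        return closed.get();
    }

    /** Idempotent: the underlying scroll is closed at most once. */
    @Override
    public void close() {
        if (closed.compareAndSet(false, true)) {
            try {
                scroll.close();
            } catch (Exception e) {
                // Best effort: nothing useful can be done if the close fails.
            }
        }
    }
}
```

In a servlet container, an object like this would be stored as a session attribute and `touch()` called for each page request. As the following note points out, neither the scroll nor a timer of this kind would survive session migration between GeoServer instances.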

NOTE: This gives the impression that we could work in a multi-GeoServer environment, but almost certainly the scroll and the timeout would not serialize and migrate with the session.

@sjudeng
Contributor

sjudeng commented Mar 5, 2020

@halfstein I think this would be a great feature to add to the project. The documentation is great to see as well. Thanks for taking the time to contribute this back.

You can target the master branch in this project. There's also a branch in my separate GeoTools fork here but I'd like to keep this project maintained until that's all merged and released. You're welcome to separately open a PR against the GeoTools branch and maybe we can get it included as part of that merge but I can also handle that later myself if that's inconvenient (the structure is very different).

Thanks again for reaching out and for your work on this feature.

@halfstein

@sjudeng great ... I'll start working on it.

halfstein pushed a commit to halfstein/elasticgeo that referenced this issue Mar 13, 2020
Also provides results greater than the ES max hits (default 10k), and
perhaps address some of the enhancement request in issue ngageoint#108.

3 participants