Version 1.0
adir-intsights committed Oct 8, 2020
1 parent c13810d commit 30f39d0
Showing 7 changed files with 500 additions and 452 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
__pycache__
fast_elasticsearch_reindex.log
9 changes: 9 additions & 0 deletions Dockerfile
@@ -0,0 +1,9 @@
FROM python:3.8-slim

WORKDIR /app

COPY setup.py .
COPY fast_elasticsearch_reindex fast_elasticsearch_reindex
RUN pip install .

ENTRYPOINT ["python", "-m", "fast_elasticsearch_reindex"]
54 changes: 54 additions & 0 deletions README.md
@@ -0,0 +1,54 @@
# Fast Elasticsearch Reindex

A Python-based alternative to the Elasticsearch Reindex API with multiprocessing
support. Because the Reindex API doesn't support slicing when reindexing from a
remote cluster, a remote reindex can take many hours or even days, depending on
the cluster size. This utility is built on [Sliced
Scroll](https://www.elastic.co/guide/en/elasticsearch/reference/master/paginate-search-results.html#slice-scroll)
and [Bulk
requests](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html)
and can be used as a faster alternative.
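
As a rough illustration of how Sliced Scroll and Bulk requests fit together, the
sketch below builds one search body per worker and converts search hits into
bulk index actions. It is not the code from this repository: the function names
`build_slice_bodies` and `hits_to_bulk_actions` are hypothetical, and treating
the query as a full search body is an assumption. Only the `slice` clause
(`id`, `max`, optional `field`) and the action shape accepted by
`elasticsearch.helpers.bulk` come from Elasticsearch itself.

```python
from copy import deepcopy


def build_slice_bodies(search_body, workers, size=2000, slice_field=None):
    """Return one search body per worker, each limited to its own slice.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    bodies = []
    for slice_id in range(workers):
        body = deepcopy(search_body)
        body["size"] = size
        # Elasticsearch requires slice "max" to be greater than 1,
        # so a single worker scrolls without a slice clause.
        if workers > 1:
            body["slice"] = {"id": slice_id, "max": workers}
            if slice_field is not None:
                body["slice"]["field"] = slice_field
        bodies.append(body)
    return bodies


def hits_to_bulk_actions(hits):
    """Yield actions in the dict shape that elasticsearch.helpers.bulk accepts."""
    for hit in hits:
        yield {
            "_op_type": "index",
            "_index": hit["_index"],
            "_id": hit["_id"],
            "_source": hit["_source"],
        }
```

Each body would be scrolled by its own process against the source cluster, and
the resulting actions sent to the destination cluster in bulk, so the slices
are copied in parallel rather than one after another.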

![fast_elasticsearch_reindex](https://i.ibb.co/Z6QKybN/fast-elasticsearch-reindex.gif)

## Usage

```
usage: fast_elasticsearch_reindex [-h] [--src-hosts [SRC_HOSTS [SRC_HOSTS ...]]] [--dest-hosts [DEST_HOSTS [DEST_HOSTS ...]]]
                                  [--query QUERY] [--workers WORKERS] [--size SIZE] [--scroll SCROLL] [--slice-field SLICE_FIELD]
                                  [--indices [INDICES [INDICES ...]]]

optional arguments:
  -h, --help            show this help message and exit
  --src-hosts [SRC_HOSTS [SRC_HOSTS ...]]
                        Source Elasticsearch hosts to reindex from (default: ['127.0.0.1:9200'])
  --dest-hosts [DEST_HOSTS [DEST_HOSTS ...]]
                        Destination Elasticsearch hosts to reindex to (default: ['127.0.0.1:9201'])
  --query QUERY         Search query (default: {})
  --workers WORKERS     Number of parallel workers (default: 8)
  --size SIZE           Search request size (default: 2000)
  --scroll SCROLL       Scroll request duration (default: 5m)
  --slice-field SLICE_FIELD
                        Field to slice by (default: None)
  --indices [INDICES [INDICES ...]]
                        Indices to reindex (default: None)
```
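
The help text above is consistent with an `argparse` parser along the following
lines. This is a reconstruction from the help output only, not the project's
source: the function name `build_parser`, the `int` types, and parsing
`--query` as JSON are assumptions.

```python
import argparse
import json


def build_parser():
    """Build a parser matching the documented flags (reconstruction, not project code)."""
    parser = argparse.ArgumentParser(
        prog="fast_elasticsearch_reindex",
        # Appends "(default: ...)" to each help string, as in the usage text.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("--src-hosts", nargs="*", default=["127.0.0.1:9200"],
                        help="Source Elasticsearch hosts to reindex from")
    parser.add_argument("--dest-hosts", nargs="*", default=["127.0.0.1:9201"],
                        help="Destination Elasticsearch hosts to reindex to")
    # Assumption: the query is passed as a JSON string and decoded to a dict.
    parser.add_argument("--query", type=json.loads, default="{}",
                        help="Search query")
    parser.add_argument("--workers", type=int, default=8,
                        help="Number of parallel workers")
    parser.add_argument("--size", type=int, default=2000,
                        help="Search request size")
    parser.add_argument("--scroll", default="5m",
                        help="Scroll request duration")
    parser.add_argument("--slice-field", default=None,
                        help="Field to slice by")
    parser.add_argument("--indices", nargs="*", default=None,
                        help="Indices to reindex")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Under these assumptions, `--src-hosts`, `--dest-hosts`, and `--indices` each
accept zero or more space-separated values, and omitting `--indices` leaves it
as `None`.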

## Installation

Pip:
```
pip install fast_elasticsearch_reindex
```

Local:
```
$ pip install .
$ python -m fast_elasticsearch_reindex --help
```

Docker:
```
$ docker build -t fast_elasticsearch_reindex .
$ docker run fast_elasticsearch_reindex --help
```