
Spike: As tech lead, I need elastic re-indexing to be automated #2870

Closed · 1 of 8 tasks
ADPennington opened this issue Mar 1, 2024 · 8 comments · Fixed by #2881
Assignees: jtimpe
Labels: backend · database (For issues primarily related to schema changes) · dev · Parity (Work associated with TDP Parity) · spike

Comments


ADPennington commented Mar 1, 2024

Description:

As mentioned in #2820, python manage.py search_index --rebuild is needed to facilitate elastic re-indexing.

Currently this is a manual step, one that was needed to yield expected results during QASP review of parsing/validation tickets like #2825. It will also remain a manual step before releasing code to hhs:main and hhs:master.

This step would be better automated.

Acceptance Criteria:
Create a list of functional outcomes that must be achieved to complete this issue

  • Elasticsearch indexes are rebuilt/refreshed when new model changes are available
  • Testing Checklist has been run and all tests pass
  • README is updated, if necessary

Tasks:
Create a list of granular, specific work items that must be completed to deliver the desired outcomes of this issue

  • Run python manage.py search_index --rebuild with the --parallel flag over the weekend to gauge
    • how long the task will actually take in prod with the larger elasticsearch instance
    • if the --parallel flag makes a difference
    • if data can be added while indexing is happening
  • Adjust BulkIndexError exception handling - it should not roll back un-indexed records, but instead write them to Postgres and skip the indexing step (see the sketch after this list)
  • cron/beat task to reindex the entire database periodically
  • Mitigation plan - how can we get as close to zero downtime as possible?
  • Run Testing Checklist and confirm all tests pass

Notes:
Possible approaches

  • similar to apply-remote-migrations - rebuild search indexes after backend deployment (via ssh or cf run-task)
  • have a celery beat/cron job to periodically reindex (see the beat-task sketch after this list)
    • caveat: need to change our handling of BulkIndexError so it does not roll back un-indexed records
    • or - reindex post bulk_create (after parsing completes for each file)
  • write a custom indexing routine using bulk requests (per "Tune for indexing speed", linked below)
    • django-elasticsearch-dsl seems to use the _bulk endpoint when making requests; we may need to investigate further how the library works and/or introduce some customization to tune it to our needs
    • alternatively, use elastic's reindex API to reindex previously indexed data without deleting/recreating (as django-elasticsearch-dsl does), then try to bulk index any un-indexed data (see the second sketch after this list)
  • increase the resources available to the elastic cluster (per "Tune for indexing speed", linked below)
    • currently es-dev in dev/staging and es-medium in prod
  • utilize the --parallel option in python manage.py search_index --rebuild - https://django-elasticsearch-dsl.readthedocs.io/en/latest/management.html
  • utilize the --use-alias option in python manage.py search_index --rebuild

Supporting Documentation:
Please include any relevant log snippets/files/screen shots

Open Questions:
Please include any questions or decisions that must be made before beginning work or to confidently call this issue complete

@ADPennington added the backend, dev, and database labels Mar 1, 2024
@ADPennington mentioned this issue Mar 1, 2024

ADPennington commented Mar 7, 2024

@jtimpe below are the metrics after releasing sprint 93 to staging yesterday:

  • Number of data files: 394
  • Number of db records: ~872K
    • SSP T1: N=362
    • SSP T2: N=432
    • SSP T3: N=785
    • SSP T4: N=2206
    • SSP T5: N=6739
    • SSP T6: N=12
    • SSP T7: N=45
    • TANF T1: N=214226
    • TANF T2: N=238098
    • TANF T3: N=403698
    • TANF T4: N=895
    • TANF T5: N=2423
    • TANF T6: N=57
    • TANF T7: N=48
    • Tribal TANF T1: N=360
    • Tribal TANF T2: N=549
    • Tribal TANF T3: N=810
    • Tribal TANF T4: N=116
    • Tribal TANF T5: N=355
    • Tribal TANF T6: N=18
    • Tribal TANF T7: N=1
  • Time it took to run the rebuild: 35 minutes ⚠️

Update: there are 6.6 million records in prod as of today.

@jtimpe jtimpe self-assigned this Mar 7, 2024
ADPennington commented

@jtimpe do you happen to know if the re-index command impacts the newest filter in any way?

I'm noticing in staging (which is on the sprint 93 release) that the latest submission isn't flagged as "newest" in DAC. I tried submitting the same file (2023.Q3.Aggregate Data.txt) in develop and staging and got different results in DAC:

[screenshots: the submitted file; the develop T6 record flagged as newest; the staging T6 record not flagged as newest]


jtimpe commented Mar 27, 2024

Notes from testing in Raft (work in progress):
https://hackmd.io/@dBEtH2T9SRqyVnE3ZKtYSA/ry7GWLP0a/edit

robgendron commented
Potential spike - still working through it.

robgendron commented
Relabeled to spike.

@robgendron robgendron changed the title As tech lead, I need elastic re-indexing to be automated Spike: As tech lead, I need elastic re-indexing to be automated Apr 17, 2024

jtimpe commented Apr 29, 2024

data lifecycle

  • remove "old" data, esp. when deploying logic updates that make some data obsolete
    • submitted before a certain date
    • data for a prior submission period
    • fiscal period (test FY 22)
  • delete-and-reparse cron job or manual task - re-code everything since the last reporting period (see the sketch after this list)
    • draft a ticket - this has use beyond this specific issue
  • take advantage of this time to reindex elastic

other options

  • host elastic app ourselves


reitermb commented May 6, 2024

Moving into Raft Review this week, but staying in Blocked pending testing in staging.

robgendron commented
The PR for this is in QASP; we will be able to test soon.
