[ETL] Issues with monitoring/testing #110
Comments
Joy's comment: The ACs look like they will be costly to implement, since we are leveraging Spring Batch here, and the queuing and the tables are managed by it. For AC1 (monitoring), we can rely on logs as the source of truth and ignore the DB.
Himesh's comment: In general, I agree with the issues that we aim to resolve here, but I differ on the approach to resolving them.
Vivek's comment: We are using Quartz, by the way, not Spring Batch, for ETL.
We should perhaps use new Date here.
… actual job execution and add a higher priority trigger for first run of ETL Sync job for an org
AC1 fixed as per Vivek's input above and seems to work well. AC2 (Metabase report) pending. Moving to Code Review Ready so AC1 and AC3 can be tested.
AC2 Metabase Reports: https://reporting.avniproject.org/question/4841-etl-round-completed-in-90-minutes
Alerts can be enabled only after this change is promoted, because the start/end times in scheduled_job_run are inaccurate until then.
Made slight additions to the first report to filter by the "SyncJobs" job_group and show OrgCategory and OrgStatus values in a readable format. Code review didn't raise any other issues of concern.
Additionally, create the following reports for QA and others to determine ETL job status:
Issue:
ETL for rwbngos2023 completed in a minute, but the database shows it as taking 15 minutes, which gives a misleading picture.
The image below shows the same problem: the start times of some jobs are earlier than the end times of other jobs. So either the start time of the next job or the end time of the previous job is being recorded incorrectly. This makes it hard to monitor the ETL jobs.
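The symptom described above can be checked mechanically. The following is only a rough sketch, assuming ETL jobs for different organisations are expected to run one after another; the `JobRun` class and its field names are hypothetical and do not reflect the actual `scheduled_job_run` columns.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class JobRunOverlapCheck {
    /** Hypothetical view of one recorded job run; field names are assumptions. */
    static final class JobRun {
        final String jobName;
        final Instant startedAt;
        final Instant endedAt;

        JobRun(String jobName, Instant startedAt, Instant endedAt) {
            this.jobName = jobName;
            this.startedAt = startedAt;
            this.endedAt = endedAt;
        }
    }

    /**
     * Returns the names of runs whose recorded start is earlier than the
     * recorded end of the run that started immediately before them.
     */
    static List<String> findOverlaps(List<JobRun> runs) {
        List<JobRun> sorted = new ArrayList<>(runs);
        sorted.sort(Comparator.comparing((JobRun r) -> r.startedAt));
        List<String> overlapping = new ArrayList<>();
        for (int i = 1; i < sorted.size(); i++) {
            JobRun previous = sorted.get(i - 1);
            JobRun current = sorted.get(i);
            if (current.startedAt.isBefore(previous.endedAt)) {
                overlapping.add(current.jobName);
            }
        }
        return overlapping;
    }
}
```

If jobs are in fact allowed to run concurrently, an overlap is not necessarily a recording error, so any hits would still need to be compared against the logs.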
AC:
- When ETL fails for an organisation with 'Organisation Category' Production or UAT and 'Organisation Status' Live.
- When one round of ETL takes more than 1.5 hours to complete.
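For discussion, the two conditions above could be expressed as simple predicates. This is a minimal sketch, assuming both conditions are meant to trigger alerts; the class name, method names, and the literal category/status strings are hypothetical, not taken from the codebase.

```java
import java.time.Duration;
import java.time.Instant;

final class EtlAlertRules {
    // Threshold from the AC: one ETL round should finish within 1.5 hours.
    static final Duration MAX_ROUND_DURATION = Duration.ofMinutes(90);

    /** True when a failed run belongs to an organisation covered by the first AC. */
    static boolean shouldAlertOnFailure(String orgCategory, String orgStatus, boolean failed) {
        // The string values are assumed spellings of the category and status.
        boolean monitoredCategory = "Production".equalsIgnoreCase(orgCategory)
                || "UAT".equalsIgnoreCase(orgCategory);
        return failed && monitoredCategory && "Live".equalsIgnoreCase(orgStatus);
    }

    /** True when a full ETL round took strictly longer than the threshold. */
    static boolean roundTookTooLong(Instant roundStart, Instant roundEnd) {
        return Duration.between(roundStart, roundEnd).compareTo(MAX_ROUND_DURATION) > 0;
    }
}
```

The second check is only as reliable as the recorded start and end times, which is the problem this issue describes.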
Technical analysis/suggestions:
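import org.quartz.Trigger;
import org.quartz.TriggerBuilder;

// Build a Quartz trigger that fires as soon as it is scheduled
Trigger trigger = TriggerBuilder.newTrigger()
        .withIdentity("triggerName", "triggerGroup")
        .startNow()
        .build();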
scheduled_job_run table with qrtz_job_details