Replies: 3 comments 7 replies
-
This only works for database maintenance; schema migrations must be finished before the application is started.
How would this work exactly when different jobs have different schedules? Is there one Docker image that is called with different options, or are there multiple Docker images?

The listed examples are mandatory tasks, like the Keycloak synchronization. What is the plan to implement that for the Docker Compose test environment?

If the runner for maintenance jobs were a permanent deployment, it would be possible to trigger maintenance jobs via an API endpoint. This would also allow triggering jobs manually on demand, for example, when it is known that the Keycloak synchronization failed and the next scheduled execution is too far in the future.

I am not sure it is a good idea that the application is not aware of the maintenance jobs. It means that it is not possible to implement an admin UI that shows any information about past and upcoming jobs.

Finally, it looks like my draft for implementing maintenance jobs was ignored in this proposal. Why?
-
So, there are currently two proposals on the table: one is to delegate task execution to Kubernetes (in the following referred to as the "Kubernetes proposal"), and the other is to have a deployment that runs a scheduler library like Quartz which handles the task execution (the "Quartz proposal"). After thinking for a while about the Quartz proposal and its possible implementation, I have come to the conclusion that there actually is not that much difference between the two proposals. This also impacts some of the arguments brought up so far in favor of or against one of the proposals. What they have in common is:
The main differences are IMHO the following:
As arguments for the Quartz proposal, it was stated that it would give more control over the execution of tasks and would therefore open up further use cases (e.g. for monitoring or an admin UI), and that everything would be contained in ORT Server without the need for further setup. Regarding the first point, one can have a look at what the execution of tasks with Quartz looks like (and this is considered typical for similar libraries or tools). The Quartz tutorial gives an example along the following lines:
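As a rough illustration (not the tutorial snippet verbatim), a minimal Quartz setup in Kotlin could look like the sketch below. The `KeycloakSyncTask` class and the cron expression are made-up placeholders, not existing ORT Server code:

```kotlin
import org.quartz.CronScheduleBuilder
import org.quartz.Job
import org.quartz.JobBuilder
import org.quartz.JobExecutionContext
import org.quartz.TriggerBuilder
import org.quartz.impl.StdSchedulerFactory

// Placeholder task: in ORT Server this would delegate to the actual task implementation.
class KeycloakSyncTask : Job {
    override fun execute(context: JobExecutionContext) {
        println("Synchronizing Keycloak roles...")
    }
}

fun main() {
    val scheduler = StdSchedulerFactory.getDefaultScheduler()

    val job = JobBuilder.newJob(KeycloakSyncTask::class.java)
        .withIdentity("keycloakSync", "maintenance")
        .build()

    // Example schedule: run every night at 02:00.
    val trigger = TriggerBuilder.newTrigger()
        .withIdentity("keycloakSyncTrigger", "maintenance")
        .withSchedule(CronScheduleBuilder.cronSchedule("0 0 2 * * ?"))
        .build()

    scheduler.scheduleJob(job, trigger)
    scheduler.start()
}
```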
So, the deployment for executing tasks would probably have an entry point script that contains statements like those to configure the scheduler. This script can either hard-code the tasks to execute or use some magic (based on service loaders or reflection) to determine them. In any case, after the execution of the script, the scheduler is fully responsible for the task execution.

Regarding control, this is exactly the same situation as in the Kubernetes scenario: the tasks are run by an external party, and in order to collect statistics or track the execution, additional means have to be implemented. Quartz may offer some hooks to receive notifications about task executions, but the same is true for Kubernetes. Only if an own scheduler mechanism is implemented in ORT Server can full control over task execution be achieved.

The other argument for the Quartz proposal is that it simplifies the ORT Server deployment, since no external configuration is needed to set up the task execution. This is, however, not the full truth: the deployment responsible for the task execution of course has to be defined and configured. This can be done (for instance) as part of a Helm chart; the complexity should be similar to other ORT Server components like core or the Kubernetes Job Monitor, with one major difference: the schedule of the tasks needs to be configured as well. This has to be done via a mechanism that is still to be defined. Probably, there will be configuration properties or environment variables corresponding to the single tasks supported by ORT Server and expressions defining when they should be executed. These settings are then evaluated by the entry point script of the deployment, which creates the task definitions and passes them to Quartz. This is a proprietary mechanism which needs to be adapted every time another task class is added. In the Kubernetes proposal, at least standard configuration is used to configure cron jobs, which should be familiar to people with an Ops background. It is also unclear how this mechanism could work with other task implementations not contained in the ORT Server codebase.

To sum this up, both proposals define a similar model for task execution that basically delegates the execution of ORT Server tasks to an external engine. They therefore share common characteristics with regard to control over execution or additional configuration.
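To make the configuration point more concrete, such an entry point could look roughly like the following sketch. The environment variable naming convention (`TASK_SCHEDULE_*`), the task registry, and the `KeycloakSyncTask` class are assumptions made for illustration only:

```kotlin
import org.quartz.CronScheduleBuilder
import org.quartz.Job
import org.quartz.JobBuilder
import org.quartz.JobExecutionContext
import org.quartz.TriggerBuilder
import org.quartz.impl.StdSchedulerFactory

// Placeholder task implementation, standing in for a real ORT Server task.
class KeycloakSyncTask : Job {
    override fun execute(context: JobExecutionContext) { /* ... */ }
}

// Hypothetical registry mapping task names to job classes. Every new task class would
// have to be added here (or discovered via a service loader / reflection).
val taskRegistry: Map<String, Class<out Job>> = mapOf(
    "keycloak-sync" to KeycloakSyncTask::class.java
)

fun main() {
    val scheduler = StdSchedulerFactory.getDefaultScheduler()

    taskRegistry.forEach { (name, jobClass) ->
        // Assumed convention: TASK_SCHEDULE_KEYCLOAK_SYNC="0 0 2 * * ?" etc.
        val cron = System.getenv("TASK_SCHEDULE_" + name.uppercase().replace('-', '_'))
            ?: return@forEach // Task is not scheduled in this environment.

        val job = JobBuilder.newJob(jobClass)
            .withIdentity(name, "ort-server-tasks")
            .build()

        val trigger = TriggerBuilder.newTrigger()
            .withIdentity("$name-trigger", "ort-server-tasks")
            .withSchedule(CronScheduleBuilder.cronSchedule(cron))
            .build()

        scheduler.scheduleJob(job, trigger)
    }

    scheduler.start()
}
```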
-
I just want to highlight #1687 in this context, which is about being able to automatically trigger (partial) runs based on events, including timers. The motivation is that for users who "live" exclusively in the server UI (and do not trigger jobs via the REST API from CI or via the CLI), there should be a way to configure regular reruns of e.g. the advisor to check for newly discovered security vulnerabilities for unchanged code. How does this use case fit with the proposals?
-
Motivation
In ORT Server, there are multiple use cases that require some sort of periodic job execution. In the current implementation, some (limited) scheduling functionality has already been implemented for the Kubernetes Job Monitor to check for long-running or lost jobs at certain intervals. For new pending features, running specific jobs on a customizable schedule will be important as well, for instance:
- For complex and long-running database migrations, it may be necessary to run them as background jobs that process data chunk-wise. In such a scenario, a migration job would be run at specific intervals until the whole migration has finished. (Removed 2024-11-04 after feedback: this is actually more of a one-time job, not a periodic job.)

The objective is to implement a technical solution that makes it easy to add periodic jobs and to execute them at arbitrary points in time.
There are the following constraints:
a.) The impact of the execution of periodic jobs on the processing of scan jobs should be kept as low as possible. There shall be no major performance degradation of the system when a periodic job is running, nor should periodic jobs have a negative impact on the stability of regular operation.
b.) The effort for implementation should be kept as low as possible. It is not the goal to implement a scheduler or job management system within ORT Server, as this is not seen as a key competency of ORT Server, and there are already enough scheduler and job management systems out there that can be used off-the-shelf.
Proposed Solution
ps1: Location of the job code: Like the (already implemented) Job Monitor, the code is located inside the ORT Server repository. The Kotlin/Gradle module system provides modules for specific purposes, e.g. for database access or for common functionality used by workers. This allows for a convenient programming model where components can be integrated with functionality provided by other parts of the codebase.
ps2: Standard programming interface for jobs: Jobs implement standardized interfaces for lookup, creation (factory), and configuration; see the interface sketch after the proposal points.
ps3: Execution of jobs: Like the Job Monitor, the jobs run in a separate Kubernetes pod. This keeps the impact on the regular scan activities as low as possible and poses no risk to the robustness of ORT Server operation.
ps4: Scheduling by date/time: For executing jobs periodically at a specific point in time, an external scheduler is used. If ORT Server is running in a Kubernetes environment, this could be a Kubernetes CronJob. This keeps the implementation effort at a minimum and is a proven solution. Documentation will be added describing how adopters of ORT Server can use these CronJobs. CronJobs also provide features such as ensuring that no more than one instance of a job runs at a time.
ps5: Reporting: In order to get information about job executions, their duration, and their outcomes, the features of the external scheduling solution (Kubernetes CronJobs) are regarded as sufficient at the moment.
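As referenced in ps2, a sketch of what such job interfaces could look like is shown below. The names and signatures are assumptions for illustration, not existing ORT Server code:

```kotlin
/**
 * Hypothetical contract for a periodic maintenance job. A job only defines what it does;
 * when it runs is decided by the external scheduler (e.g. a Kubernetes CronJob).
 */
interface MaintenanceJob {
    /** Unique name used to look up and trigger the job. */
    val name: String

    /** Executes one run of the job. */
    suspend fun run()
}

/**
 * Hypothetical factory interface. Implementations could be discovered via the JDK
 * ServiceLoader, which would also allow job implementations outside the ORT Server
 * codebase to be plugged in.
 */
interface MaintenanceJobFactory {
    /** The name of the job this factory creates. */
    val jobName: String

    /** Creates a configured job instance from the given configuration properties. */
    fun create(config: Map<String, String>): MaintenanceJob
}
```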
Next steps