Load simulation results from DB before scheduling + performance improvements #1051
Here are some actions I am going to take following the walkthrough this morning:
Looking good so far! As mentioned during the walkthrough, I'd like to see two E2E tests to prove that "stale" simulation data will not be used.
The first would test the check on Plan Revision, by having a coexistence goal between two activities (i.e. for each GrowBanana, PeelBanana). The stale sim run will contain fewer GrowBananas than the "current" Plan at the time of scheduling, allowing the number of PeelBananas to indicate whether the stale data was used.
The second would test that the scheduler will not load a dataset that used a different sim config than the one it will use. This can be done by having a coexistence goal on a resource (i.e. for each plant count < 5, GrowBanana) and then having that resource differ between when sim was run and when scheduling is run.
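To make the intent of the first test concrete, here is a minimal, self-contained sketch of the idea. All names in it (`StaleSimCheckSketch`, `SimRun`, `schedule`) are hypothetical and are not Aerie classes; it only models how the number of scheduled PeelBananas would reveal whether a stale run was used. The second test would follow the same pattern, with the sim config in place of the plan revision.

```java
import java.util.List;

// Hypothetical, simplified model of the first proposed test; not Aerie code.
final class StaleSimCheckSketch {
  /** A cached simulation run: the plan revision it was computed for and the activities it saw. */
  record SimRun(long planRevision, List<String> activityTypes) {}

  /** Coexistence goal "for each GrowBanana, PeelBanana": one PeelBanana per GrowBanana seen. */
  static long peelBananasToSchedule(List<String> activityTypes) {
    return activityTypes.stream().filter("GrowBanana"::equals).count();
  }

  /** Uses the cached run only if its revision matches; otherwise falls back to the current plan. */
  static long schedule(long currentRevision, List<String> currentPlan, SimRun cached) {
    final boolean fresh = cached != null && cached.planRevision() == currentRevision;
    final List<String> seen = fresh ? cached.activityTypes() : currentPlan;
    return peelBananasToSchedule(seen);
  }
}
```

With a stale run holding two GrowBananas and a current plan holding three, a result of three PeelBananas indicates the stale data was ignored, while two would indicate it was used.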
@Mythicaeda I have added the 2 e2e-tests.
LGTM! Excited to see this finally get in!
- Harmonize condition between SimulationFacade and ResumableSimulationDriver for restart
- Initialize sim time at a negative value to avoid a useless simulation when an activity starts at Duration.ZERO. Also, avoid actually launching daemon tasks at initialization; wait for the real start of simulation.
- Remove need for buffer in simulation results of scheduler
The need for the buffer was due to a bounds check in Windows.intoSpans failing if one of the bounds was not included. And the last segment of every profile (ending at the end of the simulation horizon) is always open. I have included this special case in the bounds check.
…ion in scheduler solving alg
Tests were running correctly but the proper call path was not taken
`iterative` option for scheduling goals #913
Description
As usual, on my way to the feature, I got distracted by other improvements and fixes. Some of those are really worth it, especially commits (4) and (5). The last commit shows a performance improvement in 20+ tests: about 30-50% fewer simulations in these test cases.
In (1), I removed the need for an ugly artificial time buffer in the scheduler simulation by fixing an issue in `Windows`: if the last interval of a profile was closed-open, `Windows` considered that there was a gap because it only used `Interval.contains`, and that `contains` does not care about the open/closed property. The detected gap prevented these windows from being turned into spans, and so on. Hopefully, that is now squared away.
In (2), I have added a small optimization that may spare us a simulation when one is not needed.
In (3), a small fix regarding the domain of converted simulation results.
In (4), a significant change: I removed the conflict detection after goal solving. After solving a goal, we ran conflict detection again to check that we had effectively solved it. This resulted in simulation runs that could be avoided. The only goal taking advantage of this mechanism was the cardinality goal in the case of activities with uncontrollable duration. It posted one conflict at a time and kept posting conflicts until the duration and cardinality conditions were satisfied. The `MissingActivityTemplateConflict` (and its processing by `PrioritySolver`) has been generalized to handle duration and cardinality, thus removing the need for the second conflict detection after goal solving.
In (5), I fixed a boundary reset condition that was causing more simulations than needed. This bound had been set conservatively because, at some point, the driver was dropping some tasks. But this was fixed a few months ago.
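The generalization described in (4) can be pictured with the following sketch, in which a single conflict carries both the missing cardinality and the missing total duration, and the solver keeps inserting activities until both are met. This is only an illustration under assumptions: `MissingActivities`, `solve`, the `durationOracle` parameter, and the `maxInsertions` cap are hypothetical and do not reflect the real `MissingActivityTemplateConflict` or `PrioritySolver` signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical model of a conflict that tracks both cardinality and duration; not Aerie code.
final class CardinalityConflictSketch {
  /** Everything still missing for the goal: a number of activities and a total duration. */
  record MissingActivities(int missingCount, long missingDuration) {
    boolean satisfied() {
      return missingCount <= 0 && missingDuration <= 0;
    }

    /** The remaining conflict once an activity of the given actual duration has been inserted. */
    MissingActivities afterInserting(long actualDuration) {
      return new MissingActivities(missingCount - 1, missingDuration - actualDuration);
    }
  }

  /**
   * Inserts activities until the conflict is satisfied or maxInsertions is reached.
   * Durations are uncontrollable: each one is only known after insertion (via durationOracle).
   */
  static List<Long> solve(MissingActivities conflict, LongSupplier durationOracle, int maxInsertions) {
    final List<Long> insertedDurations = new ArrayList<>();
    MissingActivities remaining = conflict;
    while (!remaining.satisfied() && insertedDurations.size() < maxInsertions) {
      final long duration = durationOracle.getAsLong();
      insertedDurations.add(duration);
      remaining = remaining.afterInserting(duration);
    }
    return insertedDurations;
  }
}
```

Because the conflict itself is updated after each insertion, the loop needs no extra conflict-detection pass to decide whether another activity is required.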
In (6), I have modified the scheduler so that it is able to use initial simulation results coming from the DB. `SimulationFacade` now holds an initial plan until a simulation is needed. This is to avoid simulating a plan that will not be needed. Rootfinding needs the facade to have the current plan "loaded", but if external simulation results have been loaded, there is no need for a first simulation before the first rootfinding try.
In (7), I pull the simulation results from the DB and pass them to scheduling. DB simulation results are considered useful if they correspond to the current revision of the plan and if they cover the entire plan horizon. We do not pull events and topics, to save time (and they are useless to the scheduler right now).
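The usefulness criterion stated in (7) can be sketched as a small predicate. The types and names below (`PlanInfo`, `SimDataset`, `isUsable`, `initialResultsFor`) are hypothetical and only model the two conditions from the description: matching plan revision and full coverage of the plan horizon. Time is represented as a plain `long` for brevity.

```java
import java.util.Optional;

// Hypothetical model of the "are DB results usable?" check; not the Aerie implementation.
final class CachedResultsSketch {
  record PlanInfo(long revision, long horizonStart, long horizonEnd) {}

  record SimDataset(long planRevision, long simStart, long simEnd) {}

  /** Usable only if computed for the current plan revision and covering the whole plan horizon. */
  static boolean isUsable(SimDataset dataset, PlanInfo plan) {
    return dataset.planRevision() == plan.revision()
        && dataset.simStart() <= plan.horizonStart()
        && dataset.simEnd() >= plan.horizonEnd();
  }

  /** Returns the latest dataset if it is usable, otherwise empty (a fresh simulation is needed). */
  static Optional<SimDataset> initialResultsFor(PlanInfo plan, Optional<SimDataset> latest) {
    return latest.filter(dataset -> isUsable(dataset, plan));
  }
}
```

Returning an empty `Optional` for unusable datasets keeps the fallback explicit: the caller simulates from scratch whenever no valid initial results are available.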
In (8), I have fixed some tests. The results are the same, but the data path as well as the in-memory state were wrong.
In (9), I have added a test to show how loading initial simulation results makes it possible to skip the first simulation. The test is the same as its neighbor and shows a lower number of simulations (-1) for the same results.
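The effect described in (6) and (9) can be modeled with a facade that simulates lazily and counts its simulations. This is a sketch under assumptions, not the real `SimulationFacade`: the class name, the `String` stand-ins for plans and results, and the `simulator` function are all hypothetical.

```java
import java.util.function.Function;

// Hypothetical lazy facade: holds the plan and only simulates when results are first needed.
final class LazySimulationSketch {
  private final Function<String, String> simulator; // stand-in for running a simulation
  private final String plan;
  private String results; // null until loaded from the DB or computed
  private int simulationCount = 0;

  LazySimulationSketch(Function<String, String> simulator, String plan, String initialResults) {
    this.simulator = simulator;
    this.plan = plan;
    this.results = initialResults; // may be null when no usable DB results exist
  }

  /** Returns the results, simulating the held plan only if none are available yet. */
  String results() {
    if (results == null) {
      simulationCount++;
      results = simulator.apply(plan);
    }
    return results;
  }

  int simulationCount() {
    return simulationCount;
  }
}
```

When constructed with preloaded results, the first call to `results()` performs no simulation, which corresponds to the one-simulation difference the new test is meant to show.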
In (10), I have updated the simulation performance of the scheduler tests. These improvements are mostly due to (4) and, to a lesser extent, (5).
Verification
Existing tests are passing. Only one new test was added. Testing the GraphQL requests/scheduler-worker is not easy; I am open to suggestions.
Documentation
None.
Future work
None.