Excessive shrinking times in STM Weak test parallel #498

Closed
jmid opened this issue Dec 20, 2024 · 0 comments · Fixed by #499
Labels
test suite reliability Issue concerns tests that should behave more predictably

jmid commented Dec 20, 2024

On the merge to main of #463 we are seeing a timeout of the 5.3 bytecode workflow caused by excessive shrinking during STM Weak test parallel:
https://github.com/ocaml-multicore/multicoretests/actions/runs/12426416197/job/34694657567

random seed: 129649292
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s STM Weak test sequential
[ ]    0    0    0    0 / 1000     0.0s STM Weak test sequential (generating)
[✓] 1000    0    0 1000 / 1000     0.1s STM Weak test sequential
================================================================================
success (ran 1 tests)
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 5000     0.0s STM Weak test parallel
[ ]  102    0    0  102 / 5000   211.6s STM Weak test parallel (shrinking:    0.0004)
[ ]  102    0    0  102 / 5000   325.1s STM Weak test parallel (shrinking:    0.0006)
[ ]  102    0    0  102 / 5000   435.3s STM Weak test parallel (shrinking:    0.0008)
[...]
[ ]  102    0    0  102 / 5000  3487.1s STM Weak test parallel (shrinking:    0.0071)
[ ]  102    0    0  102 / 5000  3600.6s STM Weak test parallel (shrinking:    0.0073)
[ ]  102    0    0  102 / 5000  3713.3s STM Weak test parallel (shrinking:    0.0075)
[ ]  102    0    0  102 / 5000  3810.9s STM Weak test parallel (shrinking:    1)
[ ]  102    0    0  102 / 5000  3872.1s STM Weak test parallel (shrinking:    1.0002)
[ ]  102    0    0  102 / 5000  3982.7s STM Weak test parallel (shrinking:    1.0004)
[ ]  102    0    0  102 / 5000  4095.1s STM Weak test parallel (shrinking:    1.0006)
[...]
[ ]  102    0    0  102 / 5000  8881.7s STM Weak test parallel (shrinking:    1.0101)
[ ]  102    0    0  102 / 5000  8993.8s STM Weak test parallel (shrinking:    1.0103)
[ ]  102    0    0  102 / 5000  9105.7s STM Weak test parallel (shrinking:    1.0105)
[ ]  102    0    0  102 / 5000  9166.9s STM Weak test parallel (shrinking:    2)
[ ]  102    0    0  102 / 5000  9227.6s STM Weak test parallel (shrinking:    2.0002)
[ ]  102    0    0  102 / 5000  9340.0s STM Weak test parallel (shrinking:    2.0004)
[ ]  102    0    0  102 / 5000  9451.8s STM Weak test parallel (shrinking:    2.0006)
[ ]  102    0    0  102 / 5000  9513.1s STM Weak test parallel (shrinking:    3)
[ ]  102    0    0  102 / 5000  9573.7s STM Weak test parallel (shrinking:    3.0002)
[ ]  102    0    0  102 / 5000  9684.3s STM Weak test parallel (shrinking:    3.0004)
[ ]  102    0    0  102 / 5000  9684.3s STM Weak test parallel (shrinking:    3.0004)
[ ]  102    0    0  102 / 5000  9796.2s STM Weak test parallel (shrinking:    3.0006)
[ ]  102    0    0  102 / 5000  9876.5s STM Weak test parallel (shrinking:    3.0008)
[ ]  102    0    0  102 / 5000  9951.4s STM Weak test parallel (shrinking:    3.0013)
[ ]  102    0    0  102 / 5000 10014.2s STM Weak test parallel (shrinking:    3.0017)
[ ]  102    0    0  102 / 5000 10085.4s STM Weak test parallel (shrinking:    3.0020)
[ ]  102    0    0  102 / 5000 10182.7s STM Weak test parallel (shrinking:    4.0002)
[ ]  102    0    0  102 / 5000 10294.1s STM Weak test parallel (shrinking:    4.0004)
[ ]  102    0    0  102 / 5000 10405.1s STM Weak test parallel (shrinking:    4.0006)
Error: The operation was canceled.

It takes ~211s and 102 samples to find a counterexample, but then takes

  • ~3600s (3811-211) to shrink it successfully one step,
  • ~5356s (9167-3811) to shrink it successfully a second step,
  • ~346s (9513-9167) to shrink it successfully a third step,
  • ...

Note that 10000s corresponds to ~167m, or roughly 2h47m.
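The timing arithmetic can be double-checked with a small stdlib-only sketch. The `milestones` list, `deltas`, and `minutes_of_seconds` are illustrative names, and the timestamps are the log values rounded to whole seconds (the first is when shrinking started, the rest are when the shrink counter reached 1, 2 and 3).

```ocaml
(* Rounded timestamps (in seconds) taken from the log: start of shrinking,
   then the points where the shrink counter reached 1, 2 and 3. *)
let milestones = [211; 3811; 9167; 9513]

(* Differences between consecutive timestamps. *)
let rec deltas = function
  | a :: (b :: _ as rest) -> (b - a) :: deltas rest
  | _ -> []

(* Round a duration in seconds to the nearest whole minute. *)
let minutes_of_seconds s = (s + 30) / 60

let () =
  List.iteri
    (fun i d -> Printf.printf "shrink step %d: ~%ds\n" (i + 1) d)
    (deltas milestones);
  let m = minutes_of_seconds 10_000 in
  Printf.printf "10000s is about %dm (%dh%02dm)\n" m (m / 60) (m mod 60)
```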

The state of the Weak module is fragile to GC invocations, which makes reproducibility challenging, also during shrinking. We should avoid spending time on needless reruns to reduce simple integer arguments under these circumstances:

let shrink_cmd c = match c with
  | Length -> Iter.empty
  | Set (i, d_opt) -> Iter.map (fun i -> Set (i,d_opt)) (Shrink.int i)
  | Get i -> Iter.map (fun i -> Get i) (Shrink.int i)
  | Get_copy i -> Iter.map (fun i -> Get_copy i) (Shrink.int i)
  | Check i -> Iter.map (fun i -> Check i) (Shrink.int i)
  | Fill (i,j,d_opt) ->
    Iter.(map (fun i -> Fill (i,j,d_opt)) (Shrink.int i)
          <+>
          map (fun j -> Fill (i,j,d_opt)) (Shrink.int j))

and also consider running each Weak test in a separate test executable, like #469 now does, rather than the current hoop jumping:
(* Beware: hoop jumping to enable a full major Gc run between the two tests!
   We need that to avoid the state of the second test depending on the resulting
   GC state of the first test and don't want to exit after the first run
   (as QCheck_base_runner.run_tests_main does). *)
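
As a rough illustration of the first suggestion, one way to stop shrinking integer arguments is to have `shrink_cmd` produce no candidates at all. This is only a sketch under assumptions, not necessarily what #499 does: the local `Iter` module is a stand-in mirroring the shape of `QCheck.Iter` so the snippet runs without QCheck, and the `cmd` type is reconstructed from the constructors in the `shrink_cmd` shown above, with the `int64 option` payload being a guess.

```ocaml
(* Stand-in for QCheck.Iter, defined locally so the sketch has no
   dependencies: an iterator feeds each candidate to a callback. *)
module Iter = struct
  type 'a t = ('a -> unit) -> unit

  (* The iterator that yields no candidates. *)
  let empty : 'a t = fun _ -> ()

  (* Collect all candidates in order (helper for inspection only). *)
  let to_list (it : 'a t) : 'a list =
    let acc = ref [] in
    it (fun x -> acc := x :: !acc);
    List.rev !acc
end

(* Assumed command type; the payload type is hypothetical. *)
type cmd =
  | Length
  | Set of int * int64 option
  | Get of int
  | Get_copy of int
  | Check of int
  | Fill of int * int * int64 option

(* Offer no smaller variants of any single command, so no test rerun is
   ever triggered just to reduce an index argument. *)
let shrink_cmd (_ : cmd) : cmd Iter.t = Iter.empty

let () =
  let n = List.length (Iter.to_list (shrink_cmd (Fill (3, 7, Some 42L)))) in
  Printf.printf "candidates for Fill (3, 7, Some 42L): %d\n" n
```

With such a shrinker, any remaining shrinking would have to come from the test harness removing whole commands, which presumably needs far fewer reruns than also reducing every integer argument.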

@jmid jmid added the test suite reliability Issue concerns tests that should behave more predictably label Dec 20, 2024
@jmid jmid closed this as completed in #499 Dec 20, 2024