CI tests run time with IoT docker images #20
A separate workflow can be created to update the TeX Live cache regularly. Perhaps off-topic, BTW, action ...
@muzimuzhi Thank you for your thoughts!
These would presumably reduce the run times further. Btw, I did previously use (some) caching for that, and it was dropped in that PR of yours. Do you remember the reason why you dropped it?
I also don't see how ...
Here I'm not convinced this is something to worry about, at least given how packages are distributed in the ecosystem. Any regular user who has access to a current version of the package can reasonably be expected to have a current distribution too. I had a similar discussion some time ago with daleif, related to how to decide on the required kernel version when we want to use some new feature. All in all, I fail to see the need to support and test for an "old TeX distribution with a current package" scenario.
Mhm, I tested again ...
Starting with v2, zauguin/install-texlive caches the installation by itself. But GitHub only retains caches for 7 days, hence to make sure there's always a cache to restore, you may need another workflow running on a regular basis, triggered by `schedule`.

As you have already noticed, latex3/latex2e uses a different strategy, since its large test suite is split into 33 jobs. Note the root, first job "Update TeX Live" (or "Install TeX Live" in https://github.com/gusbrs/zref-clever/actions/runs/6816911621) may still take the time to do a clean installation if no cache is hit. Therefore 6m25s is the best case, not the average case. You can emulate the worst case by manually deleting the caches listed in https://github.com/gusbrs/zref-clever/actions/caches and then triggering a rerun.
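For reference, a minimal sketch of such a scheduled cache-refreshing workflow (the cron expression and package list are illustrative assumptions; the saved cache only helps if the action computes the same cache key the CI jobs look for, which depends on the package list matching):

```yaml
name: Refresh TeX Live cache
on:
  schedule:
    # twice a week keeps the cache inside GitHub's 7-day retention window
    - cron: '0 3 * * 1,4'
jobs:
  refresh:
    runs-on: ubuntu-latest
    steps:
      # installing (or restoring) TeX Live is enough to re-save the cache
      - name: Install TeX Live
        uses: zauguin/install-texlive@v3
        with:
          # placeholder: should be the same package list the CI workflow uses
          packages: l3build
```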
My guess is yes, since the "Regression tests" and "Documentation" jobs are run in parallel.

Seems it's on GitLab's servers, see https://gitlab.com/islandoftex/images/texlive/container_registry. The GitLab Container Registry is free for everyone (all tiers, all offerings).
I meant, ...
That's what I get for approaching the problem in trial-and-error fashion, instead of actually studying how things work. ;-) Thanks for pointing that out. I might revert that commit then. Though handling the cache explicitly like that does have one benefit, which is that the costly installation step is done only once for both checks and doc. Anyway, I'll test further.
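To illustrate the explicit approach being referred to, a hypothetical actions/cache step that both the checks and the doc jobs could share, so the installation only runs on a cache miss (the path and key are made-up placeholders, not this repository's actual settings):

```yaml
- name: Cache TeX Live
  uses: actions/cache@v4
  with:
    # hypothetical install location; a real setup would key on the
    # package list so the cache invalidates when that list changes
    path: ~/texlive
    key: texlive-${{ runner.os }}-${{ hashFiles('.github/tl_packages') }}
```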
Yes, I understand, of course, that that time depended on the cache being already there. But the "worst case" is still faster. And if I'm doing a batch of work in the repo, I get the benefit of the cache on top of that. This seems good enough; I'm not sure running a scheduled job just to keep the cache alive is warranted for this (for my needs, that is).

I do think the use of the docker image is conceptually superior; it is arguably a more thorough and proper solution. The main practical benefit is not having to manually curate a list of packages to install, which is a pain, since the way to do it is through reiterated (failed) attempts to run the workflow on GitHub. But, once that's settled, it works too. And the current long CI run times have been bugging me.

Actually, another thing I'm considering is dropping the GitHub workflow altogether. When I added it, I did so because it seemed cool, and I wanted to learn about it. But, in hindsight, I've been getting little benefit from it. I tend to run the tests locally anyway. For releases, I must do so to prepare the package.
That's about what I suspected. Thanks for the explanation. GitLab's servers then, makes sense. Better than burdening IoT's, but still.
Then I'm afraid I had missed your point on this, and still do. Care to elaborate a little further?
Sorry I missed this one in my last comment. Currently all 79+1 tests are run in a single job named "Regression tests", but it doesn't have to be that way. For the 79 tests in the `build` configuration, the `--first` and `--last` options of `l3build check` can be used to check the tests in two steps, at the cost of running the nth test twice. Then the workflow can be set to start two parallel jobs for regression tests:

```yaml
jobs:
  check:
    strategy:
      matrix:
        include:
          - id: 1
            command: |
              l3build check -c build --last zc-class-scrreprt01
          - id: 2
            command: |
              l3build check -c build -q --first zc-class-scrreprt01
              l3build check -c build-4runs -q
    name: Regression tests (${{ matrix.id }})
    runs-on: ubuntu-latest
    steps:
      - name: Install TeX Live
        uses: zauguin/install-texlive@v3
        with:
          packages: ${{ env.ZC_PACKAGE_LIST }}
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Run tests
        run: ${{ matrix.command }}
      - name: Archive failed test output
        # ...
```

This provides another way to speed up the CI runs.
Another benefit: by setting a scheduled workflow run, it's easier to catch compatibility problems and uncover the need to update test files. For example, CTeX-org/ctex-kit is relatively stable and its test suite compares the output of ...
It's already outdated/useless. In #15 (comment) I speculated that ...
To check against it, one way is to set the workflow to install TeX Live 2022, then ...
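A hypothetical sketch of such a step, installing the frozen TeX Live 2022 from the historic archive (the mirror URL, scheme, and paths are assumptions, to be adjusted to the repository's needs):

```yaml
- name: Install TeX Live 2022
  run: |
    repo=https://ftp.math.utah.edu/pub/tex/historic/systems/texlive/2022/tlnet-final
    wget "$repo/install-tl-unx.tar.gz"
    tar -xzf install-tl-unx.tar.gz
    cd install-tl-*
    # frozen repository, so binaries and packages stay at the 2022 release
    ./install-tl -no-interaction -scheme small \
      -texdir "$HOME/texlive2022" -repository "$repo"
    # make the 2022 binaries available to subsequent steps
    echo "$HOME/texlive2022/bin/x86_64-linux" >> "$GITHUB_PATH"
```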
Interesting, thank you very much once again. This technique would also apply to the IoT docker images. Nice, but I think it is well beyond my "GitHub-actions-fu", so I'd be wary to add this complexity to the task. Still, I'm happy to have this alternative as a reference here.

Also, I reduced the run times further by dropping testing on dev formats. Now we're at ~4min cached / ~8min no cache. Much better, and acceptable. Btw, TIL that my regression tests were taking as much time as those of the LaTeX kernel...
That's a good point. I had been doing this kind of check locally to catch upstream changes. But, as long as I do keep the CI workflow, you have me convinced on running things on schedule too.
Ah! I finally understood what you meant. I had totally lost this train of thought. Though I think this would be a little too much for this check. All in all, and again, thank you very very much for your comments! Insightful and useful as usual.
I think this is a fair time to close this one. I'll be glad to reopen it, though, if any other thoughts or ideas come up.
Following discussion at #15 (comment)
Since zref-clever moved to using the IoT TeX Live docker image for the CI tests GitHub workflow (e1d2acf), my impression is that the workflow was taking considerably more time to run than before. While I recalled workflow runs taking about 10-14 min, which is already considerable, with the use of the docker images they were easily breaching the 20+ min mark.

It is true that workflow run times tend to vary a lot, but at @muzimuzhi's suggestion, I triggered workflows with the IoT docker image and without it (using zauguin/install-texlive, as I used to before). Also true, this is just a one-point sample for comparison, but better than nothing and better than my "impression".
Run with IoT docker image (https://github.com/gusbrs/zref-clever/actions/runs/6811932906):
Initialize containers + Update TeX Live: 1m27s + 30s
Run tests: 13m 52s
Total time: 15m56s
Run with zauguin/install-texlive (https://github.com/gusbrs/zref-clever/actions/runs/6812269840):
Install TeX Live: 4m24s
Run tests: 8m48s
Total time: 13m21s
So the overall difference seems to be smaller than suggested by my previous impression. However, there really does seem to be a difference in performance. It took more than twice as long to install TeX Live from scratch as to initialize the docker containers and update TeX Live. Despite that, the zauguin/install-texlive run was still faster overall, because the core "Run tests" task, which is exactly the same in both cases and which is the most time-consuming, runs much faster in this case.
True, a mere 2.5min difference is hardly enough to revert to the previous approach and lose the convenience granted by the IoT docker image. But, as mentioned, this is just a one-point sample. I'll be keeping an eye on these run times, and if they remain considerable, I may reconsider the use of the docker image.
In the meantime, if anyone has any ideas as to why this performance difference occurs, and whether there are ways to improve things, it'd be much appreciated.
Edit: These two workflow runs, from when the use of the IoT docker image was introduced, are from the same day and also comparable:
With IoT docker image (https://github.com/gusbrs/zref-clever/actions/runs/4212257093): 14m39s
With zauguin/install-texlive (https://github.com/gusbrs/zref-clever/actions/runs/4212132743): 8m27s
(But the logs are gone and we can no longer compare the parts).
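For reference, a minimal sketch of what the IoT docker image variant of such a workflow looks like (the image tag and exact commands are assumptions, not this repository's actual configuration):

```yaml
jobs:
  check:
    runs-on: ubuntu-latest
    # the IoT TeX Live image is pulled from GitLab's container registry
    container: registry.gitlab.com/islandoftex/images/texlive:latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      # corresponds to the "Update TeX Live" step timed above
      - name: Update TeX Live
        run: tlmgr update --self --all
      - name: Run tests
        run: l3build check -q
```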