DQM test `TestDQMGUIUpload` times out #46682
Comments
A new Issue was created by @nothingface0. @Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
assign dqm |
New categories assigned: dqm @antoniovagnerini,@rseidita you have been requested to review this Pull request/Issue and eventually sign? Thanks |
While debugging, we faced another issue and had to restart the dev DQMGUI, which led to another issue appearing. We are investigating. Cms-talk post here |
@nothingface0, any idea why only the unit test fails while the DQM bin-by-bin comparison works [a]? The DQM bin-by-bin comparison also uses visDQMUpload.py to upload many ROOT files to [a]
|
@smuzaffar Regarding the bin-by-bin comparison, from what I understand it's done locally, where the test is running, and then the results are uploaded. In the script you link, there's no validation that the upload itself worked, e.g. by checking the GUI after the upload finished: it's just comparing and uploading. |
The unit test is failing at the time of upload [a] in visDQMUpload.py ... right? And this upload is working for the DQM bin-by-bin comparison; otherwise we should have seen [a]
|
I.e., when the cmsbot is running the test for PRs or IBs, where `CMSBOT_CI_TESTS` is set. Also fix the timestamp of the renamed file to include the hour, which was missing before; this will help with debugging. See also: cms-sw#46682
I.e., when the cmsbot is running the test for PRs or IBs, where `CMSBOT_CI_TESTS` is set.
- Also fix the timestamp of the renamed file to include the hour, which was missing before; this will help with debugging.
- Added a `--help` argument, just in case.

See also: cms-sw#46682
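For readers unfamiliar with this kind of gating, here is a minimal Python sketch of detecting whether `CMSBOT_CI_TESTS` is set. The function name `running_under_cmsbot` is hypothetical, and treating an empty value as "not set" is an assumption. What the test actually does when the variable is set is not spelled out in the commit messages above, so the sketch only covers the detection step.

```python
import os


def running_under_cmsbot(environ=None) -> bool:
    """Return True when CMSBOT_CI_TESTS is present and non-empty.

    `environ` defaults to the process environment; passing a plain dict
    makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if environ is None else environ
    # Assumption: an empty string counts as "not set".
    return bool(env.get("CMSBOT_CI_TESTS"))


if __name__ == "__main__":
    if running_under_cmsbot():
        print("CMSBOT_CI_TESTS is set: running in a cmsbot PR/IB context")
    else:
        print("CMSBOT_CI_TESTS is not set: running outside cmsbot")
```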
Taking this failed test as an example, judging from the logs I found in DQMGUI and the test's logs:
Takeaway points:
|
Another instance of the failure here.
On the other hand, for this successful test:
|
How about we disable this test for PRs/IBs? We could run it as a special test for each IB (just like we run tests for CRAB and HLT), and there we can increase the wait time to a few hours (we can run it on lxplus so it will not waste our build resources). If the file has not been processed after, let's say, 6 hours, then we can mark it failed? |
I didn't know there was such an option, sounds good to me! Let me know if any modifications are required for the test. |
The recently added `TestDQMGUIUpload` (#46551) has been shown to fail even after 10 minutes of waiting, in recent PR tests and an IB.

After checking the logs of the target DQMGUI, my first impression is that during periods of heavy dev DQMGUI activity (uploads of tier0 replays, PR ROOT files), it can take a significant amount of time for the file uploaded by the test to be properly registered, which causes the test to fail. If this is the only problem with the test, we could increase the maximum waiting time.
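To illustrate what "maximum waiting time" means for a test like this, here is a minimal Python sketch of a poll-until-deadline loop. It is not the test's actual code. `wait_until` and its parameters are hypothetical, and the `check` callable stands in for whatever query the real test uses to see whether the DQMGUI has registered the uploaded file.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout_s: float, interval_s: float) -> bool:
    """Poll `check` until it returns True or `timeout_s` seconds have passed.

    Returns True if `check` succeeded in time, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval_s, remaining))


if __name__ == "__main__":
    # Placeholder check that never succeeds, with short example values.
    ok = wait_until(lambda: False, timeout_s=0.2, interval_s=0.05)
    print("file registered" if ok else "timed out")
```

Raising the limit then only means passing a larger `timeout_s`. The polling interval can stay coarse so that the dev DQMGUI is not queried more often than needed.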
Unfortunately, I forgot to add `%H` to the timestamp that is added to the file name, so I don't know exactly how long it takes the DQMGUI to discover each uploaded file: I only know what time it arrived and was imported, but not when the test started.

I will keep this issue updated as I investigate from the DQM side.
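As an illustration of why the missing `%H` matters, here is a small Python sketch that stamps a file name with and without the hour. The format strings, the `stamped_name` helper and the file name are all hypothetical, since the test's real format is not shown in this issue.

```python
import os
from datetime import datetime

# Hypothetical formats, for illustration only.
FORMAT_WITHOUT_HOUR = "%Y%m%d_%M%S"
FORMAT_WITH_HOUR = "%Y%m%d_%H%M%S"


def stamped_name(base: str, when: datetime, fmt: str) -> str:
    """Insert a formatted timestamp between the file stem and its extension."""
    stem, ext = os.path.splitext(base)
    return f"{stem}_{when.strftime(fmt)}{ext}"


morning = datetime(2024, 11, 13, 9, 15, 30)
evening = datetime(2024, 11, 13, 21, 15, 30)

# Without %H, two uploads twelve hours apart get the same stamp.
print(stamped_name("DQM_test.root", morning, FORMAT_WITHOUT_HOUR))  # DQM_test_20241113_1530.root
print(stamped_name("DQM_test.root", evening, FORMAT_WITHOUT_HOUR))  # DQM_test_20241113_1530.root

# With %H, the start time can be recovered from the name.
print(stamped_name("DQM_test.root", morning, FORMAT_WITH_HOUR))  # DQM_test_20241113_091530.root
print(stamped_name("DQM_test.root", evening, FORMAT_WITH_HOUR))  # DQM_test_20241113_211530.root
```

With the hour in the name, the time between the start of the test and the import time recorded in the DQMGUI logs can be computed directly.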