Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Marine ens. var. #1353

Merged
merged 40 commits into from
Dec 3, 2024
Merged

Marine ens. var. #1353

merged 40 commits into from
Dec 3, 2024

Conversation

guillaumevernieres
Copy link
Contributor

@guillaumevernieres guillaumevernieres commented Oct 30, 2024

Companion PR to NOAA-EMC/global-workflow#3041

This PR needs an update to jcb-gdas (NOAA-EMC/jcb-gdas#46)

danholdaway added a commit to NOAA-EMC/jcb-gdas that referenced this pull request Oct 30, 2024
Companion PR to NOAA-EMC/GDASApp#1353

---------

Co-authored-by: Dan Holdaway <[email protected]>
@guillaumevernieres
Copy link
Contributor Author

Automated GW GDASApp Testing Results: Machine: hera

Start: Mon Dec  2 20:56:32 UTC 2024 on hfe11
---------------------------------------------------
Build:                                  *FAILED*
Build: Failed at Mon Dec  2 21:14:36 UTC 2024
Build: see output at /scratch1/NCEPDEV/da/role.jedipara/CI/GDASApp/workflow/PR/1353/global-workflow/sorc/log.build

Not quite sure why it returned "Build Failed", it's still building.

@guillaumevernieres
Copy link
Contributor Author

ctests started:
ctests-1353

@guillaumevernieres
Copy link
Contributor Author

guillaumevernieres commented Dec 3, 2024

The new batch of ctests is just not practical. I'm not even sure this is going to run overnight.
1 - we need to add dependencies to the ctests, if a test fail, the subsequent tests shouldn't run
2 - we should add labels specific to aero/atm/snow and marine. Our work is usually perpendicular to each-others, that pr for example should only run the marine DA stuff
3 - We shouldn't run the forecast after the DA, this will be tested again in the g-w tests ... debatable ...
4 - is it time to stage the model fcst from the 1/2 cycle on hpc's so we don't have to always run the fcst?

I'll create an issue tomorrow with what's above.

@RussTreadon-NOAA
Copy link
Contributor

Multiple ctests are failing / will fail because the expanded ctests suite now requires the GSI package (gsi.x and enkf.x) to be compiled. For example,

 31/134 Test #1964: test_gdasapp_C96C48_hybatmDA_gdas_anal_202112210000 ......................***Failed  210.87 sec

failed because gsi.x does not exist

+ exglobal_atmos_analysis.sh[934]: /bin/cp -p /scratch1/NCEPDEV/da/role.jedipara/CI/GDASApp/workflow/PR/1353/global-workflow/exec/gsi.x /scratch1/NCEPDEV/stmp2/role.jedipara/RUNDIRS/C96C48_hybatmDA/gdas.2021122100/anal.4045548
/bin/cp: cannot stat '/scratch1/NCEPDEV/da/role.jedipara/CI/GDASApp/workflow/PR/1353/global-workflow/exec/gsi.x': No such file or directory
+ exglobal_atmos_analysis.sh[1]: postamble exglobal_atmos_analysis.sh 1733177865 1
+ preamble.sh[70]: set +x
End exglobal_atmos_analysis.sh at 22:18:30 with error code 1 (time elapsed: 00:00:45)
+ JGLOBAL_ATMOS_ANALYSIS[1]: postamble JGLOBAL_ATMOS_ANALYSIS 1733177846 1

Script /scratch1/NCEPDEV/da/role.jedipara/CI/GDASApp/repo/ci/run_gw_ci.sh builds g-w using

rm -rf log.build
./build_all.sh -u &>> log.build
build_status=$?

-u builds GDASApp. This was fine when our g-w ctests did not include GSI-based DA. The expanded set of tests now includes GSI-based DA. We need to add -g to build GSI.

I changed /scratch1/NCEPDEV/da/role.jedipara/CI/GDASApp/repo/ci/run_gw_ci.sh to read

rm -rf log.build
./build_all.sh -ugk &>> log.build
build_status=$?

I added -k to kill all builds build in case any build fails.

@guillaumevernieres
Copy link
Contributor Author

Damn ... Never ending story. Of course the one liner change I decide not to test just failed.
To the reviewers: I'll run the marine DA related "ci" manually on msu tomorrow.

@guillaumevernieres
Copy link
Contributor Author

Multiple ctests are failing / will fail because the expanded ctests suite now requires the GSI package (gsi.x and enkf.x) to be compiled. For example,

 31/134 Test #1964: test_gdasapp_C96C48_hybatmDA_gdas_anal_202112210000 ......................***Failed  210.87 sec

failed because gsi.x does not exist

+ exglobal_atmos_analysis.sh[934]: /bin/cp -p /scratch1/NCEPDEV/da/role.jedipara/CI/GDASApp/workflow/PR/1353/global-workflow/exec/gsi.x /scratch1/NCEPDEV/stmp2/role.jedipara/RUNDIRS/C96C48_hybatmDA/gdas.2021122100/anal.4045548
/bin/cp: cannot stat '/scratch1/NCEPDEV/da/role.jedipara/CI/GDASApp/workflow/PR/1353/global-workflow/exec/gsi.x': No such file or directory
+ exglobal_atmos_analysis.sh[1]: postamble exglobal_atmos_analysis.sh 1733177865 1
+ preamble.sh[70]: set +x
End exglobal_atmos_analysis.sh at 22:18:30 with error code 1 (time elapsed: 00:00:45)
+ JGLOBAL_ATMOS_ANALYSIS[1]: postamble JGLOBAL_ATMOS_ANALYSIS 1733177846 1

Script /scratch1/NCEPDEV/da/role.jedipara/CI/GDASApp/repo/ci/run_gw_ci.sh builds g-w using

rm -rf log.build
./build_all.sh -u &>> log.build
build_status=$?

-u builds GDASApp. This was fine when our g-w ctests did not include GSI-based DA. The expanded set of tests now includes GSI-based DA. We need to add -g to build GSI.

I changed /scratch1/NCEPDEV/da/role.jedipara/CI/GDASApp/repo/ci/run_gw_ci.sh to read

rm -rf log.build
./build_all.sh -ugk &>> log.build
build_status=$?

I added -k to kill all builds build in case any build fails.

Thanks for digging into this @RussTreadon-NOAA .

@guillaumevernieres guillaumevernieres removed the hera-GW-RT-Failed Automated testing with global-workflow failed on Hera label Dec 3, 2024
@guillaumevernieres
Copy link
Contributor Author

@DavidNew-NOAA , @RussTreadon-NOAA , @CoryMartin-NOAA , @danholdaway , I'm encapsulating the new ctests inside of cmake options and I'm turning them off by default. For the time being, developers will have to manually turn on/off the tests that are relevant to the code changes. Not great but a simple way to move forward until we have a very low-res atmosphere.

@DavidNew-NOAA
Copy link
Collaborator

DavidNew-NOAA commented Dec 3, 2024

@guillaumevernieres What is the problem with having these tests on by default? If we do encapsulate them in CMake options, then we need a way to turn those options on at build time for nightly testing and PR testing

@guillaumevernieres
Copy link
Contributor Author

@guillaumevernieres What is the problem with having these tests on by default? If we do encapsulate them in CMake options, then we need a way to turn those options on at build time for nightly testing and PR testing

@DavidNew-NOAA , the tests are too slow, to the point that they are completely useless right now. It wasn't even 1/2 way through this am when I checked.
You can add the options to your cmake build to turn them on for "over-night" testing, but we shouldn't run all these for pr testing.

@guillaumevernieres
Copy link
Contributor Author

testing on my side:
ctests-1353
something fishy with the ens. fcst. rocoto job ... but the ens. forecast was successful.
Here's the rest:
more-ctests-1353

Copy link
Collaborator

@shlyaeva shlyaeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy to reapprove all the previous changes (with the caveat of non expert review), but I have no opinion on the latest changes (encapsulating tests in the cmake options). Since it seems that it needs discussion and is not fully related to this PR one option would be to separate that change in a different PR (unless it can be discussed and decided in this PR in a reasonable time).

@guillaumevernieres
Copy link
Contributor Author

I am happy to reapprove all the previous changes (with the caveat of non expert review), but I have no opinion on the latest changes (encapsulating tests in the cmake options). Since it seems that it needs discussion and is not fully related to this PR one option would be to separate that change in a different PR (unless it can be discussed and decided in this PR in a reasonable time).

Good point. I'll add @DavidNew-NOAA to the PR review, see what he says first. As it is right now, we can't run the ci that we trigger through the github labels.

Copy link
Collaborator

@DavidNew-NOAA DavidNew-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most is fine by me. I'll just comment on the testing stuff:

  1. After some thought, I think it will be easier to only run a subset of tests at the command line rather than with CMake flags. For the run_gw_ci.sh script in GDASApp here, we run:

ctest -j${NTASKS_TESTS} -R gdasapp --output-on-failure &>> log.ctest

We can just run ctest -R C96C48_ufs_hybatmDA or ctest -R C48mx500_3DVarAOWCDA depending on which label one adds to a PR on GitHub.

It will be much easier for me to implement that than turning on those CMake flags at build time, because build.sh in GDASApp is called by build_gdas.sh in GW which is called by build_all.sh, so I'll have to had several different options tobuild_all.sh.

  1. I was meaning to add a test reference for the C384mx025 marinevar ctest, but I can do it in a separate PR. Up to you.

Copy link
Collaborator

@AndrewEichmann-NOAA AndrewEichmann-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objections from inspection

@guillaumevernieres
Copy link
Contributor Author

Most is fine by me. I'll just comment on the testing stuff:

  1. After some thought, I think it will be easier to only run a subset of tests at the command line rather than with CMake flags. For the run_gw_ci.sh script in GDASApp here, we run:

ctest -j${NTASKS_TESTS} -R gdasapp --output-on-failure &>> log.ctest

We can just run ctest -R C96C48_ufs_hybatmDA or ctest -R C48mx500_3DVarAOWCDA depending on which label one adds to a PR on GitHub.

It will be much easier for me to implement that than turning on those CMake flags at build time, because build.sh in GDASApp is called by build_gdas.sh in GW which is called by build_all.sh, so I'll have to had several different options tobuild_all.sh.

  1. I was meaning to add a test reference for the C384mx025 marinevar ctest, but I can do it in a separate PR. Up to you.

We're not going to run these test on hera for this pr, so I can go back to running the test by default if that helps your nightly testing @DavidNew-NOAA . No need yet for a reference file for the c384.

Copy link
Collaborator

@KatherineLukens-NOAA KatherineLukens-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second Andy's comment. No objections here.

Copy link
Collaborator

@DavidNew-NOAA DavidNew-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @guillaumevernieres . I will work on specific CI labels for GitHub testing when Hera is back up

@guillaumevernieres guillaumevernieres merged commit d91663b into develop Dec 3, 2024
5 checks passed
@guillaumevernieres guillaumevernieres deleted the feature/marineenvar branch December 3, 2024 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants