Use HPC-stack libraries on WCOSS2 #975

Merged
GeorgeGayno-NOAA merged 10 commits into ufs-community:develop from wcoss2_stack on Aug 21, 2024

Conversation

@GeorgeGayno-NOAA (Collaborator) commented Aug 12, 2024

DESCRIPTION OF CHANGES:

Update to HPC-stack on WCOSS2. Newer versions of these libraries are now used:

  • HDF5 - from 1.10.6 to 1.14.0.
  • NetCDF - from 4.7.4 to 4.9.2.
  • nemsio - from 2.5.2 to 2.5.4.
  • splib - from 2.3.3 to 2.4.0.
  • ESMF - from 8.4.1 to 8.6.0.

Use the stack version of the nccmp utility in the regression tests instead of my own personal copy.
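For context (not part of the PR itself), here is a minimal sketch of how a regression test might call the stack-provided nccmp to compare an output file against its baseline. The actual UFS_UTILS consistency tests are shell scripts; this Python wrapper, the file names, and the chosen options are illustrative assumptions.

```python
# Illustrative sketch only -- the real UFS_UTILS consistency tests are shell
# scripts; the file names here are placeholders.
import subprocess

def compare_to_baseline(test_file: str, baseline_file: str) -> bool:
    """Run nccmp on two netCDF files and report whether the data match."""
    # -d compares variable data; -S prints difference statistics for any
    # variables that do not match (the kind of report shown later in this PR).
    result = subprocess.run(
        ["nccmp", "-d", "-S", test_file, baseline_file],
        capture_output=True, text=True,
    )
    print(result.stdout, result.stderr, sep="")
    # nccmp exits with a non-zero status when differences are found.
    return result.returncode == 0

if __name__ == "__main__":
    status = compare_to_baseline("out.sfc.tile1.nc", "baseline/out.sfc.tile1.nc")
    print("PASS" if status else "FAIL")
```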

TESTS CONDUCTED:

  • Compile branch on all Tier 1 machines using Intel (Orion, Jet, Hera, Hercules and WCOSS2). Done using 6f42420.
  • Compile branch in 'Debug' mode on WCOSS2. Done using 6f42420.
  • Run unit tests locally on any Tier 1 machine. Done on WCOSS2 using 6f42420.
  • Run relevant consistency tests.
  • The chgres_cube tests were run on Hercules, Hera, Orion and Jet using 6f42420. All passed as expected.
  • The global_cycle, grid_gen, weight_gen, snow2mdl, ocnice_prep and ice_blend tests were run on WCOSS2 (using 6f42420). All tests passed.
  • On WCOSS2, chgres_cube consistency tests 1/2/3/4/11 and 13 failed (using 6f42420). Differences were noted in the wind fields. However, they were insignificant and likely due to the switch to ESMF 8.6.0. For more details, see the chgres_cube wind-difference comment below in this PR (#975).
  • The cpld_gridgen and ocnice_prep tests were run on Hera, Hercules, Orion and Jet (using 6f42420). All passed as expected.
  • On WCOSS2, the cpld_gridgen tests failed (using 6f42420), but the differences from the baseline were small and do not indicate a problem. See the cpld_gridgen discussion in the comments below (#975).

DEPENDENCIES:

NOAA-EMC/global-workflow#2106

DOCUMENTATION:

N/A

ISSUE:

Fixes #877.

@GeorgeGayno-NOAA (Collaborator, Author) commented:

@DavidHuber-NOAA - would you like to do a quick test in the workflow?

@GeorgeGayno-NOAA (Collaborator, Author) commented:

@DeniseWorthen - on WCOSS2, UFS_UTILS now points to updated libraries. In particular, it now uses ESMF 8.6.0. I think this is why your cpld_gridgen regression test is now failing (using 6f42420). The differences look insignificant to me. But could you please check? The log files are on Cactus here: /lfs/h2/emc/global/noscrub/George.Gayno/ufs_utils.git/UFS_UTILS/reg_tests/cpld_gridgen

@DavidHuber-NOAA (Collaborator) commented:

@GeorgeGayno-NOAA Sure, will do.

@GeorgeGayno-NOAA (Collaborator, Author) commented:

On WCOSS2, some chgres_cube consistency tests failed when using 6f42420. All differences from the baseline were in the wind fields. All differences were insignificant. A representative example from test number 2/tile2 is:

Variable Group Count          Sum      AbsSum          Min         Max       Range         Mean      StdDev
u_w      /         5 -8.49366e-07 1.35601e-06 -9.53674e-07 2.38419e-07 1.19209e-06 -1.69873e-07 4.57541e-07
v_w      /         9  2.19792e-07 5.47618e-07 -1.19209e-07 2.38419e-07 3.57628e-07  2.44213e-08 9.51794e-08
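To make the "insignificant" judgment concrete, here is a rough sketch (not from the PR) of the kind of check implied above: compare the maximum absolute difference against the physical range of each wind field. The netCDF4 package and the file names are assumptions; the variable names u_w and v_w come from the nccmp report.

```python
# Rough sketch, not from the PR: file names are placeholders; u_w/v_w are the
# staggered wind components flagged by nccmp above.
import numpy as np
from netCDF4 import Dataset

def roundoff_check(test_file, baseline_file, varnames=("u_w", "v_w")):
    with Dataset(test_file) as t, Dataset(baseline_file) as b:
        for name in varnames:
            test = t.variables[name][:]
            base = b.variables[name][:]
            max_diff = np.abs(test - base).max()
            field_range = float(base.max() - base.min())
            # Max differences of order 1e-7 to 1e-6 m/s on fields spanning tens
            # of m/s are round-off level, consistent with an ESMF library change.
            print(f"{name}: max|diff| = {max_diff:.3e}, field range = {field_range:.3e}")

roundoff_check("out.atm.tile2.nc", "baseline/out.atm.tile2.nc")
```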

@DeniseWorthen (Contributor) commented:

> @DeniseWorthen - on WCOSS2, UFS_UTILS now points to updated libraries. In particular, it now uses ESMF 8.6.0. I think this is why your cpld_gridgen regression test is now failing (using 6f42420). The differences look insignificant to me. But could you please check? The log files are on Cactus here: /lfs/h2/emc/global/noscrub/George.Gayno/ufs_utils.git/UFS_UTILS/reg_tests/cpld_gridgen

Hi George, these look fine. Thanks for checking. All the fields reported as different have physical ranges much larger than the differences, so they appear to be just roundoff-level differences.

@GeorgeGayno-NOAA marked this pull request as ready for review August 12, 2024 18:12
@DavidHuber-NOAA (Collaborator) commented:

@GeorgeGayno-NOAA I tested this PR in the global workflow and compared against the current develop version. I found an unexpected result in the develop version that I do not know how to explain. The half-cycle, 1-hour GDAS forecast produces spurious and fill-valued snow accumulations. However, when using the updated UFS_Utils hash, the snow accumulation values appear to be reasonable.

As far as I know, the first half-cycle does not make use of any UFS_Utils executables and the UFS weather model is not cross-linked with UFS_Utils. Is that correct?

The output 1-hour GRIB2 files can be found here:

develop: /lfs/h2/emc/global/noscrub/david.huber/para/COMROOT/utils_dev/gdas.20211220/18/model_data/atmos/master/gdas.t18z.master.grb2f001
updated ufs_utils: /lfs/h2/emc/global/noscrub/david.huber/para/COMROOT/utils_hpc-stack/gdas.20211220/18/model_data/atmos/master/gdas.t18z.master.grb2f001

@GeorgeGayno-NOAA (Collaborator, Author) commented:

> @GeorgeGayno-NOAA I tested this PR in the global workflow and compared against the current develop version. I found an unexpected result in the develop version that I do not know how to explain. The half-cycle, 1-hour GDAS forecast produces spurious and fill-valued snow accumulations. However, when using the updated UFS_Utils hash, the snow accumulation values appear to be reasonable.
>
> As far as I know, the first half-cycle does not make use of any UFS_Utils executables and the UFS weather model is not cross-linked with UFS_Utils. Is that correct?
>
> The output 1-hour GRIB2 files can be found here:
>
> develop: /lfs/h2/emc/global/noscrub/david.huber/para/COMROOT/utils_dev/gdas.20211220/18/model_data/atmos/master/gdas.t18z.master.grb2f001
> updated ufs_utils: /lfs/h2/emc/global/noscrub/david.huber/para/COMROOT/utils_hpc-stack/gdas.20211220/18/model_data/atmos/master/gdas.t18z.master.grb2f001

Correct. UFS_UTILS is not cross-linked with the UFS weather model.

Are the 'anl' or 00-hr forecast files the same? If so, then I can't explain why the 01-hr forecast would be different.

@DavidHuber-NOAA (Collaborator) commented:

@GeorgeGayno-NOAA Yes, the 00-hr forecast files are the same. Also, the 1-hour (and all hours thereafter) netCDF outputs are identical between the two runs. It is only the GRIB2 files that differ. It seems this is a reproducibility issue in the inline UPP. I will report this and carry on with my comparison.
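For illustration only (not something run as part of this PR), one way to localize which GRIB2 records differ between the two runs is sketched below. It assumes the pygrib package, assumes both files contain the same records in the same order, and uses placeholder names for the two master.grb2f001 paths listed above.

```python
# Illustrative only: paths are placeholders for the two gdas.t18z.master.grb2f001
# files quoted earlier; assumes both files hold the same records in the same order.
import numpy as np
import pygrib

path_develop = "gdas.t18z.master.grb2f001.develop"    # placeholder path
path_updated = "gdas.t18z.master.grb2f001.hpc-stack"  # placeholder path

dev = pygrib.open(path_develop)
new = pygrib.open(path_updated)

for rec_dev, rec_new in zip(dev, new):
    if not np.array_equal(rec_dev.values, rec_new.values):
        # Print enough metadata to identify the differing record (e.g., snow fields).
        print(f"differs: {rec_dev.name}, level {rec_dev.level}")

dev.close()
new.close()
```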

@DavidHuber-NOAA (Collaborator) commented:

All non-GRIB2 files are identical for the first two full cycles. The only exceptions are the sfcanl netCDF files generated by global_cycle. The data within these files are identical, but the HDF5 headers differ. This is an expected result of moving from HDF5 v1.10.6 to v1.14.0 and was observed on other platforms as well.
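A minimal sketch, assuming the netCDF4-python package and placeholder file names, of the check described above: confirm that every variable in the two sfcanl files holds identical data, so any byte-level difference is confined to the HDF5 container metadata rather than the fields themselves.

```python
# Minimal sketch with placeholder file names; not the exact check used in the PR.
import numpy as np
from netCDF4 import Dataset

def data_identical(file_a: str, file_b: str) -> bool:
    """True when both files define the same variables with identical values."""
    with Dataset(file_a) as a, Dataset(file_b) as b:
        if set(a.variables) != set(b.variables):
            return False
        return all(
            np.array_equal(a.variables[name][:], b.variables[name][:])
            for name in a.variables
        )

# The data can match even when the files differ byte-for-byte, because HDF5 1.14
# writes different container/header metadata than HDF5 1.10.
print(data_identical("sfcanl.develop.nc", "sfcanl.hpc-stack.nc"))
```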

@GeorgeGayno-NOAA merged commit cf2da76 into ufs-community:develop Aug 21, 2024
4 checks passed
DavidHuber-NOAA added a commit to DavidHuber-NOAA/UFS_UTILS that referenced this pull request Sep 9, 2024
* origin/develop:
  Update the C192 default ocean resolution in the gdas_init utility (ufs-community#980)
  Use HPC-stack libraries on WCOSS2 (ufs-community#975)
  Fix compiler warning in fre-nctools.fd (ufs-community#969)
  Fix Gnu compilation on Hera (ufs-community#965)
  Update fixed data directory path for Gaea (ufs-community#972)
@GeorgeGayno-NOAA deleted the wcoss2_stack branch September 25, 2024 19:07
Successfully merging this pull request may close these issues: Update WCOSS2 Modulefiles (#877)