Rename CCPP suites to non-meaningful names #843

Closed

mkavulich

Description

This PR renames CCPP suites to non-meaningful names, per discussions among CCPP and weather model developers. In this initial round, we have arbitrarily settled on bird names, and they were semi-randomly assigned as suggested by various colleagues and ChatGPT. The only non-random naming was that the names for corresponding suites in the Single Column Model repository were made identical: see that PR for more information.

Attempts were made to avoid confusing or ambiguous names, but suggestions are welcome as these names are not set in stone. The whole point of this renaming is that the suite names should be insignificant, only serving to disambiguate different physics configurations. Scientific and technical information about the suites should be derived from the suite file contents and other documentation.

The old names are mapped to the new names via a JSON format file alias.json for convenience.
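
For illustration only, here is a minimal sketch of how such a lookup could work. The layout of alias.json is not shown in this description, so a flat old-name to new-name mapping is assumed, and the suite names and the `resolve_suite_name` helper below are hypothetical placeholders rather than entries or code from this PR.

```python
import json

# Hypothetical alias.json contents: the real file's layout and entries are not
# shown in this PR description, so a flat {old_name: new_name} object is assumed.
EXAMPLE_ALIAS_JSON = """
{
  "suite_old_example_a": "examplebird_a",
  "suite_old_example_b": "examplebird_b"
}
"""


def resolve_suite_name(name: str, aliases: dict[str, str]) -> str:
    """Return the new suite name for an old one; pass unknown names through."""
    return aliases.get(name, name)


aliases = json.loads(EXAMPLE_ALIAS_JSON)
print(resolve_suite_name("suite_old_example_a", aliases))  # old name is mapped
print(resolve_suite_name("examplebird_b", aliases))        # new name passes through
```

Passing unknown names through unchanged means a script could accept either the old or the new name during a transition.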

Finally, a "DESCRIPTION" comment is added to all suites. I have tried to describe the provenance of each suite file to the best of my knowledge, but comments and corrections are welcome.

Issue(s) addressed

Testing

Have the ufs-weather-model regression tests been run? On what platform?

  • Regression tests were run on Hera; all passed with no baseline changes as expected.

Dependencies

@mkavulich
Author

Just checking in on this PR based on our conversations with some developers last week: we had decided we would continue discussing unresolved concerns on this PR. As a reminder, a detailed presentation describing the reasoning behind these changes is available here: CCPP Suite Renaming slides

To hopefully spur on some continued discussion, I'll summarize what I believe is the main concern around this renaming proposal: that the loss of reference to the "application" (HAFS, GFS, RRFS, etc.) in the filename leads to confusion about which suite is used for what purpose.

Our counter to this concern is several-fold:

  1. The description of the suite's use is best handled by documentation, both within the suite file and externally. Relying on the name of the suite to describe its purpose can never be complete (will always be ambiguous), and opens up potential problems when the use/purpose of a suite changes, or if it has multiple purposes.
  2. Related to point 1: Relying on the naming of a file to describe its purpose means that suites must be renamed constantly to have any hope of not being ambiguous or misleading. Historically this has not been done, and I would argue that it is not desirable: frequent renaming of suites is disruptive to workflows and back-compatibility.
  3. Suites should not be exclusive to one application; there should be nothing stopping, for example, GFS and HAFS from using the same physics suite file, or an RRFS ensemble member from making use of a GFS suite.
  4. Community users who want to know what suite is best for their experiment/forecast purposes should reference detailed documentation, not just rely on naming conventions.

We understand that this is a potentially disruptive change, at least until people adopt and get used to the new system. If this is still a concern holding up the adoption of this PR, or if there are more remaining concerns aside from the above, we are eager to address them and reach a compromise if necessary. We just don't want this PR to languish with no discussion or progress in the meantime.

@lisa-bengtsson
Contributor

Hi Mike, I understand what you're trying to achieve and agree that the SDF name should not include configurations in the title. While I personally think that bird names are a bit far-fetched and unrelated to what we are doing in numerical weather prediction, I don't have any better suggestions. A readable text file that describes which application(s) the suites are currently used for would be desirable. Thanks for letting us chime in.

@yangfanglin
Collaborator

I also agree there is no perfect solution to address the issues. As I mentioned in the meeting, it is desirable to set up a web page or GitHub docs, accessible to all, to document the meaning of each suite, its history, and the applications that use or used the suite.

@mkavulich
Author

mkavulich commented Jun 28, 2024

@yangfanglin This is still a work in progress, but I have added updates to the Users Guide in my weather model PR to include detailed information about all the suites. Is this the kind of documentation you were hoping for?

https://ufs-weather-model-mkavulich.readthedocs.io/en/latest/InputsOutputs.html#the-suite-definition-file-sdf-file

Edit: corrected link for renamed section: https://ufs-weather-model-mkavulich.readthedocs.io/en/latest/InputsOutputs.html#the-suite-definition-file-sdf

@yangfanglin
Collaborator

> @yangfanglin This is still a work in progress, but I have added updates to the Users Guide in my weather model PR to include detailed information about all the suites. Is this the kind of documentation you were hoping for?
>
> https://ufs-weather-model-mkavulich.readthedocs.io/en/latest/InputsOutputs.html#the-suite-definition-file-sdf-file

Mike, Section 4.2.5 on The Suite Definition File looks great.

@grantfirl
Collaborator

grantfirl commented Jul 10, 2024

@junwang-noaa @lisa-bengtsson and all:
Given the constraints/concerns listed by @mkavulich in #843 (comment), we think that using somewhat random names for SDFs is the best solution that has been proposed to date. There is plenty of precedent for other complex software (and even hardware) resorting to a similar solution. E.g., many OSes do this to refer to their versions, and Intel, AMD, etc. often name processors like this too. I think that it is ultimately a recognition that the software/hardware is too complex to boil down into a meaningful name. So, if you want a name that is human-referable, something random is less error-prone than trying to capture pieces of the underlying complexity in the name, since those pieces can and do become actively misleading as soon as the underlying complexity changes.

Since one needs 3 pieces of information to fully describe what physics was run:

  1. the SDF (compile time group/order of schemes)
  2. the NML (runtime options)
  3. the commit hash (snapshot in time and/or development)

removing actively misleading names from the SDF is an improvement in the situation, as I see it.
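
A rough sketch of the point above, assuming a simple record type that keeps the three pieces together. The class, its field names, and the values are hypothetical placeholders and are not part of CCPP or the UFS weather model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicsProvenance:
    """The three items listed above; field names are illustrative only."""
    sdf: str          # suite definition file (compile-time groups/order of schemes)
    namelist: str     # namelist file (runtime options)
    commit_hash: str  # code snapshot the run was built from

    def label(self) -> str:
        """Join all three parts so no single part stands in for the whole."""
        return f"{self.sdf}+{self.namelist}@{self.commit_hash}"


# Placeholder values, not taken from any real run.
run_a = PhysicsProvenance("example_suite.xml", "example_input.nml", "abc1234")
run_b = PhysicsProvenance("example_suite.xml", "example_input.nml", "def5678")

print(run_a.label())
print(run_a == run_b)  # same SDF and namelist, different commit: not the same physics
```

Two runs with the same SDF name compare as different here whenever the namelist or commit differs, which is why the SDF name alone cannot carry the full description.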

I don't think that anyone is specifically attached to the bird names, but I think that any other topic chosen to organize the randomness of the names will likely have the same problem. Even using meteorological phenomena as names is still "random". I.e., there will be nothing about the suite that is related to "thunder" if we call the SDF that.

If we want to abandon the human-referable requirement, there are many naming schemes that have been suggested that produce a hard-to-read alphanumeric name that is more representative of what is inside the file, but since the file is a human-readable XML and since there is and will continue to be documentation about what SDFs are used for what applications, it is easy enough for users to either read the contents or documentation to orient themselves regarding the SDFs, IMO.
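
As a sketch of what a content-derived name could look like: the thread does not say which schemes were suggested, so a truncated SHA-256 of the file contents is assumed here purely as an example. The `content_name` helper and the XML snippets are hypothetical.

```python
import hashlib


def content_name(sdf_bytes: bytes, length: int = 12) -> str:
    """Derive a short name from file contents (SHA-256, truncated hex)."""
    return hashlib.sha256(sdf_bytes).hexdigest()[:length]


# Placeholder XML snippets, not real suite definitions.
suite_one = b"<suite><group name='physics'/></suite>"
suite_two = b"<suite><group name='radiation'/></suite>"

print(content_name(suite_one))
print(content_name(suite_two))
```

Such a name changes whenever the file changes, but it is hard to read or say aloud, which is the trade-off described above.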

Any further thoughts on this? We need to decide to adopt this, something else, or maintain the status quo (which is still actively misleading and lacks proper documentation in most cases!).

@BinLiu-NOAA
Collaborator

I recall that during one of the telecon discussions, there was a specific concern about using random suite file names (e.g., bird names): one will not easily know the connections between related suites.

Here is what a developer/user sees in the FV3/ccpp/suites directory with the new naming convention:

suites>ls -1 *.xml
albatross.xml
auklet.xml
bald_eagle.xml
bluebird.xml
canary.xml
chickadee.xml
condor.xml
crossbill.xml
crow.xml
dotterel.xml
dove.xml
egret.xml
falcon.xml
flamingo.xml
flycatcher.xml
heron.xml
hornbill.xml
hummingbird.xml
ibis.xml
kestrel.xml
kingfisher.xml
loon.xml
lorikeet.xml
magpie.xml
meadowlark.xml
parakeet.xml
pelican.xml
penguin.xml
pigeon.xml
ptarmigan.xml
puffin.xml
raven.xml
sandpiper.xml
shrike.xml
snowy_owl.xml
starling.xml
tanager.xml
tern.xml
toucan.xml
turnstone.xml
wren.xml

In contrast, here are the current suite files:

suites>ls -1 *.xml
suite_FV3_GFS_v15p2.xml
suite_FV3_GFS_v15_thompson_mynn_lam3km.xml
suite_FV3_GFS_v16_csawmg.xml
suite_FV3_GFS_v16_flake.xml
suite_FV3_GFS_v16_fv3wam.xml
suite_FV3_GFS_v16_ras.xml
suite_FV3_GFS_v16.xml
suite_FV3_GFS_v17_coupled_p8_c3.xml
suite_FV3_GFS_v17_coupled_p8_sfcocn.xml
suite_FV3_GFS_v17_coupled_p8_ugwpv1.xml
suite_FV3_GFS_v17_coupled_p8.xml
suite_FV3_GFS_v17_p8_c3.xml
suite_FV3_GFS_v17_p8_mynn.xml
suite_FV3_GFS_v17_p8_rrtmgp.xml
suite_FV3_GFS_v17_p8_ugwpv1.xml
suite_FV3_GFS_v17_p8.xml
suite_FV3_global_nest_v1.xml
suite_FV3_HAFS_v1_gfdlmp_tedmf_nonsst.xml
suite_FV3_HAFS_v1_gfdlmp_tedmf.xml
suite_FV3_HAFS_v1_thompson_nonsst.xml
suite_FV3_HAFS_v1_thompson_tedmf_gfdlsf.xml
suite_FV3_HAFS_v1_thompson.xml
suite_FV3_HRRR_c3.xml
suite_FV3_HRRR_gf_nogwd.xml
suite_FV3_HRRR_gf.xml
suite_FV3_HRRR.xml
suite_FV3_RAP_cires_ugwp.xml
suite_FV3_RAP_clm_lake.xml
suite_FV3_RAP_flake.xml
suite_FV3_RAP_noah_sfcdiff_cires_ugwp.xml
suite_FV3_RAP_noah.xml
suite_FV3_RAP_sfcdiff.xml
suite_FV3_RAP_unified_ugwp.xml
suite_FV3_RAP.xml
suite_FV3_RRFS_v1beta.xml
suite_FV3_RRFS_v1nssl.xml
suite_FV3_WoFS_v0.xml
suite_RRFSens_phy1.xml
suite_RRFSens_phy2.xml
suite_RRFSens_phy3.xml
suite_RRFSens_phy4.xml
suite_RRFSens_phy5.xml

In terms of readability, I think the original filenames at least provide some useful information (potential contents and the relationship between similar suites) just from seeing the file names. However, it is not easy to get useful information from the random naming convention.

Even just taking out the "suite_*_" prefix and converting to lowercase:

suites>ls -1 *.xml | sed -e 's/suite_FV3_//g' -e 's/suite_//g' | tr 'A-Z' 'a-z'
gfs_v15p2.xml
gfs_v15_thompson_mynn_lam3km.xml
gfs_v16_csawmg.xml
gfs_v16_flake.xml
gfs_v16_fv3wam.xml
gfs_v16_ras.xml
gfs_v16.xml
gfs_v17_coupled_p8_c3.xml
gfs_v17_coupled_p8_sfcocn.xml
gfs_v17_coupled_p8_ugwpv1.xml
gfs_v17_coupled_p8.xml
gfs_v17_p8_c3.xml
gfs_v17_p8_mynn.xml
gfs_v17_p8_rrtmgp.xml
gfs_v17_p8_ugwpv1.xml
gfs_v17_p8.xml
global_nest_v1.xml
hafs_v1_gfdlmp_tedmf_nonsst.xml
hafs_v1_gfdlmp_tedmf.xml
hafs_v1_thompson_nonsst.xml
hafs_v1_thompson_tedmf_gfdlsf.xml
hafs_v1_thompson.xml
hrrr_c3.xml
hrrr_gf_nogwd.xml
hrrr_gf.xml
hrrr.xml
rap_cires_ugwp.xml
rap_clm_lake.xml
rap_flake.xml
rap_noah_sfcdiff_cires_ugwp.xml
rap_noah.xml
rap_sfcdiff.xml
rap_unified_ugwp.xml
rap.xml
rrfs_v1beta.xml
rrfs_v1nssl.xml
wofs_v0.xml
rrfsens_phy1.xml
rrfsens_phy2.xml
rrfsens_phy3.xml
rrfsens_phy4.xml
rrfsens_phy5.xml

still seems better in terms of readability and getting relationships between suites.
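
For anyone who prefers to script this, the sed/tr pipeline above can be sketched in Python, along with a grouping step that shows which relationships the simplified names expose. The `simplify` and `group_by_family` helpers are illustrative names, and grouping on the first underscore-separated token is an assumption about what counts as a related suite.

```python
from collections import defaultdict


def simplify(filename: str) -> str:
    """Mirror the sed/tr pipeline above: drop the prefixes, then lowercase."""
    return filename.replace("suite_FV3_", "").replace("suite_", "").lower()


def group_by_family(filenames: list[str]) -> dict[str, list[str]]:
    """Group simplified names by their first underscore-separated token."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in filenames:
        short = simplify(name)
        stem = short.removesuffix(".xml")
        groups[stem.split("_")[0]].append(short)
    return dict(groups)


# A few names copied from the current listing above.
current = [
    "suite_FV3_GFS_v16.xml",
    "suite_FV3_GFS_v16_flake.xml",
    "suite_FV3_HRRR.xml",
    "suite_FV3_HRRR_gf.xml",
    "suite_RRFSens_phy1.xml",
]

for family, members in group_by_family(current).items():
    print(family, members)
```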

If further simplification is preferred, something like the following could be considered:

gfs_v15_a.xml
gfs_v15_b.xml
gfs_v16_a.xml
gfs_v16_b.xml
gfs_v16_c.xml
gfs_v16_d.xml
gfs_v16.xml
gfs_v17_a.xml
gfs_v17_ac.xml
gfs_v17_b.xml
gfs_v17_bc.xml
gfs_v17_cc.xml
gfs_v17_d.xml
gfs_v17_dc.xml
hafs_v1_a.xml
hafs_v1_ac.xml
hafs_v1_b.xml
hafs_v1_bc.xml
hrrr_a.xml
hrrr_b.xml
hrrr_c.xml
hrrr_d.xml
rap_a.xml
rap_b.xml
rap_c.xml
rap_d.xml
rap_e.xml
rap_f.xml
rap_g.xml
rap_h.xml
rrfs_v1_a.xml
rrfs_v1_b.xml
wofs_v0_a.xml
rrfs_ens1.xml
rrfs_ens2.xml
rrfs_ens3.xml
rrfs_ens4.xml
rrfs_ens5.xml

Or, simplifying further, the following could be considered if desired:

global_1.xml
global_2.xml
global_3.xml
global_4.xml
global_5.xml
global_5c.xml
global_6.xml
global_6c.xml
...
regional_1.xml
regional_2.xml
regional_3.xml
regional_4.xml
regional_5.xml
...
hurricane_1.xml
hurricane_1c.xml
hurricane_2.xml
hurricane_2c.xml
...

Again, these are just my own opinions and some ideas/suggestions to be considered (they may or may not be useful).

Also, agreed that documentation like (https://ufs-weather-model-mkavulich.readthedocs.io/en/latest/InputsOutputs.html#the-suite-definition-file-sdf) definitely helps. Meanwhile, it might also be useful to add a README (or README.MD) file describing all the suites inside this FV3/ccpp/suites directory. This would let users/developers easily find the descriptions (no need to go to an external webpage/link). It would also make it easier to keep the descriptions in the README.MD consistent with the suite files in the current directory, since they belong to the same repository.
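
A consistency check of the kind described above could be sketched as follows. This assumes a README.md that mentions each suite by its file name; the `undocumented_suites` helper, the README layout, and the demo file names are hypothetical.

```python
import tempfile
from pathlib import Path


def undocumented_suites(suite_dir: Path, readme_name: str = "README.md") -> list[str]:
    """Return suite file names in suite_dir that the README never mentions."""
    readme_text = (suite_dir / readme_name).read_text(encoding="utf-8")
    suite_files = sorted(p.name for p in suite_dir.glob("*.xml"))
    return [name for name in suite_files if name not in readme_text]


# Demo in a throwaway directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    (demo / "example_a.xml").write_text("<suite/>", encoding="utf-8")
    (demo / "example_b.xml").write_text("<suite/>", encoding="utf-8")
    (demo / "README.md").write_text(
        "example_a.xml: placeholder description\n", encoding="utf-8"
    )
    missing = undocumented_suites(demo)
    print(missing)
```

This is a plain substring check, so it only confirms that each file name appears somewhere in the README, not that the description is accurate.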

@grantfirl
Collaborator

@mkavulich Were you going to move this to draft mode or otherwise close this to revisit later?

@mkavulich
Author

@grantfirl I don't see a way to revert this to draft mode, so I will close it for now and re-open in the future.

@mkavulich mkavulich closed this Oct 18, 2024
Development

Successfully merging this pull request may close these issues.

Change CCPP physics suite names to follow new CCPP guidelines
5 participants