
Adding new Feature/timestep #87

Merged · 40 commits · Sep 25, 2023

Commits
33c33af
Initial commit of feature/timestep branch.
drnimbusrain Aug 4, 2023
d90d616
Merge branch 'develop' into feature/timestep
drnimbusrain Aug 4, 2023
bd23c65
Merge branch 'develop' into feature/timestep
drnimbusrain Aug 5, 2023
b97dcc6
Updated time step feature and new canopy_close_files.
drnimbusrain Aug 5, 2023
a1f329d
Changes to canopy_app structure and fixed firsttime issue.
drnimbusrain Aug 22, 2023
d2b2b39
Added time attributes to output file.
drnimbusrain Aug 25, 2023
8a8b6d5
Updated multiple timesteps, remove closefiles, other changes.
drnimbusrain Aug 29, 2023
30672d9
Added forgotten date_mod codes.
drnimbusrain Aug 29, 2023
ec5bffb
Added timestamps to nc and txt file outputs.
drnimbusrain Aug 29, 2023
4a4828b
Added point example to namelist for completeness.
drnimbusrain Aug 29, 2023
5e9cd29
Reverting default NL to single file for CI test.
drnimbusrain Sep 13, 2023
8e93bec
Updated python script due to txt output file name changes.
drnimbusrain Sep 13, 2023
cb42caf
Added new time stamp suffix to txt output for python script.
drnimbusrain Sep 13, 2023
9584e5b
Timed text files
zmoon Sep 13, 2023
29201c2
Timed nc files
zmoon Sep 13, 2023
7ddc2e5
Revised time step/stamp outputs for TXT and NC.
drnimbusrain Sep 14, 2023
36ff7fd
Updated README
drnimbusrain Sep 14, 2023
16f5e8c
Updated README
drnimbusrain Sep 14, 2023
b9eccfd
Fixed write_txt error for missing argument.
drnimbusrain Sep 14, 2023
205fe59
Changed python/canopy_app.py for new text headers.
drnimbusrain Sep 14, 2023
c018fa0
Trying this way.
drnimbusrain Sep 14, 2023
6da62be
Updated skiprows.
drnimbusrain Sep 14, 2023
47a9ad9
Changed time units to "seconds since"
drnimbusrain Sep 14, 2023
27c86fd
Read time stamp from text files
zmoon Sep 14, 2023
7bd7323
Fix nc reading for now
zmoon Sep 14, 2023
9c374a0
Merge branch 'feature/timestep' of https://github.com/noaa-oar-arl/ca…
zmoon Sep 14, 2023
197357e
Back to decoding nc times
zmoon Sep 14, 2023
095181d
Fixed output time.
drnimbusrain Sep 14, 2023
33e9068
Merge branch 'feature/timestep' of https://github.com/noaa-oar-arl/ca…
drnimbusrain Sep 14, 2023
ce6db12
Changed NL back to 3 times and added point example.
drnimbusrain Sep 22, 2023
2cd881b
Keep default case in CI as single file
zmoon Sep 22, 2023
e5f1979
Fix setup for multiple input files
zmoon Sep 22, 2023
8d126a2
Deal with multiple times in output txt files
zmoon Sep 22, 2023
cdb809a
ntime
zmoon Sep 22, 2023
bb4396d
Updated point files to change vtype and fix fch.
drnimbusrain Sep 22, 2023
cfd14c6
Merge branch 'feature/timestep' of https://github.com/noaa-oar-arl/ca…
drnimbusrain Sep 22, 2023
ed78f30
Replace fill value with NaN when reading txt outputs
zmoon Sep 22, 2023
2a3e662
Fix examples
zmoon Sep 22, 2023
074b5a3
Updated SE txt and nc files for LAI updates from Wei-Ting.
drnimbusrain Sep 24, 2023
daaec36
Removed timestamp from updated default SE txt files.
drnimbusrain Sep 25, 2023
4 changes: 4 additions & 0 deletions .github/workflows/ci.yml
@@ -39,6 +39,10 @@ jobs:

- name: Check that default input is nc
run: |
f90nml -g filenames -v file_vars="'input/gfs.t12z.20220701.sfcf000.canopy.nc'" \
input/namelist.canopy input/namelist.canopy
f90nml -g userdefs -v ntime=1 \
input/namelist.canopy input/namelist.canopy
python -c '
import f90nml
with open("input/namelist.canopy") as f:
6 changes: 5 additions & 1 deletion README.md
@@ -123,7 +123,7 @@ Namelist Option : `file_vars` Full name of input file (Supports either text or

- See example file inputs for variables and format (`gfs.t12z.20220701.sfcf000.canopy.txt` or `gfs.t12z.20220701.sfcf000.canopy.nc`). Example surface met/land/soil inputs are based on NOAA's UFS-GFSv16 inputs initialized on July 01, 2022 @ 12 UTC (forecast at hour 000). Other external inputs for canopy related and other calculated variables are from numerous sources. See [Table 2](#table-2-canopy-app-required-input-variables) below for more information. **Note:** The example GFSv16 domain has been cut to the southeast U.S. region only in this example for size/time constraints here.
- Canopy-App assumes the NetCDF input files are in CF-Convention and test file is based on UFS-GFSv16; recommend using double or float for real variables. Input data must be valid values.
- Canopy-App can also be run with a single point of 1D input data in a text file (e.g. `input_variables_point.txt`).
- Canopy-App can also be run with a single point of 1D input data in a text file (e.g. `point_file_20220701.sfcf000.txt`).

The Canopy-App input data in [Table 2](#table-2-canopy-app-required-input-variables) below is based around NOAA's UFS operational Global Forecast System Version 16 (GFSv16) gridded met data, and is supplemented with external canopy data (from numerous sources) and other external and calculated input variables.

@@ -187,6 +187,10 @@ You can also [generate global inputs using Python (see python/global_data_proces
| Namelist Option | Namelist Description and Units |
| --------------- | ---------------------------------------------------------------------------------- |
| `infmt_opt` | integer for choosing 1D text (= `1`) or 2D NetCDF input file format (= `0`, default) |
| `time_start` | Start/initial time stamp in YYYY-MM-DD-HH:MM:SS.SSSS for simulation/observation inputs |
| `time_end` | End time stamp in YYYY-MM-DD-HH:MM:SS.SSSS for simulation/observation inputs |
| `ntime` | Number of time steps for simulation/observation inputs |
| `time_intvl` | Integer time interval for simulation/observation input time steps in seconds (default = 3600) |
| `nlat` | number of latitude cells (must match # of LAT in `file_vars` above) |
| `nlon` | number of longitude cells (must match # of LON in `file_vars` above) |
| `modlays` | number of model (below and above canopy) layers |
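The table does not spell out how `time_start`, `time_end`, `ntime`, and `time_intvl` relate. Assuming the steps run from `time_start` through `time_end` inclusive at `time_intvl` spacing (consistent with the example namelist's 11:00–13:00, 3600 s, `ntime = 3`), the implied count can be checked with a short stdlib sketch (the function name is illustrative, not part of Canopy-App):

```python
from datetime import datetime

def expected_ntime(time_start: str, time_end: str, time_intvl: int) -> int:
    # Count of steps from time_start to time_end inclusive, spaced
    # time_intvl seconds apart (stamp format as used in namelist.canopy).
    fmt = "%Y-%m-%d-%H:%M:%S.%f"
    t0 = datetime.strptime(time_start, fmt)
    t1 = datetime.strptime(time_end, fmt)
    return int((t1 - t0).total_seconds()) // time_intvl + 1

print(expected_ntime("2022-07-01-11:00:00.0000", "2022-07-01-13:00:00.0000", 3600))  # → 3
```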
Binary file added input/gfs.t12z.20220630.sfcf023.canopy.nc
Binary file not shown.
3,699 changes: 3,699 additions & 0 deletions input/gfs.t12z.20220630.sfcf023.canopy.txt

Large diffs are not rendered by default.

Binary file modified input/gfs.t12z.20220701.sfcf000.canopy.nc
Binary file not shown.
7,396 changes: 3,698 additions & 3,698 deletions input/gfs.t12z.20220701.sfcf000.canopy.txt

Large diffs are not rendered by default.

Binary file added input/gfs.t12z.20220701.sfcf001.canopy.nc
Binary file not shown.
3,699 changes: 3,699 additions & 0 deletions input/gfs.t12z.20220701.sfcf001.canopy.txt

Large diffs are not rendered by default.

2 changes: 0 additions & 2 deletions input/input_variables_point.txt

This file was deleted.

17 changes: 14 additions & 3 deletions input/namelist.canopy
@@ -1,11 +1,22 @@
&FILENAMES
file_vars = 'input/gfs.t12z.20220701.sfcf000.canopy.nc'
! file_vars = 'input/gfs.t12z.20220701.sfcf000.canopy.txt'
file_out = 'output/test_out_southeast_us'
!2D Text and NCF Examples
! Recommend set file_out prefix to initial 'YYYY-MM-DD-HH-MMSS_region_identifier'
file_vars = 'input/gfs.t12z.20220630.sfcf023.canopy.nc' 'input/gfs.t12z.20220701.sfcf000.canopy.nc' 'input/gfs.t12z.20220701.sfcf001.canopy.nc'
! file_vars = 'input/gfs.t12z.20220630.sfcf023.canopy.txt' 'input/gfs.t12z.20220701.sfcf000.canopy.txt' 'input/gfs.t12z.20220701.sfcf001.canopy.txt'
file_out = 'output/2022-07-01-11-0000_southeast_us'

!1D Point Example
! Recommend set file_out prefix to initial 'YYYY-MM-DD-HH-MMSS_point_identifier'
! file_vars = 'input/point_file_20220630.sfcf023.txt' 'input/point_file_20220701.sfcf000.txt' 'input/point_file_20220701.sfcf001.txt'
! file_out = 'output/2022-07-01-11-0000_point'
/

&USERDEFS
infmt_opt = 0
time_start = '2022-07-01-11:00:00.0000'
time_end = '2022-07-01-13:00:00.0000'
ntime = 3
time_intvl = 3600
nlat = 43
nlon = 86
modlays = 100
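The comments above recommend a `'YYYY-MM-DD-HH-MMSS'` prefix for `file_out`. The per-step stamps implied by the `&USERDEFS` values can be sketched with the stdlib (the output format string is an assumption inferred from the example prefix `2022-07-01-11-0000`):

```python
from datetime import datetime, timedelta

def step_stamps(time_start: str, ntime: int, time_intvl: int) -> list[str]:
    # Per-step stamps in the 'YYYY-MM-DD-HH-MMSS' form recommended
    # for file_out prefixes in the namelist comments.
    t0 = datetime.strptime(time_start, "%Y-%m-%d-%H:%M:%S.%f")
    return [
        (t0 + timedelta(seconds=i * time_intvl)).strftime("%Y-%m-%d-%H-%M%S")
        for i in range(ntime)
    ]

print(step_stamps("2022-07-01-11:00:00.0000", 3, 3600))
# → ['2022-07-01-11-0000', '2022-07-01-12-0000', '2022-07-01-13-0000']
```

The first stamp matches the `file_out` prefix in the example above, one stamp per entry in `file_vars`.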
2 changes: 2 additions & 0 deletions input/point_file_20220630.sfcf023.txt
@@ -0,0 +1,2 @@
lat,lon,fh,ugrd10m,vgrd10m,clu,lai,vtype,ffrac,fricv,csz,sfcr,mol,frp,href,sotyp,pressfc,dswrf,shtfl,tmpsfc,tmp2m,spfh2m,hpbl,prate_ave
34.97,270.00,7.0925,-0.0897,2.2551,0.7206,0.8677,4,0.2156,0.1949,0.0236,0.2500,41.9993,7.3748,10.00,4,100620.8125,3.7858,-15.6539,293.3365,293.8452,0.0148,102.2751,0.0000
2 changes: 2 additions & 0 deletions input/point_file_20220701.sfcf000.txt
@@ -0,0 +1,2 @@
lat,lon,fh,ugrd10m,vgrd10m,clu,lai,vtype,ffrac,fricv,csz,sfcr,mol,frp,href,sotyp,pressfc,dswrf,shtfl,tmpsfc,tmp2m,spfh2m,hpbl,prate_ave
34.97,270.00,7.0925,-0.1842,2.5479,0.7134,1.1774,4,0.2156,0.2770,0.2127,0.2500,-130.1519,0.0000,10.00,4,100694.0859,112.6779,14.5789,295.7195,295.4205,0.0156,162.2376,0.0000
2 changes: 2 additions & 0 deletions input/point_file_20220701.sfcf001.txt
@@ -0,0 +1,2 @@
lat,lon,fh,ugrd10m,vgrd10m,clu,lai,vtype,ffrac,fricv,csz,sfcr,mol,frp,href,sotyp,pressfc,dswrf,shtfl,tmpsfc,tmp2m,spfh2m,hpbl,prate_ave
34.97,270.00,7.0925,0.0331,2.8131,0.7117,0.9407,4,0.2156,0.3292,0.4080,0.2500,-42.9638,0.0000,10.00,4,100735.8750,309.3322,74.7410,298.9967,297.7534,0.0160,313.8891,0.0000
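The new point files are plain CSV with a single header row of variable names, so they can be inspected with the stdlib alone. A minimal sketch using a subset of the 24 columns (values copied verbatim from `point_file_20220701.sfcf000.txt`):

```python
import csv
import io

# Header row plus one data row, as in the point_file_*.txt inputs
# (only 5 of the 24 columns shown here for brevity).
sample = "lat,lon,fh,ugrd10m,vgrd10m\n34.97,270.00,7.0925,-0.1842,2.5479\n"
rows = list(csv.DictReader(io.StringIO(sample)))
print(rows[0]["lat"], rows[0]["vgrd10m"])  # → 34.97 2.5479
```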
5 changes: 3 additions & 2 deletions python/README.md
@@ -28,7 +28,7 @@ You can override default namelist options by passing a dictionary to `run()`.
# Point setup
ds = run(
config={
"filenames": {"file_vars": "../input/input_variables_point.txt"},
"filenames": {"file_vars": "../input/point_file_20220701.sfcf000.txt"},
"userdefs": {"infmt_opt": 1, "nlat": 1, "nlon": 1},
},
)
@@ -40,8 +40,9 @@ There are also helper functions for running sets of experiments with different n
from canopy_app import config_cases, run_config_sens

cases = config_cases(
file_vars="../input/input_variables_point.txt",
file_vars="../input/point_file_20220701.sfcf000.txt",
infmt_opt=1,
ntime=1,
nlat=1,
nlon=1,
z0ghc=[0.001, 0.01],
165 changes: 127 additions & 38 deletions python/canopy_app.py
@@ -29,14 +29,20 @@ def _load_default_config() -> f90nml.Namelist:
with open(REPO / "input" / "namelist.canopy") as f:
config = f90nml.read(f)

# Make input paths absolute in default config
for k, v in config["filenames"].items():
p0 = Path(v)
def as_abs(p_str: str) -> str:
p0 = Path(p_str)
if not p0.is_absolute():
p = REPO / p0
else:
p = p0
config["filenames"][k] = p.as_posix()
return p.as_posix()

# Make input paths absolute in default config
for k, v in config["filenames"].items():
if isinstance(v, list):
config["filenames"][k] = [as_abs(v_) for v_ in v]
else:
config["filenames"][k] = as_abs(v)

return config
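The scalar-vs-list handling introduced here can be exercised on its own. A minimal sketch, where `REPO` is a stand-in path rather than the real repo root:

```python
from pathlib import Path

REPO = Path("/repo")  # stand-in for the repo root resolved in canopy_app.py

def as_abs(p_str: str) -> str:
    # Mirror of the helper in _load_default_config: leave absolute
    # paths alone, anchor relative ones at the repo root.
    p0 = Path(p_str)
    return (p0 if p0.is_absolute() else REPO / p0).as_posix()

# file_vars may now be a single path or a list of paths; normalize both.
for v in ("input/a.nc", ["input/a.nc", "input/b.nc"]):
    print([as_abs(x) for x in v] if isinstance(v, list) else as_abs(v))
```

Branching on `isinstance(v, list)` is needed because f90nml returns a plain string for a single `file_vars` entry but a list when the namelist holds several.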

@@ -58,15 +64,15 @@ def out_and_back(p: Path, *, finally_: Callable | None = None):


DEFAULT_POINT_INPUT = pd.read_csv(
REPO / "input" / "input_variables_point.txt", index_col=False
REPO / "input" / "point_file_20220701.sfcf000.txt", index_col=False
)

_TXT_STEM_SUFFS = {
"wind": "_output_canopy_wind",
"waf": "_output_waf",
"eddy": "_output_eddy_Kz",
"phot": "_output_phot",
"bio": "_output_bio",
"wind": "_canopy_wind",
"waf": "_waf",
"eddy": "_eddy",
"phot": "_phot",
"bio": "_bio",
}


@@ -120,20 +126,48 @@ def run(
ofp_stem = output_dir / "out"
full_config["filenames"]["file_out"] = ofp_stem.relative_to(case_dir).as_posix()

# Check input file
ifp = Path(full_config["filenames"]["file_vars"])
if not ifp.is_file():
raise ValueError(f"Input file {ifp.as_posix()!r} does not exist")
if not ifp.is_absolute():
full_config["filenames"]["file_vars"] = ifp.resolve().as_posix()
if ifp.suffix in {".nc", ".nc4", ".ncf"}: # consistent with sr `canopy_check_input`
nc_out = True
assert full_config["userdefs"]["infmt_opt"] == 0
elif ifp.suffix in {".txt"}:
nc_out = False
assert full_config["userdefs"]["infmt_opt"] == 1
# Check input file(s)
input_files_setting = full_config["filenames"]["file_vars"]
if isinstance(input_files_setting, list):
ifps_to_check = [Path(s) for s in input_files_setting]
else:
raise ValueError(f"Unexpected input file type: {ifp.suffix}")
ifps_to_check = [Path(input_files_setting)]
nc_outs = []
for ifp in ifps_to_check:
if not ifp.is_file():
raise ValueError(f"Input file {ifp.as_posix()!r} does not exist")
if ifp.suffix in {
".nc",
".nc4",
".ncf",
}: # consistent with sr `canopy_check_input`
nc_out = True
assert full_config["userdefs"]["infmt_opt"] == 0
elif ifp.suffix in {".txt"}:
nc_out = False
assert full_config["userdefs"]["infmt_opt"] == 1
else:
raise ValueError(f"Unexpected input file extension: {ifp.suffix}")
nc_outs.append(nc_out)
if not len(set(nc_outs)) == 1:
raise ValueError(
f"Expected all input files to be of the same type (nc or txt). "
f"filenames.file_vars: {input_files_setting}."
)
nc_out = nc_outs[0]
if isinstance(input_files_setting, list):
abs_path_strs = []
for s in input_files_setting:
ifp = Path(s)
if not ifp.is_absolute():
abs_path_strs.append(ifp.resolve().as_posix())
else:
abs_path_strs.append(ifp.as_posix())
full_config["filenames"]["file_vars"] = abs_path_strs
else:
ifp = Path(input_files_setting)
if not ifp.is_absolute():
full_config["filenames"]["file_vars"] = ifp.resolve().as_posix()

# Write namelist
if verbose:
@@ -153,7 +187,17 @@

# Load nc
if nc_out:
ds0 = xr.open_dataset(ofp_stem.with_suffix(".nc"))
# Should be just one file, even if multiple output time steps
patt = f"{ofp_stem.name}*.nc"
cands = sorted(output_dir.glob(patt))
if not cands:
raise ValueError(
f"No matches for pattern {patt!r} in directory {output_dir.as_posix()!r}. "
f"Files present are: {[p.as_posix() for p in output_dir.glob('*')]}."
)
if len(cands) > 1:
print("Taking the first nc file only.")
ds0 = xr.open_dataset(cands[0], decode_times=True)
ds = (
ds0.rename_dims(grid_xt="x", grid_yt="y")
.swap_dims(level="z")
@@ -180,22 +224,40 @@ def run(
f"warning: skipping {which!r} ({ifcan}) output since stem suffix unknown."
)
continue
df = read_txt(
ofp_stem.with_name(f"{ofp_stem.name}{stem_suff}").with_suffix(".txt")
)
# NOTE: Separate file for each time
patt = f"{ofp_stem.name}_*{stem_suff}.txt"
cands = sorted(output_dir.glob(patt))
if not cands:
raise ValueError(
f"No matches for pattern {patt!r} in directory {output_dir.as_posix()!r}. "
f"Files present are: {[p.as_posix() for p in output_dir.glob('*')]}."
)
if verbose:
print(f"detected output files for {ifcan}:")
print("\n".join(f"- {p.as_posix()}" for p in cands))
dfs_ifcan = []
for cand in cands:
df_t = read_txt(cand)
df_t["time"] = df_t.attrs["time"]
dfs_ifcan.append(df_t)
df = pd.concat(dfs_ifcan, ignore_index=True)
df.attrs.update(which=which)
df.attrs.update(df_t.attrs)
dfs.append(df)

# Merge
units: dict[str, str] = {}
dss = []
for df in dfs:
if {"lat", "lon", "height"}.issubset(df.columns):
ds_ = df.set_index(["height", "lat", "lon"]).to_xarray().squeeze()
elif {"lat", "lon"}.issubset(df.columns):
ds_ = df.set_index(["lat", "lon"]).to_xarray().squeeze()
if {"time", "lat", "lon", "height"}.issubset(df.columns):
ds_ = df.set_index(["time", "height", "lat", "lon"]).to_xarray().squeeze()
elif {"time", "lat", "lon"}.issubset(df.columns):
ds_ = df.set_index(["time", "lat", "lon"]).to_xarray().squeeze()
else:
raise ValueError("Expected df to have columns 'lat', 'lon' [,'height'].")
raise ValueError(
"Expected df to have columns 'time', 'lat', 'lon' [,'height']. "
f"Got: {sorted(df)}."
)
units.update(df.attrs["units"])
for vn in ds_.data_vars:
assert isinstance(vn, str)
@@ -230,22 +292,30 @@ def read_txt(fp: Path) -> pd.DataFrame:
with open(fp) as f:
for i, line in enumerate(f):
if i == 0:
pattern = r" *time stamp\: *([0-9\.\:\-]*)"
m = re.match(pattern, line)
if m is None:
raise ValueError(
f"Unexpected file format. Line {i} failed to match regex {pattern!r}."
)
time_stamp = pd.Timestamp(m.group(1))
elif i == 1:
pattern = r" *reference height, h\: *([0-9\.]*) m"
m = re.match(pattern, line)
if m is None:
raise ValueError(
f"Unexpected file format. Line {i} failed to match regex {pattern!r}."
)
href = float(m.group(1))
elif i == 1:
elif i == 2:
pattern = r" *number of model layers\: *([0-9]*)"
m = re.match(pattern, line)
if m is None:
raise ValueError(
f"Unexpected file format. Line {i} failed to match regex {pattern!r}."
)
nlay = int(m.group(1))
elif i == 2:
elif i == 3:
# Column names (some with units)
heads = re.split(r" {2,}", line.strip())
names: list[str] = []
@@ -264,17 +334,25 @@
break
else:
raise ValueError(
"Unexpected file format. Expected 3 header lines followed by data."
"Unexpected file format. Expected 4 header lines followed by data."
)

df = pd.read_csv(fp, index_col=False, skiprows=3, header=None, delimiter=r"\s+")
df = pd.read_csv(
fp,
index_col=False,
skiprows=4,
header=None,
delimiter=r"\s+",
dtype=np.float32,
)
if len(names) != len(df.columns):
raise RuntimeError(
f"Unexpected file format. Detected columns names {names} ({len(names)}) "
f"are of a different number than the loaded dataframe ({len(df.columns)})."
)
df = df.replace(np.float32(-9.0e20), np.nan) # fill value, defined in const mod
df.columns = names # type: ignore[assignment]
df.attrs.update(href=href, nlay=nlay, units=units)
df.attrs.update(href=href, nlay=nlay, units=units, time=time_stamp)

return df
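The four-line header that `read_txt` now expects can be checked against the same regexes in isolation. A sketch with the first three header lines (the stamp and values below are illustrative, not taken from a real output file; the fourth line, column names split on runs of spaces, is omitted):

```python
import re

# Header lines as written to the txt outputs, per the regexes in read_txt.
header = [
    " time stamp: 2022-07-01-12:00:00",
    " reference height, h: 10.00 m",
    " number of model layers: 100",
]
time_stamp = re.match(r" *time stamp\: *([0-9\.\:\-]*)", header[0]).group(1)
href = float(re.match(r" *reference height, h\: *([0-9\.]*) m", header[1]).group(1))
nlay = int(re.match(r" *number of model layers\: *([0-9]*)", header[2]).group(1))
print(time_stamp, href, nlay)  # → 2022-07-01-12:00:00 10.0 100
```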

@@ -367,6 +445,16 @@ def config_cases(*, product: bool = False, **kwargs) -> list[dict[str, Any]]:
f"scalar, list[scalar], or list[list[scalar]], got: {type(v)}."
)

if k == "file_vars":
# Only support single time step runs for now
if type(v) is list:
assert type(v[0]) is str
mults[k] = v
else:
assert type(v) is str
sings[k] = v
continue

if (np.isscalar(DEFAULT_CONFIG[_k_sec(k)][k]) and np.isscalar(v)) or (
type(DEFAULT_CONFIG[_k_sec(k)][k]) is list
and type(v) is list
@@ -414,8 +502,9 @@

if __name__ == "__main__":
cases = config_cases(
file_vars="../input/input_variables_point.txt",
file_vars="../input/point_file_20220701.sfcf000.txt",
infmt_opt=1,
ntime=1,
nlat=1,
nlon=1,
z0ghc=[0.001, 0.01],