Collection with monthly CESM output files (history files) #55
Comments
@AJueling, do you mind if I transfer this issue to the https://github.com/NCAR/intake-esm-datastore repo instead? I am planning on commenting once it's there.
Thanks for the quick reply! I don't mind if you move it, of course. (I was not sure where to ask this in the first place.)
Are you working with time-slices (history files i.e. do you have one time step in each file with a bunch of data variables) or time-series (multiple time steps with one data variable)? As @matt-long pointed out in intake/intake-esm#112
Unfortunately, this issue of
If you were working with time-series (single data variable per file), the following would address the issue:
{
  "esmcat_version": "0.1.0",
  "id": "CESM_simulations",
  "description": "This is an ESM collection for CESM1 simulations.",
  "catalog_file": "simulations.csv",
  "attributes": [
    {
      "column_name": "component",
      "vocabulary": ""
    },
    {
      "column_name": "frequency",
      "vocabulary": ""
    },
    {
      "column_name": "experiment",
      "vocabulary": ""
    },
    {
      "column_name": "variable",
      "vocabulary": ""
    },
    {
      "column_name": "time_range",
      "vocabulary": ""
    }
  ],
  "assets": {
    "column_name": "path",
    "format": "netcdf"
  },
  "aggregation_control": {
    "variable_column_name": "variable",
    "groupby_attrs": [
      "component",
      "experiment",
      "stream"
    ],
    "aggregations": [
      {
        "type": "union",
        "attribute_name": "variable"
      },
      {
        "type": "join_existing",
        "attribute_name": "time_range",
        "options": {
          "dim": "time",
          "coords": "minimal",
          "compat": "override"
        }
      }
    ]
  }
}
For reference, take a look at the collection for CESM2 runs (timeseries): https://github.com/NCAR/intake-esm-datastore/blob/master/catalogs/campaign-cesm2-cmip6-timeseries.json.
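Before loading a collection file like the one above, it can help to check that every column named under aggregation_control is also declared under attributes. The following is only a sketch using the standard library: undeclared_columns is a hypothetical helper, not part of intake-esm, the trimmed spec is made up for illustration, and the rule that aggregation columns should be declared as attributes is an assumption rather than something the library is known to enforce.

```python
import json


def undeclared_columns(spec):
    """Return columns used in aggregation_control but not listed in attributes.

    Hypothetical helper for illustration; this is not an intake-esm function.
    """
    declared = {attr.get("column_name") for attr in spec.get("attributes", [])}
    control = spec.get("aggregation_control", {})

    # Collect every column name the aggregation settings refer to.
    referenced = list(control.get("groupby_attrs", []))
    referenced += [agg["attribute_name"] for agg in control.get("aggregations", [])]
    referenced.append(control.get("variable_column_name"))

    return sorted(
        {name for name in referenced if name is not None and name not in declared}
    )


# Trimmed, made-up spec in the same layout as the collection file.
spec = json.loads("""
{
  "attributes": [
    {"column_name": "component"},
    {"column_name": "experiment"},
    {"column_name": "variable"}
  ],
  "aggregation_control": {
    "variable_column_name": "variable",
    "groupby_attrs": ["component", "experiment", "stream"],
    "aggregations": [
      {"type": "union", "attribute_name": "variable"},
      {"type": "join_existing", "attribute_name": "time_range"}
    ]
  }
}
""")

missing = undeclared_columns(spec)
print(missing)
```

A misspelled key (for example one with a stray leading space) would show up here as a missing column, because the attribute entry would no longer provide a "column_name".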
@andersy005 thank you for the reply. I am indeed working with time-slice files that contain many variables, which is the standard output format of CESM as far as I know. It's good to know that it does not work for my use case, and I will use a different approach. I suppose we can close this for now and I will follow @matt-long's issue for any updates.
It's likely that this issue is of interest to other users, so let's leave it open (as a reference) until multi-variable files are supported.
@AJueling, just wanted to let you know that we've been working on functionality for building and using catalogs for CESM runs. Recently, @mgrover1 put together a great blog post with details on how to build a catalog from CESM history files: https://ncar.github.io/esds/posts/ecgtools-history-files-example/
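The general idea of turning a directory of history files into catalog rows can be sketched with the standard library alone. This is not the ecgtools API from the blog post. It assumes file names of the form <case>.<model>.<stream>.<YYYY-MM>.nc, and the paths, the model-to-component mapping, and the column names below are all made up for illustration.

```python
import csv
import io
import re

# Assumed naming pattern: <case>.<model>.<stream>.<YYYY-MM>.nc
HISTORY_FILE = re.compile(
    r"^(?P<case>.+)\.(?P<model>[a-z]+)\.(?P<stream>h\d*)\.(?P<date>\d{4}-\d{2})\.nc$"
)

# Hypothetical mapping from model name to the component label used in the catalog.
MODEL_TO_COMPONENT = {"pop": "ocn", "cam": "atm"}

COLUMNS = ["component", "stream", "experiment", "date", "path"]


def catalog_rows(paths, experiment):
    """Build one catalog row per history file, skipping names that do not match."""
    rows = []
    for path in sorted(paths):
        name = path.rsplit("/", 1)[-1]
        match = HISTORY_FILE.match(name)
        if match is None:
            continue  # not a history file under the assumed pattern
        rows.append({
            "component": MODEL_TO_COMPONENT.get(match["model"], match["model"]),
            "stream": match["stream"],
            "experiment": experiment,
            "date": match["date"],
            "path": path,
        })
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up paths for illustration only.
paths = [
    "/data/CTRL/ctrl_case.pop.h.0001-02.nc",
    "/data/CTRL/ctrl_case.pop.h.0001-01.nc",
    "/data/CTRL/README.txt",
]
rows = catalog_rows(paths, experiment="CTRL")
print(to_csv(rows))
```

Sorting the paths keeps the monthly files in chronological order within a case, which is the order a later concatenation along time would need.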
We have many different CESM simulations and I would like to create an intake-esm collection of them. The output files are monthly mean netCDF files and contain many variables.
I have created a collection.json file and a simulations.csv. With these I can create a catalogue:
cat = intake.open_esm_datastore('collection.json').search(experiment=['CTRL'])
The search returns results, but when I create a dataset with
dset_dict = cat.to_dataset_dict(cdf_kwargs={'decode_times': False})
it returns a dataset with only a single time coordinate. Calling
dset_dict['ocn.monthly.CTRL']
shows the same. How do I concatenate along the time axis?