Add documentation of data processing steps
chpolste committed Aug 28, 2023
1 parent a4115e4 commit b25f7cc
Showing 1 changed file with 50 additions and 3 deletions.
53 changes: 50 additions & 3 deletions README.md
@@ -19,15 +19,62 @@ Clone this repository:
- Python 3 with packages listed in [`requirements.txt`](requirements.txt) (the specified versions were used during development, older/newer versions may or may not work)


### Data Analysis and Plots
### Data Download, Processing and Plots

...
$ make

is configured to download and process all required data and create all figures.
The following steps are carried out:

- Download ERA5 data (`src/download_ERA5.py`, see notes below).
- Compute climatological means of ERA5 variables and PV on isentropes (`src/calculate_means.py`).
- Plot Figure 1 (`src/plot_barotropic.py`).
- Plot Figure 2 (`src/plot_schematic.py`).
- Compute isentropic Ertel PV from the ERA5 data (`src/calculate_pv.py`) and its zonal wavenumber spectrum (`src/calculate_sprops.py`).
- Compute 14-day rolling mean of PV (`src/calculate_rollmean.py`) and its zonal wavenumber spectrum.
- Compute the zonal wavenumber spectrum of climatological mean PV.
- Compute 90° rolling-zonalized PV on 330 K, its climatological mean and zonal wavenumber spectrum, and analyze waveguide occurrence.
- Compute 60° rolling-zonalized PV on 330 K (`src/calculate_pvrz.py`), its climatological mean and zonal wavenumber spectrum, and analyze waveguide occurrence.
- Compute 60° rolling-zonalized PV on 345, 340, 335, 325, 320 and 315 K and analyze waveguide occurrence.
- Plot Figure 3 (`src/plot_climatology.py`).
- Plot Figure 4 (`src/plot_episode.py`).

Python extensions are compiled as needed in between.
Use the `--dry-run` option of make to see which commands will be run without executing them.
Parallelizing with the `-j` option of make is generally neither necessary nor recommended, because the data processing is already parallelized with the default [dask](https://www.dask.org/) scheduler.
Most scripts will provide a progress bar while running.
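
Several of the steps listed above compute a zonal wavenumber spectrum. As a rough illustration of what such a spectrum contains, the following sketch decomposes values along a single latitude circle into wavenumber amplitudes with a naive discrete Fourier transform from the standard library. It is not the implementation in `src/calculate_sprops.py`; the function name `zonal_wavenumber_amplitudes` and the synthetic input are assumptions made for this example only.

```python
import cmath
import math


def zonal_wavenumber_amplitudes(values, max_wavenumber):
    """Amplitudes of zonal wavenumbers 0..max_wavenumber.

    `values` are samples at equally spaced longitudes around one latitude
    circle. Hypothetical helper, not part of this repository.
    """
    n = len(values)
    if not 0 <= max_wavenumber <= n // 2:
        raise ValueError("max_wavenumber must lie between 0 and n // 2")
    amplitudes = []
    for k in range(max_wavenumber + 1):
        # Naive DFT coefficient for wavenumber k, normalized by sample count
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(values)
        ) / n
        # For real input, wavenumbers k and -k carry the same amplitude, so
        # fold them together (wavenumber 0 and the Nyquist wavenumber are unpaired)
        scale = 1.0 if k == 0 or 2 * k == n else 2.0
        amplitudes.append(scale * abs(coeff))
    return amplitudes


# Synthetic data: zonal mean of 2.0 plus a wavenumber-3 wave of amplitude 0.5
n_lon = 72
field = [2.0 + 0.5 * math.cos(3 * 2 * math.pi * i / n_lon) for i in range(n_lon)]
amps = zonal_wavenumber_amplitudes(field, max_wavenumber=5)
```

Here `amps[0]` recovers the zonal mean and `amps[3]` the amplitude of the synthetic wave, while the other entries are zero up to rounding error.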

The approximate size of the downloaded ERA5 dataset is 190 GB.
To start the downloads without the data processing, use

$ make reanalysis

If you already have a similar dataset containing temperature and horizontal wind components on pressure levels, it should generally be possible to substitute these files.
A few changes to the `Makefile` will be necessary, e.g. setting new file paths and adapting the number of timesteps in the window for the rolling temporal mean.
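
When substituting a dataset with a different temporal resolution, the number of timesteps in the 14-day window has to match that resolution. The sketch below shows the arithmetic. The helper names and the 6-hourly and 3-hourly resolutions are assumptions for illustration, and the plain, uncentered window in `rolling_mean` may differ from what `src/calculate_rollmean.py` does.

```python
from datetime import timedelta


def window_timesteps(window, timestep):
    """Number of timesteps that make up `window` (hypothetical helper)."""
    steps, remainder = divmod(window, timestep)
    if remainder:
        raise ValueError("window is not a whole multiple of the timestep")
    return steps


def rolling_mean(series, steps):
    """Mean over every full window of `steps` consecutive values."""
    return [
        sum(series[i:i + steps]) / steps
        for i in range(len(series) - steps + 1)
    ]


# Assumed resolutions, not taken from this repository
steps_6h = window_timesteps(timedelta(days=14), timedelta(hours=6))  # 56
steps_3h = window_timesteps(timedelta(days=14), timedelta(hours=3))  # 112

# Tiny demonstration with a 3-step window
smoothed = rolling_mean([1.0, 2.0, 3.0, 4.0, 5.0], 3)  # [2.0, 3.0, 4.0]
```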


### Output

...
NetCDF files with intermediate processed fields are written to the `data` directory.
File names use the prefixes

- `data/PV-*.nc`: potential vorticity,
- `data/PVrm-*.nc`: 14-day rolling-mean potential vorticity,
- `data/PVrz-*.nc`: rolling-zonalized potential vorticity,

and suffixes

- `data/*-mean.nc`: climatological and seasonal means,
- `data/*-sprop.nc`: mean zonal wavenumber spectrum,
- `data/*-occur.nc`: waveguide occurrence frequencies.
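
The naming scheme can be read programmatically, for example to sort intermediate files by content. The following is a minimal sketch that assumes the parts of a file name are separated by hyphens, as the patterns above suggest. The function `describe` and the example file names are hypothetical; the actual names are determined by the `Makefile`.

```python
from pathlib import PurePosixPath

# Prefixes and suffixes as listed in this README
PREFIXES = {
    "PV": "potential vorticity",
    "PVrm": "14-day rolling-mean potential vorticity",
    "PVrz": "rolling-zonalized potential vorticity",
}
SUFFIXES = {
    "mean": "climatological and seasonal means",
    "sprop": "mean zonal wavenumber spectrum",
    "occur": "waveguide occurrence frequencies",
}


def describe(path):
    """Return (field, product) descriptions; None where a part is unknown."""
    parts = PurePosixPath(path).stem.split("-")
    field = PREFIXES.get(parts[0])
    # Only look for a suffix if the name has more than one hyphenated part
    product = SUFFIXES.get(parts[-1]) if len(parts) > 1 else None
    return field, product


# Hypothetical file names, for illustration only
example_mean = describe("data/PVrz-example-mean.nc")
example_plain = describe("data/PV-example.nc")
```

For the hypothetical names above, `example_mean` pairs the rolling-zonalized field with the climatological and seasonal means, and `example_plain` has no recognized suffix.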

Figures are written to the `figures` directory:

- Figure 1: `figures/barotropic.pdf`,
- Figure 2: `figures/schematic.pdf`,
- Figure 3: `figures/climatology.pdf`,
- Figure 4: `figures/episode.pdf`.


## The `waveguide` Python Package