
During training, obtaining a Resolution of None Against an Anemoi Data Processed ERA5 Zarr #68

Open · CSyl opened this issue Oct 1, 2024 · 4 comments
Labels: bug (Something isn't working)


CSyl commented Oct 1, 2024

What happened?

I have a zarr that is a subset of the ERA5 zarr in GCP storage, https://console.cloud.google.com/storage/browser/gcp-public-data-arco-era5/ar/1959-2022-1h-360x181_equiangular_with_poles_conservative.zarr. When running anemoi-training train and anemoi-training train --config-name=debug.yaml, I encountered the following error:

AttributeError: 'NoneType' object has no attribute 'lower', raised at line 101 of anemoi/training/data/datamodule.py in _check_resolution.

If the resolution must be set in the configuration file, is there a way to determine the resolution of https://console.cloud.google.com/storage/browser/gcp-public-data-arco-era5/ar/1959-2022-1h-360x181_equiangular_with_poles_conservative.zarr in terms of the "o" notation, or whatever prefix the anemoi-training module accepts as shown in anemoi/training/config/data/zarr.yaml? Then I could set the resolution in the training configuration, which might remove the NoneType error I am receiving.
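For reference, the crash can be reproduced in isolation. The function below is a hypothetical sketch of a check like `_check_resolution` (not the actual anemoi-training code): comparing the two strings case-insensitively raises exactly this AttributeError when the dataset reports no resolution.

```python
from typing import Optional

def check_resolution(dataset_resolution: Optional[str], config_resolution: Optional[str]) -> None:
    # Hypothetical sketch of a check like datamodule.py's _check_resolution:
    # calling .lower() on a None resolution is what produces the AttributeError.
    if config_resolution.lower() != dataset_resolution.lower():
        raise ValueError(
            f"Dataset resolution {dataset_resolution} does not match config resolution {config_resolution}"
        )

try:
    check_resolution(None, "o384")  # dataset resolution missing, as in this issue
except AttributeError as err:
    print(err)  # 'NoneType' object has no attribute 'lower'
```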

What are the steps to reproduce the bug?

Data & graph used: an anemoi-formatted ERA5 zarr subset extracted from https://console.cloud.google.com/storage/browser/gcp-public-data-arco-era5/ar/1959-2022-1h-360x181_equiangular_with_poles_conservative.zarr and preprocessed with the anemoi-datasets module, plus a graph built against that subset with the anemoi-graphs module.

  1. Training configuration file (config.yaml):
defaults:
- data: zarr
- dataloader: native_grid
- diagnostics: eval_rollout
- hardware: example
- graph: multi_scale
- model: gnntransformer
- training: default
- _self_
  2. Data configuration file (anemoi/training/config/data/zarr.yaml):
format: zarr
resolution: o384 #o96
# Time frequency requested from dataset
frequency: 1h #6h
# Time step of model (must be multiple of frequency)
timestep: 1h #6h

# features that are not part of the forecast state
# but are used as forcing to generate the forecast state
forcing:
- "cos_latitude"
- "cos_longitude"
- "sin_latitude"
- "sin_longitude"
- "cos_julian_day"
- "cos_local_time"
- "sin_julian_day"
- "sin_local_time"
- "insolation"
- "lsm"
- "sdor"
- "slor"
- "z"
# features that are only part of the forecast state
# but are not used as the input to the model
diagnostic:
- tp
- cp
remapped:

normalizer:
  default: "mean-std"
  min-max:
  max:
  - "sdor"
  - "slor"
  - "z"
  none:
  - "cos_latitude"
  - "cos_longitude"
  - "sin_latitude"
  - "sin_longitude"
  - "cos_julian_day"
  - "cos_local_time"
  - "sin_julian_day"
  - "sin_local_time"
  - "insolation"
  - "lsm"

imputer:
  default: "none"
remapper:
  default: "none"

# processors including imputers and normalizers are applied in order of definition
processors:
  # example_imputer:
  #   _target_: anemoi.models.preprocessing.imputer.InputImputer
  #   _convert_: all
  #   config: ${data.imputer}
  normalizer:
    _target_: anemoi.models.preprocessing.normalizer.InputNormalizer
    _convert_: all
    config: ${data.normalizer}
  # remapper:
  #   _target_: anemoi.models.preprocessing.remapper.Remapper
  #   _convert_: all
  #   config: ${data.remapper}

# Values set in the code
num_features: null # number of features in the forecast state

  3. Dataloader configuration file (anemoi/training/config/dataloader/native_grid.yaml):
prefetch_factor: 2

num_workers:
  training: 8
  validation: 8
  test: 8
  predict: 8
batch_size:
  training: 2
  validation: 4
  test: 4
  predict: 4

# ============
# Default effective batch_size for training is 16
# For the o96 resolution, default per-gpu batch_size is 2 (8 gpus required)
# The global lr is calculated as:
# global_lr = local_lr * num_gpus_per_node * num_nodes / gpus_per_model
# Assuming a constant effective batch_size, any change in the per_gpu batch_size
# should come with a rescaling of the local_lr to keep a constant global_lr
# ============

# runs only N training batches [N = integer | null]
# if null then we run through all the batches
limit_batches:
  training: null
  validation: null
  test: 20
  predict: 20

# ============
# Dataloader definitions
# These follow the anemoi-datasets patterns
# You can make these as complicated for merging as you like
# See https://anemoi-datasets.readthedocs.io
# ============

dataset: ${hardware.paths.data}/${hardware.files.dataset}

training:
  dataset: ${dataloader.dataset}
  start: 2020-12-31 00:00:00 #null
  end: 2021-01-20 23:00:00 #2021
  frequency: ${data.frequency}
  drop:  []

validation:
  dataset: ${dataloader.dataset}
  start: 2021-01-21 00:00:00 #2021
  end: 2021-01-24 23:00:00 #2021
  frequency: ${data.frequency}
  drop:  []

test:
  dataset: ${dataloader.dataset}
  start: 2021-01-25 00:00:00 #2021
  end: 2021-02-01 23:00:00 #null
  frequency: ${data.frequency}

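The learning-rate scaling rule quoted in the comments of the dataloader config above can be checked with a small worked example (the numbers below are hypothetical, not taken from my setup):

```python
# global_lr = local_lr * num_gpus_per_node * num_nodes / gpus_per_model
local_lr = 6.25e-5        # hypothetical per-GPU learning rate
num_gpus_per_node = 4
num_nodes = 2
gpus_per_model = 1

global_lr = local_lr * num_gpus_per_node * num_nodes / gpus_per_model
print(global_lr)  # 0.0005
```

Keeping global_lr constant means any change to the per-GPU batch size should come with a proportional rescaling of local_lr.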
  4. Hardware paths configuration:
data: /path-where-the-anemoi-formatted-zarr-is-saved
grids: ???
output: /path-where-the-training-output-is-saved
logs:
  base: ${hardware.paths.output}logs/
  wandb: ${hardware.paths.logs.base}
  mlflow: ${hardware.paths.logs.base}mlflow/
  tensorboard: ${hardware.paths.logs.base}tensorboard/
checkpoints: ${hardware.paths.output}checkpoint/
plots: ${hardware.paths.output}plots/
profiler: ${hardware.paths.output}profiler/
graph: ${hardware.paths.output}graphs/
  5. Configuration file for the anemoi-graphs module:
# Encoder-Processor-Decoder graph
# Note: Resulting graph will only work with a Transformer processor because there are no connections between the hidden nodes.
nodes:
  data:
    node_builder: # how to generate data node
      _target_: anemoi.graphs.nodes.ZarrDatasetNodes
      dataset: anemoi-local-gcp-sample-zarr.zarr
  hidden:
    node_builder: # how to generate hidden node
      _target_: anemoi.graphs.nodes.ZarrDatasetNodes
      dataset: anemoi-local-gcp-sample-zarr.zarr
edges:
  # A) Encoder connections/edges: encodes input data into latent space by connecting data nodes with hidden nodes.
  - source_name: data
    target_name: hidden
    edge_builder:
      _target_: anemoi.graphs.edges.CutOffEdges # method to build edges 
      cutoff_factor: 0.7
  # B) Processor connections/edges: connects hidden nodes with each other
  - source_name: hidden
    target_name: hidden
    edge_builder:
      _target_: anemoi.graphs.edges.KNNEdges # method to build edges via KNN
      num_nearest_neighbours: 3
  # C) Decoder connections/edges: decodes latent space into the output data by connecting hidden nodes with data nodes
  - source_name: hidden
    target_name: data
    edge_builder:
      _target_: anemoi.graphs.edges.KNNEdges  # method to build edges via KNN
      num_nearest_neighbours: 3
  6. Executed anemoi-training train --config-name=config.yaml and obtained the error:

AttributeError: 'NoneType' object has no attribute 'lower', from anemoi/training/data/datamodule.py line 101 in _check_resolution.

Version

0.1.0

Platform (OS and architecture)

Linux

Relevant log output

No response

Accompanying data

No response

Organisation

No response

(cc'ing @mchantry)

@CSyl CSyl added the bug Something isn't working label Oct 1, 2024
@CSyl CSyl changed the title During training, obtaining a resolution of None. During training, obtaining a Resolution of None Against an Anemoi Data Processed ERA5 Zarr Oct 9, 2024
mchantry (Member) commented:
Hi CSyl,
Sorry for the slow reply.
Could you provide access to a small anemoi-dataset-style zarr so we can understand how the resolution has been described in the dataset? Alternatively, share the anemoi-datasets config you used to build the zarr.
Thanks

CSyl commented Nov 5, 2024

Hi @mchantry,
No worries, and thank you for the response. The configuration file mentioned in the steps below was used to transform the zarr data into an anemoi dataset.

Steps Taken when the Zarr was converted to an anemoi dataset:

  1. Save a subset of the zarr from GCP storage locally:
# Generate an ERA5 sample from GCP's GS storage
import xarray as xr
import gcsfs
gs_url = "gs://gcp-public-data-arco-era5/ar/1959-2022-1h-360x181_equiangular_with_poles_conservative.zarr"
chunk_sz = 48
gcp_ar_era5_subset = xr.open_zarr(gs_url, 
                    chunks={'time': chunk_sz},
                    consolidated=True)
start_date = '2020-12-31'
end_date = '2021-02-01'
gcp_ar_era5_subset = gcp_ar_era5_subset.sel(time=slice(start_date, end_date))

# Save ERA5 data subset to local 
gcp_ar_era5_subset.to_zarr('gcp_era5_subset.zarr')
  2. Set the YAML configuration file (recipe.yaml) for converting the zarr to an anemoi dataset:
dates:
  start: 2020-12-31T00:00
  end: 2021-02-01T23:00
  frequency: 6h

input:
  xarray-zarr:
    url: "./gcp_era5_subset.zarr"
    param:
      - 2m_temperature
      - 10m_u_component_of_wind
      - geopotential
      - 10m_v_component_of_wind
      - surface_pressure
  3. Execute anemoi-datasets create recipe.yaml gcp_era5_subset.zarr
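One way to see what resolution (if any) ended up in the created dataset is to read the store's root attributes directly. This is a sketch assuming a zarr v2 directory store, where root attributes live in a `.zattrs` JSON file; whether anemoi-datasets records a `resolution` key there is an assumption to verify against your own store.

```python
import json
from pathlib import Path

def read_zarr_root_attrs(store: str) -> dict:
    """Read root attributes of a zarr v2 directory store without extra dependencies."""
    zattrs = Path(store) / ".zattrs"
    return json.loads(zattrs.read_text()) if zattrs.exists() else {}

attrs = read_zarr_root_attrs("gcp_era5_subset.zarr")  # path from the steps above
print(attrs.get("resolution"))  # None when no named resolution was recorded
```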

HCookie commented Nov 15, 2024

Hi, @CSyl
Thank you for providing your dataset script.
I will take a look next week and attempt to reproduce the error.

HCookie commented Nov 21, 2024

Since the data you are building an anemoi-dataset from is a source we cannot establish a resolution for, it is expected behaviour that the resolution is None.
The field is primarily used for metadata tracking, cataloguing, and inspection.
In #120, we are removing the resolution check: except for the line on which you crash, the value is not used.
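The idea behind that change can be illustrated with a guarded comparison; this is a sketch of the concept, not the actual #120 patch:

```python
from typing import Optional

def check_resolution_safe(config_resolution: Optional[str], dataset_resolution: Optional[str]) -> None:
    # Skip the comparison entirely when either side has no named resolution,
    # since the field is metadata only.
    if config_resolution is None or dataset_resolution is None:
        return
    if config_resolution.lower() != dataset_resolution.lower():
        raise ValueError(
            f"Resolution mismatch: config={config_resolution!r} dataset={dataset_resolution!r}"
        )

check_resolution_safe("o384", None)  # no longer raises on a None resolution
```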
