Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More robust lon/lat initialization for imagestack #47

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

Lmoesinger
Copy link

@Lmoesinger Lmoesinger commented Sep 3, 2018

The current lon/lat initialization of ImageStack can break if the grid has an unusual row/col arrangement or an unusual shape. E.g. if you create an imagestack with this grid all grid points will have the same lat and lon:

import pygeogrids.grids as grids
from pynetcf.image import ImageStack
import pandas as pd

grid = grids.genreg_grid(0.25, 0.25)
filename = '/path/to/file/filename.nc'
times = pd.date_range('2000', '2001', freq='M').to_pydatetime()
ds = ImageStack(filename, grid=grid, times=times, mode='w')
print ds.dataset.variables['lon'][:]
# all lons are -179.875

The proposed change uses np.unique() to get the lat/lon values, which works independent of the grids orientation.
I ran the tests with the changes and broke none of them, but it still might now screw up some other grid.

The previous initialization of lon/lat could break if the grid had an unusually row/col arrangements or unusual shape. np.unique() is a bit slower but more robust. The sort() is probably redundant, but just wanted to be on the save side.
removed redundant sort
@cpaulik
Copy link
Collaborator

cpaulik commented Sep 3, 2018

Just some general thoughts:

The whole image part of pynetcf is used very little and not very well developed. I'm not sure if we should even allow the link of pygeogrids.BasicGrid and an ImageStack since a BasicGrid object can be any discrete grid. The lat2d and lon2d attributes do exist but I think we should probably make a separate class for grids with these limitations.

@sebhahn should decide this PR and the future direction of the image part depending on the needs of TUW.

@Lmoesinger
Copy link
Author

Lmoesinger commented Sep 4, 2018

@sebhahn as discussed, the np.unique() calls are now moved to the 'ulat' and 'ulon' variables from pygeogrids.basegrid. This push now requires TUW-GEO/pygeogrids#56 to pass first.

Also added lon/lat check (if lon/lat is broken, it will read the wrong gpis)

@cpaulik
Copy link
Collaborator

cpaulik commented Sep 4, 2018

I think this could cause some issues.

Before you where using the lon2d and lat2d attributes which where only set if the grid had a 2D shape. So there was some guarantee that the grid is intended for 2D image use.

Using ulat and ulon which are defined for any grid could easily give a netcdf file which is not correct and where the data that is on the grid is sorted differently from the dimensions that are created.

@sebhahn
Copy link
Member

sebhahn commented Sep 4, 2018

The way I see it is that both approaches are not 100% fail safe. The question is if imagestack is for 2d grids only or can be extended to non-2d grids?

@cpaulik
Copy link
Collaborator

cpaulik commented Sep 5, 2018

There are several options for coordinate systems that we could support (http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#coordinate-system) . A non-2d grid ImageStack would be best represented as an Orthogonal multidimensional array representation with different chunking. For that we already have ArrayStack

@sebhahn
Copy link
Member

sebhahn commented Sep 5, 2018

I can remember we had this discussion some time ago and I wasn't too happy about the naming (ArrayStack and ImageStack). Coming back to this two classes it still confuses me. There seems to me not 1-to-1 relationship to CF coordinate systems, but rather an abstract layer to combine the reader with a grid, right? @cpaulik could you please shortly summarize the distinct differences and use cases of ArrayStack and ImageStack? I have to admit that I haven't been using them a lot, because I'm not an Image user. I hope clarifying the meaning helps to understand if we need two classes here or just a general one.

@cpaulik
Copy link
Collaborator

cpaulik commented Sep 5, 2018

The goal of this package is to write CF conform netCDF files. I think we should completely rewrite the Image part of the package to conform to CF naming and coordinate conventions. As we did with the time series.

The ArrayStack and ImageStack are classes I wrote without thinking too much and I have not used them because they do not really work nicely. The Idea behind ArrayStack was to have a Stack of 1D arrays. Basically Ortho multi but with different chunking to favor image access. The ImageStack was for 2D arrays or images as I called them. I have also never really used this which is the reason why it is so poorly maintained.

What is the current use case for the ImageStack that triggered this pull request? That might help in the decision of how to move forward.

@sebhahn
Copy link
Member

sebhahn commented Sep 5, 2018

@Lmoesinger could you please elaborate on your use case?

@Lmoesinger
Copy link
Author

Lmoesinger commented Sep 6, 2018

I´m using it to convert time series (stored as GriddedNcContiguousRaggedTs, divided into multiple files by a cellgrid) to to a stack of 2d matrices by iterating over each gpi. All time series have the same time indexes. I know the dates beforehand as they are regularly spaced between a known start- and enddate. I`m only writing one time series per GPI.

Basically it is very similar to Ts2Img from https://github.com/TUW-GEO/repurpose/blob/master/repurpose/ts2img.py .

Also note that it wasn't my goal to make this working with grids that don't have a shape - rather my goal was to make it work with grids that have an unusual shape.

@cpaulik
Copy link
Collaborator

cpaulik commented Sep 6, 2018

I think we can discuss this tomorrow before lunch @sebhahn . @Lmoesinger if you also work at the TU then we might also meet tomorrow.

@Lmoesinger
Copy link
Author

Lmoesinger commented Sep 6, 2018

Sure, at what time and where?
It might also be an opportunity to discuss TUW-GEO/pytesmo#144

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants