Write data cube guide #26
In the search for a good visual representation, here are some first ideas: I like the way things are displayed in the R I have another idea that, in my view, is able to explain the sort of data that is held in DataCubes and therefore can show that DataCubes are n-dimensional (here: time, 3 bands and x, y). A first sketch (ignoring the structure on the bottom right): |
Great sketches! That last one suggests that B2, B3, B4 and B8 are distributed over two dimensions, which is not very intuitive IMO, but showing that dimensions can be exchanged makes some sense. I put the R scripts that generated above figures at https://gist.github.com/edzer/5f1b0faa3e93073784e01d5a4bb60eca |
Yeah, I think your first sketch works very well with some more details. Spatial are x and y, z is the bands and could be visualized with different colors (e.g. different shades of the color per pixel, one band red, one band green, one band blue) and then have each timestamp be part of your timeline. Vector cubes in openEO are not really a thing at the moment so we could skip that part for now, but if you have good ideas, feel free to write them down anyway and we can have them in a separate markdown file for now. |
I like that a lot, well done! Could you change the pink color to yellow or so? I find it hard to distinguish from the red above... or change the order of the colors to not have red and pink directly after each other. |
Whether 2020-09-28 has data or not depends on the upsampling method you use. Would it make sense to just remove the empty timestamp as indeed it would likely not be in the resulting data cube (or at least would be there with no-data). I think I'm fine with not giving more details on the resampling, but maybe it's easier to understand if you change the label "resample" to "resample to"? All the images look the same, which may confuse some, but overall I like the image. 👍 |
Yeah, this is the tricky part because I think if it's there with no data, then the current image is exactly right. But if this is dependent on the resampling function I will delete the point for the first date, it's more intuitive.
sure! good idea.
Yes, I will change this. Downsampling method will then be "mean" if that's alright. EDIT: things won't look so different then I'm afraid. Ideas to change that? 2nd EDIT: input is actually already displaying different time steps. Is the difference too subtle at this scale? |
Yeah, I now see that there's a subtle difference, but you need to look very closely to figure it out. Not sure whether that is actually an issue though. I guess we can leave it as it is for now. Changes in times series are often pretty subtle... Other than that, the image looks good to me, thanks! 👍 |
I have some questions about the spatial aggregation processes:
Regarding the previous discussion
Indeed. I think that in most graphics these very subtle changes are ok (as you say, we can always change that later on). They also result from the fact that breaks are set automatically for each raster. In the case where this is important (apply graphics, looking at single pixel values), I manually set breaks (so far only for third graphic). edit @m-mohr |
Not sure. I think not in the next 6 months at least.
Yes, I think that is fine for me.
It is basically the same, just the way it reduces the values is different.
binary uses a reducer (see e.g. the JS reduce operation) which works on two values, which allows reducing of very large lists that would otherwise exceed the memory. The list variant (i.e. non-binary) works on a list directly. So it's mostly a thing to optimize the operation for very large data. |
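To make the difference between the two variants concrete, here is a minimal Python sketch. It is not openEO code: `reduce_list`, `reduce_binary` and `pixel_series` are hypothetical names chosen for illustration, and Python's `functools.reduce` stands in for the JS reduce operation mentioned above.

```python
from functools import reduce
from typing import Callable, Iterable, List


def reduce_list(values: List[float], reducer: Callable[[List[float]], float]) -> float:
    """List variant: the reducer receives all values at once."""
    return reducer(values)


def reduce_binary(values: Iterable[float], reducer: Callable[[float, float], float]) -> float:
    """Binary variant: the reducer only ever sees two values at a time,
    so the input can be streamed instead of being held in memory."""
    return reduce(reducer, values)


# Hypothetical values of one pixel along the temporal dimension.
pixel_series = [0.2, 0.5, 0.1, 0.4]

# List variant: max() works on the whole list directly.
list_result = reduce_list(pixel_series, max)

# Binary variant: the same maximum, built up pairwise from an iterator.
binary_result = reduce_binary(iter(pixel_series), lambda x, y: x if x >= y else y)
```

Both calls give the same maximum here. Note that not every reducer has a simple binary form: a mean, for example, needs a running sum and count rather than a plain pairwise function.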
@jonathom In this thread Open-EO/openeo-processes#215 (comment) we discussed that we should add some guidance that data cube (child) processes should be careful with data type changes. For example, if a reducer gets an array of numbers, it should also return a number and not e.g. a string or an array. Could you add that somewhere in the general data cube descriptions, please? cc for review: @soxofaan
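A rough sketch of what that guidance means in practice, in plain Python rather than openEO process definitions. All names here (`good_reducer`, `bad_reducer`, `reduce_checked`) are made up for illustration, and the type check is only an assumption about how a back-end or client might guard against the problem.

```python
from numbers import Number
from statistics import mean
from typing import Callable, List


def good_reducer(values: List[float]) -> float:
    """Numbers in, number out: the data type of the cube is preserved."""
    return mean(values)


def bad_reducer(values: List[float]) -> str:
    """Numbers in, string out: later numeric processes would break on this."""
    return f"{mean(values):.2f}"


def reduce_checked(values: List[float], reducer: Callable) -> float:
    """Apply a reducer and reject results that are not plain numbers."""
    result = reducer(values)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(result, bool) or not isinstance(result, Number):
        raise TypeError(
            f"reducer returned {type(result).__name__}, expected a number"
        )
    return result


# Hypothetical values of one pixel along the dimension being reduced.
pixel_values = [1.0, 2.0, 3.0]
checked = reduce_checked(pixel_values, good_reducer)
```

Passing `bad_reducer` to `reduce_checked` raises a `TypeError`, which is the kind of early failure that makes such a data type change visible instead of letting it propagate through the process graph.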
@m-mohr I'm not entirely sure if I understand what's going on, so let's discuss it in the next meeting. First thought: Maybe this is something for the cookbook (#16), since it is much more "how to do" than "how does it work"? Also, the cookbook could then have a whole first section dedicated to "how to work with datacubes" as a further reference after the datacube guide (not only because of this, just generally).
Nice diagrams! Some feedback/ideas:
Thank you for the feedback @soxofaan! The datacube guide with much more graphics is already online and a version with some of your corrections (type, title change) can be seen here. I'd be happy if you want to have a look and leave some more feedback! Regarding two points from above:
these online docs look very pretty, nice improvement! |
@jonathom We also forgot to remove the Data Cube description from the glossary: https://openeo.org/documentation/1.0/glossary.html Another thing we should talk about in the "Dimensions" section is that the dimensions can have special characteristics, e.g. spatial and temporal dimensions are expected to have a natural order, temporal dimensions use the Gregorian calendar by default, ...
done, collecting these fixes in branch "dcguide". I added the old glossary datacube md in datacubes/.scripts for later reference. additional note to myself: also forgot to talk about crs as dimension, as in old glossary |
I don't think this is required, we have version control for this. Let's discuss later |
This is all done, right @jonathom ? Feel free to close then. |
Really nice guide! I've just seen it and it will be super useful for many others. |
It became obvious several times in openEO history that people are often not aware of how data cubes and their methods (reduce, apply, ...) work. So I was thinking that a guide on how to work with data cubes, step by step with examples, would help with understanding.
Discussions in Open-EO/openeo-processes#215 (comment) have shown that the document should say that it's usually not a good idea to change data types in apply/reduce/... and should probably also list other pitfalls and potential limitations.