Initialize `zt` before calling `proj_trans` #104

lnicola · 2022-02-04T13:13:08Z

CC #99

r? @michaelkirk

michaelkirk · 2022-02-04T16:41:23Z

bors try

bors · 2022-02-04T16:50:46Z

try

Build succeeded:

ci result

lnicola · 2022-02-04T17:06:36Z

To expand on the "not strictly correct" remark: in C++ (unlike C) you're not allowed to read a different union member than the one you've written to last.

But proj_trans calls pj_get_suggested_operation which reads xyzt, then pj_fwd4d -> fwd_prepare, which reads v. Finally, it calls through a function pointer which reads either xyzt (I suppose), xyz or xy, depending on what's available in PJ.

So we can only hope for the best here.

lnicola · 2022-02-04T17:07:53Z

bors try

bors · 2022-02-04T17:17:53Z

try

Build succeeded:

ci result

michaelkirk · 2022-02-04T19:14:13Z

I'm not sure if you saw, but I had a PR that similarly juggled with the union type and produced passing tests. But as I didn't understand how it fixed things, I figured I just fixed the tests, not the code... that the bug was just as likely to emerge for other peoples use cases.

Your c++ union observation is interesting, but it's still not clear to me how it fixes anything.

lnicola · 2022-02-04T19:30:32Z

I think that PR is orthogonal to this one. This one is not fixing UB caused by reading the wrong union field.

This PR initializes z and t before passing them to proj. I don't know if that's happening, but I can imagine how passing NaN as the time might result in NaN as the output. See https://github.com/OSGeo/PROJ/blob/8.1/src/fwd.cpp#L195-L200.

frewsxcv · 2022-02-06T16:24:22Z

src/proj.rs

+            lam: c_x,
+            phi: c_y,
+            z: 0.0,
+            t: f64::INFINITY,


Curious, why did you choose infinity here?

It's what libproj does in proj_transform_generic (or another function with a similar name).

PROJ dates back to before-or-about-the-same-time as IEEE-754 arithmetic, and hence before the NaN concept was common. So all the way from Jerry Evenden's earliest PROJ work around 1980, HUGE_VAL has been used as the exceptional value in PROJ. Often HUGE_VAL is defined as infinity for C stdlibs running on a IEEE-754 platform, so it makes much sense here.

There was a recent PROJ discussion (which I cannot find atm), focusing on the reasonable idea of changing from HUGE_VAL to NaN. That would, however, be a breaking change. In Rust Geodesy I try to use NaN, when None or Err is not a better option.

michaelkirk · 2022-02-10T01:46:02Z

Sorry to be slow to respond to this...

This PR initializes z and t before passing them to proj. I don't know if that's happening, but I can imagine how passing NaN as the time might result in NaN as the output. See https://github.com/OSGeo/PROJ/blob/8.1/src/fwd.cpp#L195-L200.

I'm not sure what's significant about the code you linked to.

if (P->fwd)
        coo.xy = P->fwd(coo.lp, P);
    else if (P->fwd3d)
        coo.xyz = P->fwd3d (coo.lpz, P);
    else if (P->fwd4d)
        coo = P->fwd4d (coo, P);

Presumaby we're hitting this branch:

 if (P->fwd)
        coo.xy = P->fwd(coo.lp, P);

You think that somewhere in that branch proj is reading something other than the first two floats in coo?

lnicola · 2022-02-10T05:10:08Z

I don't know how to check, but the first result from projinfo has a Helmert step with a non-zero +z parameter, so it might be at least 3D.

lnicola · 2022-02-10T05:59:31Z

So:

$ cs2cs EPSG:4326 EPSG:3309
34.095620 -118.283555 0
158458.67	-434296.88 0.00
34.095620 -118.283555 1000
158458.66	-434296.88 1000.00
34.095620 -118.283555 2000 
158458.65	-434296.88 2000.00

It's small, but notice how the output x depends on z. If z was NaN, it might propagate over to it.

busstoptaktik · 2022-02-10T17:39:26Z

In general, you should always expect geodetic operations to be four dimensional - the exception being when going from geographical (or perhaps projected) coordinates on the ellipsoid, to (other) projected coordinates on the same ellipsoid and in the same datum: In these cases, most 2D systems are more or less equivalent: From a geodetic point of view, it doesn't matter whether you have geographical or UTM coordinates: As long as they are referred to the same datum/reference frame, you can always convert one to the other at arbitrary precision, since you are doing entirely mathematical operations in an ideal mathematical world.

Once you need to convert between different reference frames, geophysics and the irregularities of the real world enters and you need to acknowledge that the world is fundamentally 4 dimensional. When you work in two dimensions, you are limited to either stay in one reference frame, or accept a probable overall accuracy at the metre level (which of course is quite satisfactory for many - or most- real world applications)

lnicola · 2022-02-10T17:45:29Z

Thanks for chiming in, @busstoptaktik.

In this case (our failing test converts EPSG:4326 to EPSG:3309), the two seem to use different ellipsoids, and even with my black-box knowledge of libproj, it seems bad practice to pass uninitialized variables without a guarantee that they are not going to be used.

lnicola · 2022-02-10T17:55:40Z

@michaelkirk how do you feel about merging this? We could try to rebase #99 on top of it to see if it helped.

michaelkirk · 2022-02-10T18:25:35Z

Thanks for your investigation @lnicola and thanks to @busstoptaktik also for adding some context!

One of the things I was caught up on was why our code would be broken, when the equivalent code from c presumably isn't — why don't people writing C programs have to jump through the hoops in this PR?

And I think the answer is that I don't know very much about C.

From (from https://en.cppreference.com/w/c/language/struct_initialization)

All members that are not initialized explicitly are zero-initialized

I don't think rust makes any such guarantees - the z/t dimensions could be whatever garbage was left in them.

Based on this... I wonder if we should be initializing t to zero:

- t: f64::INFINITY,
+ t: 0.0

...to be in line with what proj expects to gets from a c user of the "two dimensional" api. WDYT?

We could try to rebase #99 on top of it to see if it helped.

Good idea - I've just merged your branch into #99 here: d1eaaee

Let's make sure that extended CI passes, then I'm fully on board with merging this!

lnicola · 2022-02-10T18:44:49Z

Based on this... I wonder if we should be initializing t to zero:

I think it's safe to do what libproj does, i.e. use INFINITY: https://github.com/OSGeo/PROJ/blob/8.1/src/4D_api.cpp#L492-L510.

michaelkirk · 2022-02-10T18:57:11Z

Ok, let's do it then!

bors r+

Thanks again!

bors · 2022-02-10T19:07:42Z

Build succeeded:

ci result

busstoptaktik · 2022-02-10T19:42:41Z

@michaelkirk said:

I wonder if we should be initializing t to zero

zero would indicate that the valid time of the coordinate is at the current epoch, i.e. approx 2000 years ago. Since that is probably not the case for any majority of digital geographic resources, it may be better to select a more current ballpark estimate - say, 2000.0, if you do not have any better estimate?

At an average geological plate velocity of 3 cm/y, the 2000 year evolution would translate into a 60 m dscrepancy, whereas the 20-something years of selecting 2000 as ballpark estimate would only translate into 60 cm.

lnicola · 2022-02-10T20:23:31Z

Oh, is time really used like that?

I assumed it was used only to select an appropriate transform when the definitions changed in the past. And I thought using infinity made sense, in order to pick the most recent definitions available.

michaelkirk · 2022-02-11T04:38:34Z

I appreciate your bringing up how important the time dimension is @busstoptaktik. We should add support for that to georust/proj. But I'd like to make sure we're on the same page here with this most recent change...

The problem that this PR aimed to address was wildly inconsistent results from georust/proj. e.g. see here https://github.com/georust/proj/runs/5013324366?check_suite_focus=true where tests sometimes fail (and sometimes don't).

This repository only supports 2-D points. It's possible to use libproj with 2-D points and get consistent results (though it clearly can't account for plate shifting with only 2-D points). Right?

But the errors we were seeing aren't even at the level of geological plate shifting — it's about us making the wrong assumptions about how rust interacts with C.

Thanks to @lnicola, we now have a theory and a merged (presumed) fix for the gap in behavior between what a user of georust/proj would see vs a direct consumer of libproj.

I'd like to ship out this fix as soon as possible, so existing users can get the fix and so that we can move on to other improvements (like who knows, maybe 4D points!)

But first I want to hammer down the lingering doubts raised in this thread:

zero would indicate that the valid time of the coordinate is at the current epoch, i.e. approx 2000 years ago.

Is this really the case for 2-d users of libproj (e.g. someone using PJ_COORD { xy: }. libproj assumes they want 2000 year old data? That would be such a sad choice for the very large group of your users that only use 2-d points, who almost surely want something closer to today than to something 2000 years ago.

In any case, my feeling is that as "a wrapper for libproj", georust/proj is obliged to offer the same assumptions that libproj offers. It'd be a surprising bug if people are getting different results from libproj than they are from georust/proj.

(I don't think that a new library like https://github.com/busstoptaktik/geodesy should be doomed to carry forward any historical baggage like that though)

Based on this... I wonder if we should be initializing t to zero:

I think it's safe to do what libproj does, i.e. use INFINITY:

If I'm reading this correctly, when encountering z: INFINITY or t: INFINITY within proj_trans, it get's normalized to 0 anyway. 🙃

https://github.com/OSGeo/PROJ/blob/master/src/fwd.cpp#L49

    if (HUGE_VAL==coo.v[2] && P->helmert) coo.v[2] = 0.0;
    if (HUGE_VAL==coo.v[3] && P->helmert) coo.v[3] = 0.0;

Sooooo, anyone opposed to releasing to crates.io?

busstoptaktik · 2022-02-11T08:07:08Z

Is this really the case for 2-d users of libproj (e.g. someone using PJ_COORD { xy: }. libproj assumes they want 2000 year old data?

No, PROJ does not assume anything, but if you call a dynamic transformation with garbage data, you will get garbage answers. The main problem is the widespread use of EPSG:4326 as "don't care" value: Since WGS84 is a global datum, any transformation into plate-fixed coordinate systems (like ETRS89 in Europe, NAD83 in North America), will be through a dynamic transformation.

As long as you stay inside the same datum, and only change from (say) geographical coordinates to (say) UTM, there is no problem. The same is the case for transforming between different generations of plate fixed reference frames (e.g. ED50 to ETRS89): No problem at all, since the transformation is not time dependent.

So in conclusion: This PR solves a real problem, and should obviously be pushed, but it does not solve all potential problems: You can only do that by having a strict 4D processing pipeline, and I do not think this will be useful at all for most of the use cases of GeoRust.

Initialize ZT before calling proj_trans

beed043

bors bot added a commit that referenced this pull request Feb 4, 2022

Try #104:

c1375b8

Initialize more union members

ad404a0

bors bot added a commit that referenced this pull request Feb 4, 2022

Try #104:

f68ff9a

frewsxcv approved these changes Feb 6, 2022

View reviewed changes

bors bot merged commit dea916e into master Feb 10, 2022

lnicola deleted the proj_trans-init-input branch February 10, 2022 20:20

This was referenced Feb 11, 2022

upgrade to proj 8.2 #97

Closed

some failing tests #99

Closed

michaelkirk mentioned this pull request Mar 9, 2022

update bundled proj to 9.0.0 #96

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Initialize `zt` before calling `proj_trans` #104

Initialize `zt` before calling `proj_trans` #104

lnicola commented Feb 4, 2022

michaelkirk commented Feb 4, 2022

bors bot commented Feb 4, 2022

lnicola commented Feb 4, 2022

lnicola commented Feb 4, 2022

bors bot commented Feb 4, 2022

michaelkirk commented Feb 4, 2022

lnicola commented Feb 4, 2022 •

edited

Loading

frewsxcv Feb 6, 2022

lnicola Feb 6, 2022

busstoptaktik Feb 10, 2022

michaelkirk commented Feb 10, 2022

lnicola commented Feb 10, 2022

lnicola commented Feb 10, 2022

busstoptaktik commented Feb 10, 2022

lnicola commented Feb 10, 2022

lnicola commented Feb 10, 2022

michaelkirk commented Feb 10, 2022

lnicola commented Feb 10, 2022 •

edited

Loading

michaelkirk commented Feb 10, 2022

bors bot commented Feb 10, 2022

busstoptaktik commented Feb 10, 2022 •

edited

Loading

lnicola commented Feb 10, 2022

michaelkirk commented Feb 11, 2022

busstoptaktik commented Feb 11, 2022

Initialize zt before calling proj_trans #104

Initialize zt before calling proj_trans #104

Conversation

lnicola commented Feb 4, 2022

michaelkirk commented Feb 4, 2022

bors bot commented Feb 4, 2022

try

lnicola commented Feb 4, 2022

lnicola commented Feb 4, 2022

bors bot commented Feb 4, 2022

try

michaelkirk commented Feb 4, 2022

lnicola commented Feb 4, 2022 • edited Loading

frewsxcv Feb 6, 2022

Choose a reason for hiding this comment

lnicola Feb 6, 2022

Choose a reason for hiding this comment

busstoptaktik Feb 10, 2022

Choose a reason for hiding this comment

michaelkirk commented Feb 10, 2022

lnicola commented Feb 10, 2022

lnicola commented Feb 10, 2022

busstoptaktik commented Feb 10, 2022

lnicola commented Feb 10, 2022

lnicola commented Feb 10, 2022

michaelkirk commented Feb 10, 2022

lnicola commented Feb 10, 2022 • edited Loading

michaelkirk commented Feb 10, 2022

bors bot commented Feb 10, 2022

busstoptaktik commented Feb 10, 2022 • edited Loading

lnicola commented Feb 10, 2022

michaelkirk commented Feb 11, 2022

busstoptaktik commented Feb 11, 2022

Initialize `zt` before calling `proj_trans` #104

Initialize `zt` before calling `proj_trans` #104

lnicola commented Feb 4, 2022 •

edited

Loading

lnicola commented Feb 10, 2022 •

edited

Loading

busstoptaktik commented Feb 10, 2022 •

edited

Loading