PROJ_DATA env var should take precedence over installation data #1448

Closed
kidanger opened this issue Oct 5, 2024 · 3 comments
Labels
installation-issues Issue related to installation problems. proposal Idea for a new feature. question


kidanger commented Oct 5, 2024

Hello,

Currently, setting the environment variable PROJ_DATA has no effect on pyproj when the installation of pyproj brings its own data. I think it would be good to lower the priority of the internal data, and let users override the proj data with the environment variable in more cases.

Example: (from a fresh virtual env, python 3.12)

$ pip install pyproj
...
Successfully installed certifi-2024.8.30 pyproj-3.7.0
$ # create a custom proj data dir, here just a copy of the default one
$ cp -r .venv/lib/python3.12/site-packages/pyproj/proj_dir/share/proj test/

$ # without the env var, pyproj finds its own data directory
$ pyproj -v
pyproj info:
    pyproj: 3.7.0
PROJ (runtime): 9.4.1
PROJ (compiled): 9.4.1
  data dir: /tmp/t/.venv/lib/python3.12/site-packages/pyproj/proj_dir/share/proj
...

$ # even with the env var, it uses its own directory
$ PROJ_DATA=test/ pyproj -v
...
  data dir: /tmp/t/.venv/lib/python3.12/site-packages/pyproj/proj_dir/share/proj
...

$ # remove the internal dir manually, now it works
$ rm -fr .venv/lib/python3.12/site-packages/pyproj/proj_dir/share/proj
$ PROJ_DATA=test/ pyproj -v
...
  data dir: test/
...

(related discussion: NixOS/nixpkgs#282139)

@kidanger kidanger added the proposal Idea for a new feature. label Oct 5, 2024
Member

snowman2 commented Oct 5, 2024

This is by design: it prevents pyproj from picking up, via PROJ_DATA, the data directory of a different, incompatible PROJ installation. The PROJ database must be the one provided for that specific PROJ version and should not be interchanged.

If you have a separate PROJ installation and want pyproj to use it, install pyproj from source instead of from a wheel.

https://pyproj4.github.io/pyproj/stable/api/datadir.html
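
For readers who do not follow the link, the lookup order documented on that page can be modelled roughly as below. This is only an illustrative sketch, not pyproj's actual code: `resolve_data_dir` and `is_valid` are hypothetical names, validity is assumed to mean "the directory contains proj.db", and the later fallbacks (sys.prefix and PATH) are left out.

```python
from pathlib import Path
from typing import Optional


def is_valid(data_dir: Optional[str]) -> bool:
    """Assumed validity check: the directory exists and contains proj.db."""
    return data_dir is not None and Path(data_dir, "proj.db").exists()


def resolve_data_dir(
    explicit: Optional[str],  # directory passed to pyproj.datadir.set_data_dir
    internal: Optional[str],  # data directory shipped inside the wheel
    env: Optional[str],       # value of the PROJ_DATA environment variable
) -> Optional[str]:
    """Return the first valid candidate, in the documented order of preference."""
    for candidate in (explicit, internal, env):
        if is_valid(candidate):
            return candidate
    return None
```

Under this model, PROJ_DATA is only consulted when no valid internal directory exists, which matches the behaviour reported above: the variable took effect only after the wheel's data directory was deleted.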

@snowman2 snowman2 added question installation-issues Issue related to installation problems. labels Oct 5, 2024
Author

kidanger commented Oct 5, 2024

Thank you for the fast answer.

Then I'm not sure why pyproj.datadir.set_data_dir would have precedence over pyproj internal data but PROJ_DATA doesn't, though I don't know all the details of pyproj and PROJ. Maybe this is not what PROJ_DATA is meant for. My use case is to bundle specific datum grids when distributing software, to avoid network downloads and reliance on user folders.

Feel free to close the issue if the behavior is intended.

Member

snowman2 commented Oct 5, 2024

> I'm not sure why pyproj.datadir.set_data_dir would have precedence over pyproj internal data but PROJ_DATA doesn't

set_data_dir exists so that the data directory can be set when it cannot be found automatically. The directory it sets is guaranteed to be meant for that specific instance of pyproj and not for another installation of PROJ.

With multiple installations of PROJ on a single machine, PROJ_DATA could potentially point to an incorrect directory that shouldn't be used by pyproj.
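
For the bundling use case mentioned earlier in the thread, one possible approach is to call set_data_dir explicitly at application startup. The helper below is a minimal sketch under stated assumptions: `configure_bundled_proj_data` is a hypothetical name, the bundled directory is assumed to be a complete PROJ data directory (proj.db plus the required grids) matching the PROJ version pyproj was built against, and the setter is passed in as an argument so the helper can be exercised without pyproj installed.

```python
from pathlib import Path
from typing import Callable


def configure_bundled_proj_data(
    bundled_dir: Path,
    set_data_dir: Callable[[str], None],
) -> Path:
    """Point pyproj at a data directory shipped with the application.

    `set_data_dir` is injected by the caller; in a real application it
    would be `pyproj.datadir.set_data_dir`.
    """
    bundled_dir = Path(bundled_dir).resolve()
    # Fail early if the directory is clearly not a PROJ data directory.
    if not (bundled_dir / "proj.db").is_file():
        raise FileNotFoundError(f"proj.db not found in {bundled_dir}")
    set_data_dir(str(bundled_dir))
    return bundled_dir
```

In an application this would presumably be called once, before any CRS or Transformer objects are created, for example `configure_bundled_proj_data(Path(__file__).parent / "proj_data", pyproj.datadir.set_data_dir)`, where the `proj_data` directory name is made up for illustration.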
