Hello, thank you for your excellent work.
I ran into some errors while running the code.
This is the environment I installed:
`name: nerfmatch
channels:
pytorch
nvidia
defaults
dependencies:
_libgcc_mutex=0.1=main
_openmp_mutex=5.1=1_gnu
blas=1.0=mkl
brotli-python=1.0.9=py39h6a678d5_8
bzip2=1.0.8=h5eee18b_6
ca-certificates=2024.9.24=h06a4308_0
certifi=2024.8.30=py39h06a4308_0
charset-normalizer=3.3.2=pyhd3eb1b0_0
cuda-cudart=11.7.99=0
cuda-cupti=11.7.101=0
cuda-libraries=11.7.1=0
cuda-nvrtc=11.7.99=0
cuda-nvtx=11.7.91=0
cuda-runtime=11.7.1=0
cuda-version=12.6=3
ffmpeg=4.3=hf484d3e_0
filelock=3.13.1=py39h06a4308_0
freetype=2.12.1=h4a9f257_0
gmp=6.2.1=h295c915_3
gmpy2=2.1.2=py39heeb90bb_0
gnutls=3.6.15=he1e5248_0
idna=3.7=py39h06a4308_0
intel-openmp=2023.1.0=hdb19cb5_46306
jinja2=3.1.4=py39h06a4308_0
jpeg=9e=h5eee18b_3
lame=3.100=h7b6447c_0
lcms2=2.12=h3be6417_0
ld_impl_linux-64=2.40=h12ee557_0
lerc=3.0=h295c915_0
libcublas=11.10.3.66=0
libcufft=10.7.2.124=h4fbf590_0
libcufile=1.11.1.6=0
libcurand=10.3.7.77=0
libcusolver=11.4.0.1=0
libcusparse=11.7.4.91=0
libdeflate=1.17=h5eee18b_1
libffi=3.4.4=h6a678d5_1
libgcc-ng=11.2.0=h1234567_1
libgomp=11.2.0=h1234567_1
libiconv=1.16=h5eee18b_3
libidn2=2.3.4=h5eee18b_0
libnpp=11.7.4.75=0
libnvjpeg=11.8.0.2=0
libpng=1.6.39=h5eee18b_0
libstdcxx-ng=11.2.0=h1234567_1
libtasn1=4.19.0=h5eee18b_0
libtiff=4.5.1=h6a678d5_0
libunistring=0.9.10=h27cfd23_0
libwebp-base=1.3.2=h5eee18b_1
lz4-c=1.9.4=h6a678d5_1
markupsafe=2.1.3=py39h5eee18b_0
mkl=2023.1.0=h213fc3f_46344
mkl-service=2.4.0=py39h5eee18b_1
mkl_fft=1.3.10=py39h5eee18b_0
mkl_random=1.2.7=py39h1128e8f_0
mpc=1.1.0=h10f8cd9_1
mpfr=4.0.2=hb69a4c5_1
mpmath=1.3.0=py39h06a4308_0
ncurses=6.4=h6a678d5_0
nettle=3.7.3=hbbd107a_1
networkx=3.2.1=py39h06a4308_0
openh264=2.1.1=h4ff587b_0
openjpeg=2.5.2=he7f1fd0_0
openssl=3.0.15=h5eee18b_0
pillow=10.4.0=py39h5eee18b_0
pysocks=1.7.1=py39h06a4308_0
python=3.9.20=he870216_1
pytorch=2.0.1=py3.9_cuda11.7_cudnn8.5.0_0
pytorch-cuda=11.7=h778d358_5
pytorch-mutex=1.0=cuda
readline=8.2=h5eee18b_0
requests=2.32.3=py39h06a4308_0
sqlite=3.45.3=h5eee18b_0
sympy=1.13.2=py39h06a4308_0
tbb=2021.8.0=hdb19cb5_0
tk=8.6.14=h39e8969_0
torchaudio=2.0.2=py39_cu117
torchtriton=2.0.0=py39
torchvision=0.15.2=py39_cu117
typing_extensions=4.11.0=py39h06a4308_0
tzdata=2024b=h04d1e81_0
urllib3=2.2.3=py39h06a4308_0
wheel=0.44.0=py39h06a4308_0
xz=5.4.6=h5eee18b_1
zlib=1.2.13=h5eee18b_1
zstd=1.5.6=hc292b87_0
pip:
absl-py==2.1.0
aiohappyeyeballs==2.4.3
aiohttp==3.10.10
aiosignal==1.3.1
anyio==4.6.2.post1
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
async-timeout==4.0.3
attrs==24.2.0
babel==2.16.0
beautifulsoup4==4.12.3
bleach==6.1.0
cffi==1.17.1
comm==0.2.2
contourpy==1.3.0
cycler==0.12.1
debugpy==1.8.7
decorator==5.1.1
defusedxml==0.7.1
einops==0.8.0
exceptiongroup==1.2.2
executing==2.1.0
fastjsonschema==2.20.0
fonttools==4.54.1
fqdn==1.5.1
frozenlist==1.5.0
fsspec==2024.10.0
future==1.0.0
grpcio==1.67.0
h11==0.14.0
h5py==3.12.1
httpcore==1.0.6
httpx==0.27.2
huggingface-hub==0.26.1
imageio==2.36.0
imgviz==1.7.5
importlib-metadata==8.5.0
importlib-resources==6.4.5
ipykernel==6.29.5
ipython==8.18.1
ipywidgets==8.1.5
isoduration==20.11.0
jedi==0.19.1
joblib==1.4.2
json5==0.9.25
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
jupyter==1.1.1
jupyter-client==8.6.3
jupyter-console==6.6.3
jupyter-core==5.7.2
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter-server==2.14.2
jupyter-server-terminals==0.5.3
jupyterlab==4.2.5
jupyterlab-pygments==0.3.0
jupyterlab-server==2.27.3
jupyterlab-widgets==3.0.13
kiwisolver==1.4.7
kornia==0.7.3
kornia-rs==0.1.5
lazy-loader==0.4
lightning-utilities==0.11.8
loguru==0.7.2
markdown==3.7
markdown-it-py==3.0.0
matplotlib==3.9.2
matplotlib-inline==0.1.7
mdurl==0.1.2
mistune==3.0.2
multidict==6.1.0
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nerfacc==0.5.3
nest-asyncio==1.6.0
notebook==7.2.2
notebook-shim==0.2.4
numpy==1.24.0
opencv-contrib-python==4.10.0.84
opencv-python==4.10.0.84
overrides==7.7.0
packaging==24.1
pandocfilters==1.5.1
parso==0.8.4
pexpect==4.9.0
pip==23.2.1
platformdirs==4.3.6
prometheus-client==0.21.0
prompt-toolkit==3.0.48
propcache==0.2.0
protobuf==5.28.3
psutil==6.1.0
ptyprocess==0.7.0
pure-eval==0.2.3
pycolmap==0.4.0
pycparser==2.22
pydeprecate==0.3.1
pygments==2.18.0
pyparsing==3.2.0
python-dateutil==2.9.0.post0
python-json-logger==2.0.7
pytorch-lightning==1.5.10
pyyaml==6.0.2
pyzmq==26.2.0
referencing==0.35.1
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.9.3
rpds-py==0.20.0
safetensors==0.4.5
scikit-image==0.24.0
scipy==1.13.1
send2trash==1.8.3
setuptools==59.5.0
six==1.16.0
sniffio==1.3.1
soupsieve==2.6
stack-data==0.6.3
tensorboard==2.18.0
tensorboard-data-server==0.7.2
terminado==0.18.1
tifffile==2024.8.30
timm==1.0.11
tinycss2==1.4.0
tomli==2.0.2
torchmetrics==1.5.1
tornado==6.4.1
tqdm==4.66.5
traitlets==5.14.3
transforms3d==0.4.2
types-python-dateutil==2.9.0.20241003
uri-template==1.3.0
wcwidth==0.2.13
webcolors==24.8.0
webencodings==0.5.1
websocket-client==1.8.0
werkzeug==3.0.6
widgetsnbextension==4.0.13
yacs==0.1.8
yarl==1.16.0
zipp==3.20.2
prefix: /data/users/yuxuanhan/anaconda3/envs/nerfmatch
`
No. 1 NeRF training (optional): I downloaded the pretrained NeRF models instead.
No. 2 Cache NeRF features (Cambridge):
python -m model_eval.eval_nerf --cache_scene_pts --split 'train_test' \
--downsample 8 --img_wh 480 480 --stop_layer 3 \
--ckpt 'pretrained/nerf/cambridge/mip_app/#scene_last.ckpt' \
--scene_anno_path 'data/annotations/cambridge_jsons/transforms_#scene_#split.json' \
--cache_dir 'outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep' \
--dataset 'cambridge'
No. 3 NeRFMatch training:
torchrun --nproc_per_node=8 model_train/train_nerfmatch_c2f.py \
--config configs/nerfmatch/nerfmatch_cambridge_c2f.yaml \
--backbone 'convformer384' --temp_type 'mul' --batch_size 2 \
--max_epochs 50 --clr 0.0004 --cbs 16 --pair_topk 20 --aug_self_pairs 10 \
--scene_dir 'outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep/ds8lin' \
--resume_version 'mip_app_inter3_last' --update_conf \
--prefix 'eccv/repr' --scenes 'ShopFacade'
The only change I made was the number of GPUs, from 8 to 4: --nproc_per_node=8 became --nproc_per_node=4.
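As a side note on the GPU change, here is a minimal sketch of how the global batch size scales with the number of processes. It assumes that `--batch_size` is a per-process value, as is usual with distributed data-parallel training; the repository may handle this differently. The helper name `global_batch_size` is hypothetical.

```python
def global_batch_size(nproc_per_node: int, per_process_batch: int, nnodes: int = 1) -> int:
    """Samples consumed per optimizer step across all processes.

    Assumption: every process loads `per_process_batch` samples per step.
    """
    return nnodes * nproc_per_node * per_process_batch


if __name__ == "__main__":
    # Command as given in the README (8 GPUs) versus the modified run (4 GPUs).
    print(global_batch_size(8, 2))
    print(global_batch_size(4, 2))
```

Under that assumption, halving the GPU count halves the global batch size, which may affect the learning-rate schedule but should not cause a startup failure on its own.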
Here are the errors I got:
(nerfmatch) y@hello-PowerEdge-T640:~/nerfmatch$ torchrun --nproc_per_node=4 model_train/train_nerfmatch_c2f.py \
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 199, in _new_conn
sock = connection.create_connection(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
OSError: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 789, in urlopen
response = self._make_request(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 490, in _make_request
raise new_e
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 466, in _make_request
self._validate_conn(conn)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
conn.connect()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 693, in connect
self.sock = sock = self._new_conn()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 214, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7fb69c2114f0>: Failed to establish a new connection: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 667, in send
resp = conn.urlopen(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 843, in urlopen
retries = retries.increment(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fb69c2114f0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1376, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1296, in get_hf_file_metadata
r = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 277, in _request_wrapper
response = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 300, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_http.py", line 93, in send
return super().send(request, *args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 700, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fb69c2114f0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: 8b46855a-41c5-43a8-9924-194ea3638c51)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 110, in
main()
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 106, in main
train(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 863, in train
model = NeRFMatchMSTrainer(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 561, in init
self.model = NeRFMatcherMS(model_conf)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 84, in init
self.backbone = init_backbone_8_2(
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/init.py", line 112, in init_backbone_8_2
backbone = MetaFormer_MS(name, pretrained=pretrained)
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/init.py", line 28, in init
model = timm.create_model(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_factory.py", line 117, in create_model
model = create_fn(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 1015, in convformer_b36
return _create_metaformer('convformer_b36', pretrained=pretrained, **model_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 663, in _create_metaformer
model = build_model_with_cfg(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 427, in build_model_with_cfg
load_pretrained(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 205, in load_pretrained
state_dict = load_state_dict_from_hf(pretrained_loc, weights_only=True)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_hub.py", line 192, in load_state_dict_from_hf
cached_file = hf_hub_download(hf_model_id, filename=filename, revision=hf_revision)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 862, in hf_hub_download
return _hf_hub_download_to_cache_dir(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 969, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1487, in _raise_on_head_call_error
raise LocalEntryNotFoundError(
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
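The traceback above shows `timm` requesting `pytorch_model.bin` from the `timm/convformer_b36.sail_in1k_384` repository on huggingface.co and finding no copy in the local cache. A minimal sketch for checking whether that file is already cached follows. It assumes the standard Hugging Face hub cache layout (`models--<org>--<name>/snapshots/<revision>/<file>`) and uses a simplified resolution of the cache directory from `HF_HUB_CACHE` and `HF_HOME`. The helper names `default_hub_cache` and `find_cached_file` are hypothetical, not part of any library.

```python
import os
from pathlib import Path
from typing import Optional


def default_hub_cache() -> Path:
    """Resolve the hub cache directory (simplified: HF_HUB_CACHE, then HF_HOME, then ~/.cache)."""
    if os.environ.get("HF_HUB_CACHE"):
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(hf_home) / "hub"


def find_cached_file(repo_id: str, filename: str,
                     cache_dir: Optional[Path] = None) -> Optional[Path]:
    """Return the path of `filename` in any cached snapshot of `repo_id`, or None."""
    root = Path(cache_dir) if cache_dir is not None else default_hub_cache()
    repo_dir = root / ("models--" + repo_id.replace("/", "--"))
    if not repo_dir.is_dir():
        return None
    for candidate in sorted(repo_dir.glob(f"snapshots/*/{filename}")):
        # Snapshot entries are usually symlinks; exists() is False for broken links.
        if candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    # Repository and file name taken from the URL in the traceback.
    hit = find_cached_file("timm/convformer_b36.sail_in1k_384", "pytorch_model.bin")
    print(hit if hit else "not cached; the weights must be copied from a machine with network access")
```

If the file is missing, one option is to download it on a machine that can reach huggingface.co and copy the `models--timm--convformer_b36.sail_in1k_384` directory into the same cache location on the training server. Setting `HF_HUB_OFFLINE=1` should then make `huggingface_hub` use the cache without attempting a connection. Recent `timm` versions also appear to accept `pretrained_cfg_overlay=dict(file=...)` in `timm.create_model` to load weights from an explicit path, but that would require editing the backbone constructor in the repository.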
[A second rank printed the same traceback, differing only in the connection object address and the request ID.]
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 199, in _new_conn
sock = connection.create_connection(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
OSError: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 789, in urlopen
response = self._make_request(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 490, in _make_request
raise new_e
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 466, in _make_request
self._validate_conn(conn)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
conn.connect()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 693, in connect
self.sock = sock = self._new_conn()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 214, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f903c3a4cd0>: Failed to establish a new connection: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 667, in send
resp = conn.urlopen(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 843, in urlopen
retries = retries.increment(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f903c3a4cd0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1376, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1296, in get_hf_file_metadata
r = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 277, in _request_wrapper
response = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 300, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_http.py", line 93, in send
return super().send(request, *args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 700, in send
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 199, in _new_conn
sock = connection.create_connection(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f903c3a4cd0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: a067b81f-3bb2-480c-82a2-2ef3f404ae32)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 110, in
raise err
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 73, in create_connection
main()
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 106, in main
sock.connect(sa)
OSError: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 789, in urlopen
train(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 863, in train
response = self._make_request(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 490, in _make_request
model = NeRFMatchMSTrainer(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 561, in init
raise new_e
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 466, in _make_request
self.model = NeRFMatcherMS(model_conf)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 84, in init
self.backbone = init_backbone_8_2(
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/init.py", line 112, in init_backbone_8_2
self._validate_conn(conn)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
backbone = MetaFormer_MS(name, pretrained=pretrained)
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/init.py", line 28, in init
model = timm.create_model(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_factory.py", line 117, in create_model
model = create_fn(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 1015, in convformer_b36
conn.connect()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 693, in connect
return _create_metaformer('convformer_b36', pretrained=pretrained, **model_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 663, in _create_metaformer
self.sock = sock = self._new_conn()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 214, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f97000e1e20>: Failed to establish a new connection: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 667, in send
model = build_model_with_cfg(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 427, in build_model_with_cfg
load_pretrained(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 205, in load_pretrained
resp = conn.urlopen(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 843, in urlopen
state_dict = load_state_dict_from_hf(pretrained_loc, weights_only=True)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_hub.py", line 192, in load_state_dict_from_hf
cached_file = hf_hub_download(hf_model_id, filename=filename, revision=hf_revision)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
retries = retries.increment( File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 862, in hf_hub_download
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f97000e1e20>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1376, in _get_metadata_or_catch_error
return _hf_hub_download_to_cache_dir(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 969, in _hf_hub_download_to_cache_dir
metadata = get_hf_file_metadata(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1487, in _raise_on_head_call_error
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1296, in get_hf_file_metadata
raise LocalEntryNotFoundError(
huggingface_hub.errors.LocalEntryNotFoundError : r = _request_wrapper(An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 277, in _request_wrapper
response = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 300, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_http.py", line 93, in send
return super().send(request, *args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 700, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f97000e1e20>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: 58e87945-68d4-4103-8105-29a6f6748c43)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 110, in <module>
main()
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 106, in main
train(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 863, in train
model = NeRFMatchMSTrainer(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 561, in __init__
self.model = NeRFMatcherMS(model_conf)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 84, in __init__
self.backbone = init_backbone_8_2(
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/__init__.py", line 112, in init_backbone_8_2
backbone = MetaFormer_MS(name, pretrained=pretrained)
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/__init__.py", line 28, in __init__
model = timm.create_model(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_factory.py", line 117, in create_model
model = create_fn(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 1015, in convformer_b36
return _create_metaformer('convformer_b36', pretrained=pretrained, **model_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 663, in _create_metaformer
model = build_model_with_cfg(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 427, in build_model_with_cfg
load_pretrained(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 205, in load_pretrained
state_dict = load_state_dict_from_hf(pretrained_loc, weights_only=True)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_hub.py", line 192, in load_state_dict_from_hf
cached_file = hf_hub_download(hf_model_id, filename=filename, revision=hf_revision)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 862, in hf_hub_download
return _hf_hub_download_to_cache_dir(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 969, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1487, in _raise_on_head_call_error
raise LocalEntryNotFoundError(
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 953941 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 953940) of binary: /data/users/y/anaconda3/envs/nerfmatch/bin/python
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/bin/torchrun", line 33, in <module>
sys.exit(load_entry_point('torch==2.0.1', 'console_scripts', 'torchrun')())
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
Excuse me, I have two questions:
1. This error involves a remote connection to huggingface.co. Is it caused by that host being unreachable from my machine?
2. In my environment, the PyTorch Lightning (PL) version does not match the Torch version listed in the official PyTorch Lightning documentation.
Torch is 2.0, which according to the documentation requires PL 2.0 or above. Could this mismatch be the cause? Version compatibility table: https://lightning.ai/docs/pytorch/latest/versioning.html#pytorch-support
If the cause is problem 1, how can it be solved?
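One way to narrow down problem 1 is to check whether the weights named in the failing URL (`timm/convformer_b36.sail_in1k_384`, file `pytorch_model.bin`) are already present in the local Hugging Face cache. The sketch below uses only the standard library and assumes the usual cache layout (`<cache>/models--<org>--<name>/snapshots/<revision>/<file>`). The helper names `hf_cache_root` and `find_cached_file` are hypothetical, and the environment-variable handling is simplified.

```python
import os
from pathlib import Path
from typing import Optional


def hf_cache_root() -> Path:
    """Return the Hugging Face Hub cache directory (simplified lookup)."""
    # HF_HUB_CACHE takes priority; otherwise fall back to $HF_HOME/hub,
    # with HF_HOME defaulting to ~/.cache/huggingface.
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    default_home = Path.home() / ".cache" / "huggingface"
    return Path(os.environ.get("HF_HOME", str(default_home))) / "hub"


def find_cached_file(
    repo_id: str, filename: str, cache_root: Optional[Path] = None
) -> Optional[Path]:
    """Return the path of a cached Hub file, or None if it is not cached."""
    root = cache_root if cache_root is not None else hf_cache_root()
    # Model repos are stored as models--<org>--<name> inside the cache.
    repo_dir = root / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return None
    # Each snapshot directory is one revision; return the first match.
    for revision_dir in sorted(snapshots.iterdir()):
        candidate = revision_dir / filename
        if candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    # Repo id and filename are taken from the URL in the traceback.
    hit = find_cached_file("timm/convformer_b36.sail_in1k_384", "pytorch_model.bin")
    print("cached weights:", hit if hit else "not found, a download is required")
```

If the file is not found and the training machine cannot reach huggingface.co, a possible workaround is to download the weights on a machine that can, copy the cache directory over, and run with the environment variable `HF_HUB_OFFLINE=1`. Recent timm releases also appear to accept `pretrained_cfg_overlay=dict(file=...)` in `timm.create_model` to load a local checkpoint; whether that fits this repository's `MetaFormer_MS` wrapper has not been checked here.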
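As for problem 2, does anything need to be changed? If PL is upgraded, the code also has to change, because switching to a newer PL version produces the following error:
ImportError: cannot import name 'DDPPlugin' from pytorch_lightning.plugins
Thank you.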
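Regarding the `DDPPlugin` import error: the log above shows `pytorch_lightning.plugins.training_type.ddp.DDPPlugin`, which belongs to older PL 1.x releases, while newer releases expose `DDPStrategy` in `pytorch_lightning.strategies` instead. A minimal compatibility sketch, assuming one of these two names is available (the helper `resolve_ddp_class` is hypothetical and not part of this repository):

```python
import importlib
from typing import Optional


def resolve_ddp_class() -> Optional[type]:
    """Return the DDP plugin/strategy class of the installed PL, or None."""
    candidates = [
        ("pytorch_lightning.plugins", "DDPPlugin"),       # older 1.x releases
        ("pytorch_lightning.strategies", "DDPStrategy"),  # newer releases
    ]
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Module (or PL itself) is not available in this environment.
            continue
        cls = getattr(module, attr, None)
        if cls is not None:
            return cls
    return None


if __name__ == "__main__":
    ddp_cls = resolve_ddp_class()
    print("DDP class:", ddp_cls.__name__ if ddp_cls else "pytorch_lightning not installed")
```

Swapping the import is unlikely to be sufficient on its own: in newer PL versions the object is passed to `Trainer` via `strategy=` rather than `plugins=`, and other trainer arguments may have changed as well. Staying on the PL version the repository was written for is probably the smaller change.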
No. 1 NeRF Training (Optional): I downloaded the pretrained NeRF model.
No. 2 Cache NeRF Features (Cambridge):
python -m model_eval.eval_nerf --cache_scene_pts --split 'train_test' \
    --downsample 8 --img_wh 480 480 --stop_layer 3 \
    --ckpt 'pretrained/nerf/cambridge/mip_app/#scene_last.ckpt' \
    --scene_anno_path 'data/annotations/cambridge_jsons/transforms_#scene_#split.json' \
    --cache_dir 'outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep' \
    --dataset 'cambridge'
No. 3 NeRFMatch training:
torchrun --nproc_per_node=8 model_train/train_nerfmatch_c2f.py \
    --config configs/nerfmatch/nerfmatch_cambridge_c2f.yaml \
    --backbone 'convformer384' --temp_type 'mul' --batch_size 2 \
    --max_epochs 50 --clr 0.0004 --cbs 16 --pair_topk 20 --aug_self_pairs 10 \
    --scene_dir 'outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep/ds8lin' \
    --resume_version 'mip_app_inter3_last' --update_conf \
    --prefix 'eccv/repr' --scenes 'ShopFacade'
However, I reduced the number of GPUs to 4, changing --nproc_per_node=8 to --nproc_per_node=4.
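Reducing the GPU count also changes the effective batch size and learning rate, which the log further down reports as `True batch: 8 lr: 0.0002`. Those values are consistent with a linear scaling rule based on `--clr` and `--cbs`. The sketch below reproduces that arithmetic; the formula is inferred from the logged numbers, not taken from the repository's source, and the function name `scaled_lr` is hypothetical.

```python
import math


def scaled_lr(canonical_lr, canonical_batch, per_gpu_batch, num_gpus):
    """Linear LR scaling: lr grows in proportion to the true batch size."""
    true_batch = per_gpu_batch * num_gpus
    lr = canonical_lr * true_batch / canonical_batch
    return true_batch, lr


if __name__ == "__main__":
    # --clr 0.0004, --cbs 16, --batch_size 2 as in the training command.
    for gpus in (8, 4):
        true_batch, lr = scaled_lr(0.0004, 16, 2, gpus)
        print(f"gpus={gpus} true_batch={true_batch} lr={lr:g}")
    # Sanity check against the logged values for the 4-GPU run.
    tb, lr = scaled_lr(0.0004, 16, 2, 4)
    assert tb == 8 and math.isclose(lr, 0.0002)
```

Under this assumption, the 4-GPU run trains with half the batch size and half the learning rate of the 8-GPU setup, so results may differ somewhat from the reference configuration even once the download problem is resolved.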
Here are the errors I encountered:
(nerfmatch) y@hello-PowerEdge-T640:~/nerfmatch$ torchrun --nproc_per_node=4 model_train/train_nerfmatch_c2f.py \
WARNING:torch.distributed.run:
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
Global seed set to 12343
True batch: 8 lr: 0.0002
[2024-10-29 09:13:25|trainer|INFO]: Namespace(data=Namespace(dataset='NeRFMatchPair', data_dir='data/cambridge', scenes=['ShopFacade'], scene_anno_path='data/annotations/cambridge_jsons/transforms_#scene_#split.json', scene_dir='outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep/ds8lin', train_pair_txt='data/pairs/cambridge/#scene/pairs-db-covis20.txt', test_pair_txt='data/pairs/cambridge/#scene/pairs-query-netvlad10.txt', pair_topk=20, img_wh=[480, 480], img_dim=3, use_msk=False, model_ds=8, imagenet_norm=True, balanced_pair=True, epoch_sample_num=10000, aug_self_pairs=10), optim=Namespace(optimizer='adam', adapt_lr=True, clr=0.0004, cbs=16, weight_decay=0.0, lr_scheduler='cosine', coarse_only_epochs=0, max_epochs=50, lr=0.0002), model=Namespace(backbone='convformer384', pretrained=True, im_pe=True, im_sa_type='share', im_sa=3, temp_type='mul', pt_sa=3, pt_dim=256, pt_sa_type='full', pt_pe=True, pt_pe_type='fourier', post_pt_pe=True, cfeat_dim=256, ffeat_dim=128, cformer_type='crs', coarse_layers=1, pt_ftype='nerf', fine_sa=1, fsa_type='full', win_sz=5, cat_c_feat=True, fine_loss='match', coarse_percent=0.3, coarse_dthres=10, coarse_ckpt=None, c2f_ckpt=None), exp=Namespace(seed=12343, odir=PosixPath('outputs/nerfmatch/c2f/cambridge'), prefix='eccv/repr', resume_version='mip_app_inter3_last', num_workers=4, max_epochs=50, check_epochs=1, batch_size=2, debug=False, name='eccv/repr/NeRFMatchPair_ShopFacade_wh480-480ds8lin_top20ep10000_bala_imn_slfaug10/convformer384_pre_imp_ptp_ptfull3_pepos_imsa_share_cfcrs1d256_multmp_fsafull1d128w5_catc_match0.3d10/g4clr0.0004cbs16adamcosine_ep50'), gpus=-1, prefix='eccv/repr', debug=False, config='configs/nerfmatch/nerfmatch_cambridge_c2f.yaml', coarse_ckpt=None, c2f_ckpt=None, backbone='convformer384', cformer_type='crs', coarse_layers=1, pt_sa=3, im_sa=3, pt_dim=256, cfeat_dim=256, pt_pe=True, im_pe=True, im_sa_type='share', pt_sa_type='full', pt_ftype='nerf', pt_pe_type='fourier', temp_type='mul', fine_sa=1, 
fsa_type='full', update_conf=True, batch_size=2, clr=0.0004, cbs=16, adapt_lr=False, max_epochs=50, coarse_only_epochs=0, epoch_sample_num=10000, pair_topk=20, aug_self_pairs=10, train_pair_txt=None, scene_dir='outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep/ds8lin', scenes=['ShopFacade'], resume_version='mip_app_inter3_last', gpu_num=4)
[2024-10-29 09:13:25|trainer|INFO]: # GPUs=4 <pytorch_lightning.plugins.training_type.ddp.DDPPlugin object at 0x7fd0f04e1e80>
Global seed set to 12343
Global seed set to 12343
True batch: 8 lr: 0.0002
[2024-10-29 09:13:25|trainer|INFO]: Namespace(data=Namespace(dataset='NeRFMatchPair', data_dir='data/cambridge', scenes=['ShopFacade'], scene_anno_path='data/annotations/cambridge_jsons/transforms_#scene_#split.json', scene_dir='outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep/ds8lin', train_pair_txt='data/pairs/cambridge/#scene/pairs-db-covis20.txt', test_pair_txt='data/pairs/cambridge/#scene/pairs-query-netvlad10.txt', pair_topk=20, img_wh=[480, 480], img_dim=3, use_msk=False, model_ds=8, imagenet_norm=True, balanced_pair=True, epoch_sample_num=10000, aug_self_pairs=10), optim=Namespace(optimizer='adam', adapt_lr=True, clr=0.0004, cbs=16, weight_decay=0.0, lr_scheduler='cosine', coarse_only_epochs=0, max_epochs=50, lr=0.0002), model=Namespace(backbone='convformer384', pretrained=True, im_pe=True, im_sa_type='share', im_sa=3, temp_type='mul', pt_sa=3, pt_dim=256, pt_sa_type='full', pt_pe=True, pt_pe_type='fourier', post_pt_pe=True, cfeat_dim=256, ffeat_dim=128, cformer_type='crs', coarse_layers=1, pt_ftype='nerf', fine_sa=1, fsa_type='full', win_sz=5, cat_c_feat=True, fine_loss='match', coarse_percent=0.3, coarse_dthres=10, coarse_ckpt=None, c2f_ckpt=None), exp=Namespace(seed=12343, odir=PosixPath('outputs/nerfmatch/c2f/cambridge'), prefix='eccv/repr', resume_version='mip_app_inter3_last', num_workers=4, max_epochs=50, check_epochs=1, batch_size=2, debug=False, name='eccv/repr/NeRFMatchPair_ShopFacade_wh480-480ds8lin_top20ep10000_bala_imn_slfaug10/convformer384_pre_imp_ptp_ptfull3_pepos_imsa_share_cfcrs1d256_multmp_fsafull1d128w5_catc_match0.3d10/g4clr0.0004cbs16adamcosine_ep50'), gpus=-1, prefix='eccv/repr', debug=False, config='configs/nerfmatch/nerfmatch_cambridge_c2f.yaml', coarse_ckpt=None, c2f_ckpt=None, backbone='convformer384', cformer_type='crs', coarse_layers=1, pt_sa=3, im_sa=3, pt_dim=256, cfeat_dim=256, pt_pe=True, im_pe=True, im_sa_type='share', pt_sa_type='full', pt_ftype='nerf', pt_pe_type='fourier', temp_type='mul', fine_sa=1, 
fsa_type='full', update_conf=True, batch_size=2, clr=0.0004, cbs=16, adapt_lr=False, max_epochs=50, coarse_only_epochs=0, epoch_sample_num=10000, pair_topk=20, aug_self_pairs=10, train_pair_txt=None, scene_dir='outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep/ds8lin', scenes=['ShopFacade'], resume_version='mip_app_inter3_last', gpu_num=4)
True batch: 8 lr: 0.0002
[2024-10-29 09:13:25|trainer|INFO]: # GPUs=4 <pytorch_lightning.plugins.training_type.ddp.DDPPlugin object at 0x7f97098d9d30>
[2024-10-29 09:13:25|trainer|INFO]: Namespace(data=Namespace(dataset='NeRFMatchPair', data_dir='data/cambridge', scenes=['ShopFacade'], scene_anno_path='data/annotations/cambridge_jsons/transforms_#scene_#split.json', scene_dir='outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep/ds8lin', train_pair_txt='data/pairs/cambridge/#scene/pairs-db-covis20.txt', test_pair_txt='data/pairs/cambridge/#scene/pairs-query-netvlad10.txt', pair_topk=20, img_wh=[480, 480], img_dim=3, use_msk=False, model_ds=8, imagenet_norm=True, balanced_pair=True, epoch_sample_num=10000, aug_self_pairs=10), optim=Namespace(optimizer='adam', adapt_lr=True, clr=0.0004, cbs=16, weight_decay=0.0, lr_scheduler='cosine', coarse_only_epochs=0, max_epochs=50, lr=0.0002), model=Namespace(backbone='convformer384', pretrained=True, im_pe=True, im_sa_type='share', im_sa=3, temp_type='mul', pt_sa=3, pt_dim=256, pt_sa_type='full', pt_pe=True, pt_pe_type='fourier', post_pt_pe=True, cfeat_dim=256, ffeat_dim=128, cformer_type='crs', coarse_layers=1, pt_ftype='nerf', fine_sa=1, fsa_type='full', win_sz=5, cat_c_feat=True, fine_loss='match', coarse_percent=0.3, coarse_dthres=10, coarse_ckpt=None, c2f_ckpt=None), exp=Namespace(seed=12343, odir=PosixPath('outputs/nerfmatch/c2f/cambridge'), prefix='eccv/repr', resume_version='mip_app_inter3_last', num_workers=4, max_epochs=50, check_epochs=1, batch_size=2, debug=False, name='eccv/repr/NeRFMatchPair_ShopFacade_wh480-480ds8lin_top20ep10000_bala_imn_slfaug10/convformer384_pre_imp_ptp_ptfull3_pepos_imsa_share_cfcrs1d256_multmp_fsafull1d128w5_catc_match0.3d10/g4clr0.0004cbs16adamcosine_ep50'), gpus=-1, prefix='eccv/repr', debug=False, config='configs/nerfmatch/nerfmatch_cambridge_c2f.yaml', coarse_ckpt=None, c2f_ckpt=None, backbone='convformer384', cformer_type='crs', coarse_layers=1, pt_sa=3, im_sa=3, pt_dim=256, cfeat_dim=256, pt_pe=True, im_pe=True, im_sa_type='share', pt_sa_type='full', pt_ftype='nerf', pt_pe_type='fourier', temp_type='mul', fine_sa=1, 
fsa_type='full', update_conf=True, batch_size=2, clr=0.0004, cbs=16, adapt_lr=False, max_epochs=50, coarse_only_epochs=0, epoch_sample_num=10000, pair_topk=20, aug_self_pairs=10, train_pair_txt=None, scene_dir='outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep/ds8lin', scenes=['ShopFacade'], resume_version='mip_app_inter3_last', gpu_num=4)
[2024-10-29 09:13:25|trainer|INFO]: # GPUs=4 <pytorch_lightning.plugins.training_type.ddp.DDPPlugin object at 0x7fb6bdea8040>
Global seed set to 12343
True batch: 8 lr: 0.0002
[2024-10-29 09:13:25|trainer|INFO]: Namespace(data=Namespace(dataset='NeRFMatchPair', data_dir='data/cambridge', scenes=['ShopFacade'], scene_anno_path='data/annotations/cambridge_jsons/transforms_#scene_#split.json', scene_dir='outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep/ds8lin', train_pair_txt='data/pairs/cambridge/#scene/pairs-db-covis20.txt', test_pair_txt='data/pairs/cambridge/#scene/pairs-query-netvlad10.txt', pair_topk=20, img_wh=[480, 480], img_dim=3, use_msk=False, model_ds=8, imagenet_norm=True, balanced_pair=True, epoch_sample_num=10000, aug_self_pairs=10), optim=Namespace(optimizer='adam', adapt_lr=True, clr=0.0004, cbs=16, weight_decay=0.0, lr_scheduler='cosine', coarse_only_epochs=0, max_epochs=50, lr=0.0002), model=Namespace(backbone='convformer384', pretrained=True, im_pe=True, im_sa_type='share', im_sa=3, temp_type='mul', pt_sa=3, pt_dim=256, pt_sa_type='full', pt_pe=True, pt_pe_type='fourier', post_pt_pe=True, cfeat_dim=256, ffeat_dim=128, cformer_type='crs', coarse_layers=1, pt_ftype='nerf', fine_sa=1, fsa_type='full', win_sz=5, cat_c_feat=True, fine_loss='match', coarse_percent=0.3, coarse_dthres=10, coarse_ckpt=None, c2f_ckpt=None), exp=Namespace(seed=12343, odir=PosixPath('outputs/nerfmatch/c2f/cambridge'), prefix='eccv/repr', resume_version='mip_app_inter3_last', num_workers=4, max_epochs=50, check_epochs=1, batch_size=2, debug=False, name='eccv/repr/NeRFMatchPair_ShopFacade_wh480-480ds8lin_top20ep10000_bala_imn_slfaug10/convformer384_pre_imp_ptp_ptfull3_pepos_imsa_share_cfcrs1d256_multmp_fsafull1d128w5_catc_match0.3d10/g4clr0.0004cbs16adamcosine_ep50'), gpus=-1, prefix='eccv/repr', debug=False, config='configs/nerfmatch/nerfmatch_cambridge_c2f.yaml', coarse_ckpt=None, c2f_ckpt=None, backbone='convformer384', cformer_type='crs', coarse_layers=1, pt_sa=3, im_sa=3, pt_dim=256, cfeat_dim=256, pt_pe=True, im_pe=True, im_sa_type='share', pt_sa_type='full', pt_ftype='nerf', pt_pe_type='fourier', temp_type='mul', fine_sa=1, 
fsa_type='full', update_conf=True, batch_size=2, clr=0.0004, cbs=16, adapt_lr=False, max_epochs=50, coarse_only_epochs=0, epoch_sample_num=10000, pair_topk=20, aug_self_pairs=10, train_pair_txt=None, scene_dir='outputs/scene_dirs/cambridge/inter_layer3/#scene/mip_app/last_15ep/ds8lin', scenes=['ShopFacade'], resume_version='mip_app_inter3_last', gpu_num=4)
[2024-10-29 09:13:25|trainer|INFO]: # GPUs=4 <pytorch_lightning.plugins.training_type.ddp.DDPPlugin object at 0x7f9058fc6b20>
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 199, in _new_conn
sock = connection.create_connection(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
OSError: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 789, in urlopen
response = self._make_request(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 490, in _make_request
raise new_e
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 466, in _make_request
self._validate_conn(conn)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
conn.connect()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 693, in connect
self.sock = sock = self._new_conn()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 214, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7fb69c2114f0>: Failed to establish a new connection: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 667, in send
resp = conn.urlopen(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 843, in urlopen
retries = retries.increment(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fb69c2114f0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1376, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1296, in get_hf_file_metadata
r = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 277, in _request_wrapper
response = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 300, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_http.py", line 93, in send
return super().send(request, *args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 700, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fb69c2114f0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: 8b46855a-41c5-43a8-9924-194ea3638c51)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 110, in <module>
main()
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 106, in main
train(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 863, in train
model = NeRFMatchMSTrainer(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 561, in __init__
self.model = NeRFMatcherMS(model_conf)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 84, in __init__
self.backbone = init_backbone_8_2(
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/__init__.py", line 112, in init_backbone_8_2
backbone = MetaFormer_MS(name, pretrained=pretrained)
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/__init__.py", line 28, in __init__
model = timm.create_model(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_factory.py", line 117, in create_model
model = create_fn(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 1015, in convformer_b36
return _create_metaformer('convformer_b36', pretrained=pretrained, **model_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 663, in _create_metaformer
model = build_model_with_cfg(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 427, in build_model_with_cfg
load_pretrained(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 205, in load_pretrained
state_dict = load_state_dict_from_hf(pretrained_loc, weights_only=True)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_hub.py", line 192, in load_state_dict_from_hf
cached_file = hf_hub_download(hf_model_id, filename=filename, revision=hf_revision)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 862, in hf_hub_download
return _hf_hub_download_to_cache_dir(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 969, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1487, in _raise_on_head_call_error
raise LocalEntryNotFoundError(
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 199, in _new_conn
sock = connection.create_connection(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
OSError: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 789, in urlopen
response = self._make_request(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 490, in _make_request
raise new_e
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 466, in _make_request
self._validate_conn(conn)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
conn.connect()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 693, in connect
self.sock = sock = self._new_conn()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 214, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7fd0e00a5e80>: Failed to establish a new connection: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 667, in send
resp = conn.urlopen(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 843, in urlopen
retries = retries.increment(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fd0e00a5e80>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1376, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1296, in get_hf_file_metadata
r = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 277, in _request_wrapper
response = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 300, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_http.py", line 93, in send
return super().send(request, *args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 700, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fd0e00a5e80>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: 043a0871-d157-4b41-af22-78462b5cebe5)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 110, in
main()
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 106, in main
train(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 863, in train
model = NeRFMatchMSTrainer(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 561, in init
self.model = NeRFMatcherMS(model_conf)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 84, in init
self.backbone = init_backbone_8_2(
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/init.py", line 112, in init_backbone_8_2
backbone = MetaFormer_MS(name, pretrained=pretrained)
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/init.py", line 28, in init
model = timm.create_model(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_factory.py", line 117, in create_model
model = create_fn(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 1015, in convformer_b36
return _create_metaformer('convformer_b36', pretrained=pretrained, **model_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 663, in _create_metaformer
model = build_model_with_cfg(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 427, in build_model_with_cfg
load_pretrained(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 205, in load_pretrained
state_dict = load_state_dict_from_hf(pretrained_loc, weights_only=True)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_hub.py", line 192, in load_state_dict_from_hf
cached_file = hf_hub_download(hf_model_id, filename=filename, revision=hf_revision)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 862, in hf_hub_download
return _hf_hub_download_to_cache_dir(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 969, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1487, in _raise_on_head_call_error
raise LocalEntryNotFoundError(
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 199, in _new_conn
sock = connection.create_connection(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise err
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
OSError: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 789, in urlopen
response = self._make_request(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 490, in _make_request
raise new_e
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 466, in _make_request
self._validate_conn(conn)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
conn.connect()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 693, in connect
self.sock = sock = self._new_conn()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 214, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f903c3a4cd0>: Failed to establish a new connection: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 667, in send
resp = conn.urlopen(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 843, in urlopen
retries = retries.increment(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f903c3a4cd0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1376, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1296, in get_hf_file_metadata
r = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 277, in _request_wrapper
response = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 300, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_http.py", line 93, in send
return super().send(request, *args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 700, in send
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 199, in _new_conn
sock = connection.create_connection(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f903c3a4cd0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: a067b81f-3bb2-480c-82a2-2ef3f404ae32)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 110, in
raise err
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/connection.py", line 73, in create_connection
main()
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 106, in main
sock.connect(sa)
OSError: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 789, in urlopen
train(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 863, in train
response = self._make_request(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 490, in _make_request
model = NeRFMatchMSTrainer(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 561, in init
raise new_e
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 466, in _make_request
self.model = NeRFMatcherMS(model_conf)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 84, in init
self.backbone = init_backbone_8_2(
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/init.py", line 112, in init_backbone_8_2
self._validate_conn(conn)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
backbone = MetaFormer_MS(name, pretrained=pretrained)
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/init.py", line 28, in init
model = timm.create_model(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_factory.py", line 117, in create_model
model = create_fn(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 1015, in convformer_b36
conn.connect()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 693, in connect
return _create_metaformer('convformer_b36', pretrained=pretrained, **model_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 663, in _create_metaformer
self.sock = sock = self._new_conn()
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connection.py", line 214, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f97000e1e20>: Failed to establish a new connection: [Errno 101] Network is unreachable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 667, in send
model = build_model_with_cfg(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 427, in build_model_with_cfg
load_pretrained(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 205, in load_pretrained
resp = conn.urlopen(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/connectionpool.py", line 843, in urlopen
state_dict = load_state_dict_from_hf(pretrained_loc, weights_only=True)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_hub.py", line 192, in load_state_dict_from_hf
cached_file = hf_hub_download(hf_model_id, filename=filename, revision=hf_revision)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
retries = retries.increment(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 862, in hf_hub_download
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/urllib3/util/retry.py", line 519, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f97000e1e20>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1376, in _get_metadata_or_catch_error
return _hf_hub_download_to_cache_dir(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 969, in _hf_hub_download_to_cache_dir
metadata = get_hf_file_metadata(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1487, in _raise_on_head_call_error
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1296, in get_hf_file_metadata
raise LocalEntryNotFoundError(
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
r = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 277, in _request_wrapper
response = _request_wrapper(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 300, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_http.py", line 93, in send
return super().send(request, *args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/requests/adapters.py", line 700, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f97000e1e20>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: 58e87945-68d4-4103-8105-29a6f6748c43)')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 110, in
main()
File "/data/users/y/nerfmatch/model_train/train_nerfmatch_c2f.py", line 106, in main
train(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 863, in train
model = NeRFMatchMSTrainer(config)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 561, in init
self.model = NeRFMatcherMS(model_conf)
File "/data/users/y/nerfmatch/model_train/nerfmatch/nerfmatch_c2f_trainer.py", line 84, in init
self.backbone = init_backbone_8_2(
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/init.py", line 112, in init_backbone_8_2
backbone = MetaFormer_MS(name, pretrained=pretrained)
File "/data/users/y/nerfmatch/model_train/nerfmatch/modules/init.py", line 28, in init
model = timm.create_model(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_factory.py", line 117, in create_model
model = create_fn(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 1015, in convformer_b36
return _create_metaformer('convformer_b36', pretrained=pretrained, **model_kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/metaformer.py", line 663, in _create_metaformer
model = build_model_with_cfg(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 427, in build_model_with_cfg
load_pretrained(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_builder.py", line 205, in load_pretrained
state_dict = load_state_dict_from_hf(pretrained_loc, weights_only=True)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/timm/models/_hub.py", line 192, in load_state_dict_from_hf
cached_file = hf_hub_download(hf_model_id, filename=filename, revision=hf_revision)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 862, in hf_hub_download
return _hf_hub_download_to_cache_dir(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 969, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1487, in _raise_on_head_call_error
raise LocalEntryNotFoundError(
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 953941 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 953940) of binary: /data/users/y/anaconda3/envs/nerfmatch/bin/python
Traceback (most recent call last):
File "/data/users/y/anaconda3/envs/nerfmatch/bin/torchrun", line 33, in
sys.exit(load_entry_point('torch==2.0.1', 'console_scripts', 'torchrun')())
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/init.py", line 346, in wrapper
return f(*args, **kwargs)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/data/users/y/anaconda3/envs/nerfmatch/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
model_train/train_nerfmatch_c2f.py FAILED
Failures:
[1]:
time : 2024-10-29_09:13:47
host : hello-PowerEdge-T640
rank : 2 (local_rank: 2)
exitcode : 1 (pid: 953942)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
time : 2024-10-29_09:13:47
host : hello-PowerEdge-T640
rank : 3 (local_rank: 3)
exitcode : 1 (pid: 953943)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Root Cause (first observed failure):
[0]:
time : 2024-10-29_09:13:47
host : hello-PowerEdge-T640
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 953940)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Excuse me, I have two questions:
1. The error shows a remote connection to huggingface.co. Is it failing because that host cannot be reached from my machine?
2. In my environment, the PyTorch Lightning version does not match the Torch version listed in the official PyTorch Lightning documentation.
Torch is 2.0, and the documentation says this requires PL 2.0 or above. Could this mismatch be the cause of the error? Version compatibility table: https://lightning.ai/docs/pytorch/latest/versioning.html#pytorch-support
If question 1 is the cause, how should I solve it?
Does anything need to change for question 2? If I upgrade PL, the code would also need to change, because switching to a newer PL version produces the following error:
ImportError: cannot import name 'DDPPlugin' from pytorch_lightning.plugins
Thanks.
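For reference, the failing request in the log can be broken down into the repository, revision, and file name that `hf_hub_download` was asked for. The snippet below is only a minimal sketch: `split_hub_url` is a hypothetical helper (not part of timm or huggingface_hub), and it assumes the `/<repo_id>/resolve/<revision>/<filename>` layout seen in the traceback URL.

```python
from urllib.parse import urlparse


def split_hub_url(url_or_path):
    """Split a Hub 'resolve' URL or path into (repo_id, revision, filename).

    Hypothetical helper; assumes the layout
    /<repo_id>/resolve/<revision>/<filename> seen in the traceback.
    """
    path = urlparse(url_or_path).path
    parts = [p for p in path.split("/") if p]
    idx = parts.index("resolve")  # raises ValueError if the layout differs
    repo_id = "/".join(parts[:idx])
    revision = parts[idx + 1]
    filename = "/".join(parts[idx + 2:])
    return repo_id, revision, filename


# Path copied from the MaxRetryError message in the log above.
failing = "/timm/convformer_b36.sail_in1k_384/resolve/main/pytorch_model.bin"
repo_id, revision, filename = split_hub_url(failing)
print(repo_id, revision, filename)
```

These three values are what I would presumably need to fetch on a machine that can reach huggingface.co if the weights have to be copied over by hand. How the local file should then be passed to `timm.create_model` in this repository is not covered by the sketch.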