@MhLiao I tried to pretrain a model on the SynthText dataset. I followed your training script:
python -m torch.distributed.launch --nproc_per_node=4 tools/train_net.py --config-file configs/pretrain/seg_rec_poly_fuse_feature.yaml
But I always get the following OSError after training for a few iterations:
OSError: [Errno 24] Too many open files
How can I fix this error? Is there a way to reduce the number of open files?
@HsuWanTing @MhLiao Hello, have you solved this problem? If so, how? I ran into the same error. Thanks.
@HsuWanTing This problem is probably caused by distributed (multi-process) training.
@samanthawyf I just ran the following command and the error was gone:
ulimit -n 65535
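If changing the shell limit is inconvenient (e.g. inside a managed training job), the same limit can be raised from within the script itself using Python's standard `resource` module. This is a minimal sketch of that workaround, not part of the original repo's code:

```python
import resource

# Query the current soft/hard limits on open file descriptors
# (the soft limit is what `ulimit -n` reports and what triggers Errno 24).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# Raise the soft limit up to the hard limit; a non-root process may
# freely set its soft limit anywhere up to the hard limit.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

print("open-file limit:", resource.getrlimit(resource.RLIMIT_NOFILE))
```

Note that `resource` is Unix-only, which is fine for typical Linux training setups.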
In my situation, I solved this error by inserting the following two lines of code in the train_net.py file:
import torch.multiprocessing
torch.multiprocessing.set_sharing_strategy('file_system')