
train problem #41

Open
Kakarot-Li opened this issue Feb 23, 2023 · 12 comments

@Kakarot-Li

(screenshot: training progress bar stuck at 15%)
Sorry to bother! After I execute "python run.py train bert", the run has been stuck here for a long time (see the screenshot). It reached 15% and then nothing changed. Is that normal?

PS: I changed batch_size from 512 to 64; otherwise my GPU could not handle it...

@Kakarot-Li
Author

(screenshot: training stuck at 241/2801, 9%)
I changed the batch_size to 2 and resolved the RuntimeError another way, but it is still stuck at 241/2801 (9%). I really don't know why; it's driving me crazy.
Could you help me?

@ritwikmishra

What are your GPU specifications?
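For reference, a one-shot nvidia-smi query (assuming an NVIDIA GPU with drivers installed) prints the specs usually asked for in threads like this:

```shell
# Print GPU model, driver version, total/used VRAM, and current utilization.
# Requires the NVIDIA driver; the field names come from
# `nvidia-smi --help-query-gpu`.
nvidia-smi --query-gpu=name,driver_version,memory.total,memory.used,utilization.gpu --format=csv
```

If utilization sits at 0% while memory stays allocated, the run is likely stuck on I/O or a deadlock rather than just being slow.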

@shantanu778

@vdobrovolskii I have a similar issue during training and evaluation.
It just gets stuck without showing any error.

@vdobrovolskii
Owner

Please share your GPU specs and also the output of pip freeze.

@shantanu778

@vdobrovolskii
The output of pip freeze:

certifi @ file:///croot/certifi_1671487769961/work/certifi
charset-normalizer==3.1.0
click==8.1.3
filelock==3.12.0
idna==3.4
importlib-metadata==6.6.0
joblib==1.2.0
jsonlines==3.1.0
numpy==1.21.6
packaging==23.1
Pillow==9.5.0
regex==2023.5.5
requests==2.31.0
sacremoses==0.0.53
sentencepiece==0.1.91
six==1.16.0
tokenizers==0.8.1rc2
toml==0.10.2
torch==1.4.0+cu92
torchvision==0.5.0+cu92
tqdm==4.65.0
transformers==3.2.0
typing_extensions==4.6.2
urllib3==2.0.2
zipp==3.15.0

Also, the output of nvidia-smi:
(screenshot: nvidia-smi output, 2023-06-10)

@shantanu778

@vdobrovolskii I am not even able to evaluate on this machine. Loading the BERT model is really slow.

@vdobrovolskii
Owner

You won't be able to train the model on your machine without modifying the code...

But for evaluation it should be more than enough. Can you show me the exact sequence of steps you're taking? (Commands and outputs.)

@shantanu778

@vdobrovolskii These are the steps I followed, starting from data processing.
You can see I am able to run the evaluation on the test set, but the results don't seem accurate. For the validation set, I can't even run it; it just gets stuck.

(screenshot: commands and outputs, 2023-06-11)

@shantanu778

@vdobrovolskii
Regarding this problem: can you please tell me which sentencepiece version you used?

@ritwikmishra

I would recommend adding print statements to see where exactly the code is stuck.
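Beyond print statements, Python's standard-library faulthandler module can show where a hung process is sitting without modifying the training loop much; a minimal sketch (the signal-based part is Unix-only):

```python
import faulthandler
import signal

# Print a traceback automatically on hard crashes (segfaults, etc.).
faulthandler.enable()

# Dump every thread's stack on demand: while training appears stuck,
# run `kill -USR1 <pid>` from another shell (Unix only).
faulthandler.register(signal.SIGUSR1)

# Or dump all stacks automatically if the process is still running
# after 300 seconds, without killing it.
faulthandler.dump_traceback_later(300, exit=False)
faulthandler.cancel_dump_traceback_later()  # cancelled here only for the demo
```

Placing the `enable()`/`register()` calls at the top of run.py lets you inspect the stuck stack the next time the progress bar freezes.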

@vdobrovolskii
Owner

@shantanu778 I am not exactly sure what is happening there. The problem is that I no longer work where I did when this paper was written, so I don't have access to the server that hosted the original environment. I'm afraid I can't tell you the exact versions of the packages I had back then.

However, I invite everyone who has it working to share their own pip freeze so we can see what the possible mismatches are.

@vdobrovolskii
Owner

On your screenshot I can see that the word-level evaluation is going OK, but something is wrong with predicting the spans. Would you mind taking a look at the data and confirming that the data preparation went well and everything looks normal? I would pay extra attention to the head2span mapping.
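If it helps, here is a small sketch for sanity-checking head2span entries in a prepared .jsonlines file. The field names ("head2span" as [head, start, end] triples indexing into "cased_words") are assumptions about the data format and may need adjusting to what the preparation script actually emits:

```python
import json

def check_head2span(path):
    """Count head2span entries whose indices are inconsistent.

    Assumes each line of the file is a JSON document with a
    "cased_words" list and a "head2span" list of [head, start, end]
    triples, where the head word index must lie inside its own
    [start, end) span and the span must lie inside the document.
    """
    bad = 0
    with open(path) as f:
        for line in f:
            doc = json.loads(line)
            n_words = len(doc.get("cased_words", []))
            for head, start, end in doc.get("head2span", []):
                if not (0 <= start <= head < end <= n_words):
                    bad += 1
    return bad
```

Running this over the prepared train/dev/test files should return 0 for each; a nonzero count points at a data-preparation problem rather than the model.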
