When we train on the ScienceQA dataset with CLIP image features, a gradient explosion occurs. Below is my run log.
====Input Arguments====
{
"data_root": "data",
"output_dir": "experiments",
"model": "allenai/unifiedqa-t5-base",
"options": [
"A",
"B",
"C",
"D",
"E"
],
"epoch": 50,
"lr": 5e-05,
"bs": 4,
"input_len": 512,
"output_len": 512,
"eval_bs": 4,
"eval_acc": null,
"train_split": "train",
"val_split": "val",
"test_split": "test",
"use_generate": true,
"final_eval": false,
"user_msg": "rationale",
"img_type": "clip",
"eval_le": null,
"test_le": null,
"evaluate_dir": null,
"caption_file": "data/instruct_captions.json",
"use_caption": true,
"prompt_format": "QCM-E",
"seed": 42
}
img_features size: (11208, 49, 2048)
number of train problems: 12726
number of val problems: 4241
number of test problems: 4241
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in huggingface/transformers#24565
[14:58:49] [Model]: Loading allenai/unifiedqa-t5-base... main.py:66
[Data]: Reading data... main.py:67
experiments/rationale_allenai-unifiedqa-t5-base_clip_QCM-E_lr5e-05_bs4_op512_ep50
Some weights of T5ForMultimodalGeneration were not initialized from the model checkpoint at allenai/unifiedqa-t5-base and are newly initialized: ['encoder.gate_dense.bias', 'encoder.gate_dense.weight', 'encoder.image_dense.bias', 'encoder.image_dense.weight', 'encoder.mha_layer.in_proj_bias', 'encoder.mha_layer.in_proj_weight', 'encoder.mha_layer.out_proj.bias', 'encoder.mha_layer.out_proj.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
model parameters: 228019968
{'loss': 29.2632, 'grad_norm': inf, 'learning_rate': 4.984286612193589e-05, 'epoch': 0.16}
{'loss': 29.2109, 'grad_norm': inf, 'learning_rate': 4.968573224387178e-05, 'epoch': 0.31}
1%|▉ | 1106/159100 [26:08<60:19:10, 1.37s/it]{'loss': 29.2953, 'grad_norm': inf, 'learning_rate': 4.952859836580767e-05, 'epoch': 0.47}
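For context on the `grad_norm: inf` values above: the trainer reports the global L2 norm of all parameter gradients before clipping, so `inf` means at least one gradient overflowed (or the loss itself produced a non-finite value, e.g. from the freshly initialized fusion layers listed above). A minimal sketch of the norm computation and rescaling involved, in pure Python rather than the actual `torch.nn.utils.clip_grad_norm_` implementation:

```python
import math

def clip_grad_norm(grads, max_norm):
    """Return (clipped_grads, total_norm) for a flat list of gradient values.

    Mirrors the idea behind gradient clipping by global norm: compute the
    L2 norm over all gradients, and if it exceeds max_norm, rescale every
    gradient by max_norm / total_norm. Note that if total_norm is inf (as
    in the log above), the scale becomes 0, so clipping alone does not
    recover useful gradients from an overflow.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / (total_norm + 1e-6)
        grads = [g * scale for g in grads]
    return grads, total_norm

clipped, norm = clip_grad_norm([3.0, 4.0], max_norm=1.0)
# norm is 5.0; clipped is rescaled to unit global norm
```

In the Hugging Face `Trainer`, this corresponds to the `max_grad_norm` field of `TrainingArguments` (default 1.0), which is why a persistent `inf` usually points to non-finite activations or loss values upstream of clipping rather than to a missing clipping step.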