ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (image_ids in this case) have excessive nesting (inputs type list where type int is expected). #67

Open
pavankale2709 opened this issue Dec 6, 2023 · 1 comment

Comments

@pavankale2709

We are trying to train the model but are getting the following error. Please help us resolve it.

'E'], epoch=50, lr=5e-05, bs=2, input_len=512, output_len=512, eval_bs=4, eval_acc=None, train_split='train', val_split='val', test_split='test', use_generate=True, final_eval=False, user_msg='rationale', img_type='vit', eval_le=None, test_le=None, evaluate_dir=None, caption_file='data/instruct_captions.json', use_caption=True, prompt_format='QCM-E', seed=42)

Downloading tokenizer_config.json: 0%| | 0.00/2.50k [00:00<?, ?B/s]
Downloading tokenizer_config.json: 100%|##########| 2.50k/2.50k [00:00<00:00, 332kB/s]

Downloading tokenizer.json: 0%| | 0.00/2.42M [00:00<?, ?B/s]
Downloading tokenizer.json: 100%|##########| 2.42M/2.42M [00:02<00:00, 1.07MB/s]

Downloading (…)cial_tokens_map.json: 0%| | 0.00/2.20k [00:00<?, ?B/s]
Downloading (…)cial_tokens_map.json: 100%|##########| 2.20k/2.20k [00:00<00:00, 441kB/s]

Downloading config.json: 0%| | 0.00/1.53k [00:00<?, ?B/s]
Downloading config.json: 100%|##########| 1.53k/1.53k [00:00<00:00, 382kB/s]

Downloading model.safetensors: 0%| | 0.00/990M [00:00<?, ?B/s]
...
Downloading model.safetensors: 100%|##########| 990M/990M [11:47<00:00, 1.40MB/s]
Some weights of T5ForMultimodalGeneration were not initialized from the model checkpoint at declare-lab/flan-alpaca-base and are newly initialized: ['encoder.image_dense.weight', 'encoder.mha_layer.in_proj_weight', 'encoder.mha_layer.out_proj.weight', 'encoder.image_dense.bias', 'encoder.gate_dense.weight', 'encoder.gate_dense.bias', 'encoder.mha_layer.out_proj.bias', 'encoder.mha_layer.in_proj_bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Downloading generation_config.json: 0%| | 0.00/142 [00:00<?, ?B/s]
Downloading generation_config.json: 100%|##########| 142/142 [00:00<00:00, 15.8kB/s]

0%| | 0/318150 [00:00<?, ?it/s]You're using a T5TokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
Traceback (most recent call last):
  File "C:\Users\pakale\Anaconda3\lib\site-packages\transformers\tokenization_utils_base.py", line 748, in convert_to_tensors
    tensor = as_tensor(value)
  File "C:\Users\pakale\Anaconda3\lib\site-packages\transformers\tokenization_utils_base.py", line 720, in as_tensor
    return torch.tensor(value)
ValueError: expected sequence of length 577 at dim 1 (got 145)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "...\mm-cot-scienceqa\main.py", line 380, in <module>
    T5Trainer(
  File "...\mm-cot-scienceqa\main.py", line 269, in T5Trainer
    trainer.train()
  File "C:\Users\pakale\Anaconda3\lib\site-packages\transformers\trainer.py", line 1591, in train
    return inner_training_loop(
  File "C:\Users\pakale\Anaconda3\lib\site-packages\transformers\trainer.py", line 1870, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "C:\Users\pakale\Anaconda3\lib\site-packages\accelerate\data_loader.py", line 448, in __iter__
    current_batch = next(dataloader_iter)
  File "C:\Users\pakale\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 633, in __next__
    data = self._next_data()
  File "C:\Users\pakale\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 677, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "C:\Users\pakale\Anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "C:\Users\pakale\Anaconda3\lib\site-packages\transformers\trainer_utils.py", line 737, in __call__
    return self.data_collator(features)
  File "C:\Users\pakale\Anaconda3\lib\site-packages\transformers\data\data_collator.py", line 586, in __call__
    features = self.tokenizer.pad(
  File "C:\Users\pakale\Anaconda3\lib\site-packages\transformers\tokenization_utils_base.py", line 3303, in pad
    return BatchEncoding(batch_outputs, tensor_type=return_tensors)
  File "C:\Users\pakale\Anaconda3\lib\site-packages\transformers\tokenization_utils_base.py", line 223, in __init__
    self.convert_to_tensors(tensor_type=tensor_type, prepend_batch_axis=prepend_batch_axis)
  File "C:\Users\pakale\Anaconda3\lib\site-packages\transformers\tokenization_utils_base.py", line 764, in convert_to_tensors
    raise ValueError(
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (image_ids in this case) have excessive nesting (inputs type list where type int is expected).

0%| | 0/318150 [00:01<?, ?it/s]

====Input Arguments====
{
"data_root": "data",
"output_dir": "experiments",
"model": "declare-lab/flan-alpaca-base",
"options": [
"A",
"B",
"C",
"D",
"E"
],
"epoch": 50,
"lr": 5e-05,
"bs": 2,
"input_len": 512,
"output_len": 512,
"eval_bs": 4,
"eval_acc": null,
"train_split": "train",
"val_split": "val",
"test_split": "test",
"use_generate": true,
"final_eval": false,
"user_msg": "rationale",
"img_type": "vit",
"eval_le": null,
"test_le": null,
"evaluate_dir": null,
"caption_file": "data/instruct_captions.json",
"use_caption": true,
"prompt_format": "QCM-E",
"seed": 42
}
img_features size: (11208, 577, 768)
number of train problems: 12726

number of val problems: 4241

number of test problems: 4241

[14:38:56] [Model]: Loading declare-lab/flan-alpaca-base... main.py:66

       [Data]: Reading data...                                   main.py:67

experiments/rationale_declare-lab-flan-alpaca-base_vit_QCM-E_lr5e-05_bs0_op512_ep50
model parameters: 251907840
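
For reference, the inner ValueError in the traceback is what PyTorch raises when torch.tensor is handed nested lists whose inner sequences have different lengths. A minimal, self-contained sketch reproduces the same message; the shapes 577, 145, and 768 are taken from the log above, and the variable names are made up:

```python
import torch

# Hypothetical image features with mismatched row counts (577 vs. 145 rows of 768-dim vectors).
a = [[0.0] * 768 for _ in range(577)]
b = [[0.0] * 768 for _ in range(145)]

try:
    # Mirrors `tensor = as_tensor(value)` in the first traceback:
    # ragged nested lists cannot be turned into a single tensor.
    torch.tensor([a, b])
except ValueError as err:
    print(err)  # expected sequence of length 577 at dim 1 (got 145)
```

So the crash happens in the data collator when two examples in the same batch carry image_ids of different lengths, not in the tokenizer settings themselves.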

@cooelf
Contributor

cooelf commented May 19, 2024

This looks like a padding mismatch issue in the T5 tokenizer. I am not sure whether it is caused by an update of the tokenizer library. Could you check with the latest code?
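
In case it helps with debugging in the meantime, here is a minimal sketch of one way to sidestep the mismatch, assuming the vision features really do arrive with different row counts (577 vs. 145) within a batch. This is not the repository's official fix; the helper name and constants are illustrative, with N_ROWS and D_MODEL taken from the reported feature size (11208, 577, 768):

```python
import numpy as np
import torch

N_ROWS, D_MODEL = 577, 768  # assumed from "img_features size: (11208, 577, 768)"

def pad_image_features(feats: np.ndarray) -> torch.Tensor:
    """Pad or truncate a (rows, dims) feature matrix to exactly (N_ROWS, D_MODEL)."""
    padded = np.zeros((N_ROWS, D_MODEL), dtype=np.float32)
    rows = min(feats.shape[0], N_ROWS)
    cols = min(feats.shape[1], D_MODEL)
    padded[:rows, :cols] = feats[:rows, :cols]
    return torch.from_numpy(padded)
```

If the dataset returned its image_ids through a helper like this, every example in a batch would have the same shape and the default collator could stack them; whether (577, 768) is the shape the checkpoint's image_dense/mha_layer layers expect would still need to be checked against the current code.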
