error in running train.py #3

Open
cshanjiewu opened this issue Sep 10, 2020 · 4 comments
@cshanjiewu

When I run 'python train.py' with the default settings and the default feature dataset provided in this repository, I get the following error:
Traceback (most recent call last):
  File "****/avsd/train.py", line 95, in <module>
    dataset = VisDialDataset(args, ['train'])
  File "******/avsd/dataloader.py", line 157, in __init__
    self._process_history(dtype)
  File "********/avsd/dataloader.py", line 296, in _process_history
    = captions[th_id][:max_ques_len + max_ans_len]
RuntimeError: The expanded size of the tensor (44) must match the existing size (40) at non-singleton dimension 0. Target sizes: [44]. Tensor sizes: [40]

@cshanjiewu cshanjiewu changed the title error in dataloader.py error in running train.py Sep 10, 2020
@cshanjiewu
Author

RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/generic/THCStorage.c:36
/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/THCTensorIndex.cu:325: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [7,0,0], thread: [64,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/THC/THCTensorIndex.cu:325: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [7,0,0], thread: [65,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
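The assertion `srcIndex < srcSelectDimSize` is raised by an index-select kernel, so one plausible reading is that some index (for example a token id) is not smaller than the size of the table it is looked up in. The thread does not confirm this cause, so the following is only a sketch of a sanity check that could be run on the CPU before the batch reaches the GPU. The helper name `out_of_range_positions`, the vocabulary size and the token ids are hypothetical and do not come from the repository.

```python
def out_of_range_positions(ids, table_size):
    """Return (position, id) pairs for ids outside the range [0, table_size).

    Hypothetical debugging helper; `ids` is a flat sequence of ints,
    e.g. a tensor of token ids flattened and converted with .tolist().
    """
    return [(pos, i) for pos, i in enumerate(ids) if not 0 <= i < table_size]


vocab_size = 10                 # assumed size of the lookup table
token_ids = [3, 9, 10, 0, -1]   # made-up ids; 10 and -1 are invalid

print(out_of_range_positions(token_ids, vocab_size))  # [(2, 10), (4, -1)]
```

An empty result for every batch would rule this explanation out. Running the same step on the CPU instead of the GPU also tends to produce a more readable Python-level error than a device-side assert.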

@Jumperkables

Hey, I'm not the author of this repository, but I had the same problem. Until there is an update, you can work around it by capping the slice length of the history tensor with min:

history[th_id][round_id][:min(40, max_ques_len + max_ans_len)] = captions....
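To illustrate why the cap helps: the first error reports a 44-element target slice being assigned from a 40-element caption, and the two sides of a slice assignment must have the same length. Below is a minimal sketch under stated assumptions. `copy_clamped` is a hypothetical helper, not a function from the repository, and plain lists stand in for the tensors. It only relies on `len()` and slice assignment, which 1-D PyTorch tensors also support.

```python
def copy_clamped(dst, src, max_len):
    """Copy at most max_len leading items of src into dst, in place.

    Hypothetical helper. The copy length is capped by both sequences,
    so the source and destination slices always have the same size.
    """
    n = min(max_len, len(dst), len(src))
    dst[:n] = src[:n]
    return n


# Stand-ins for one history slot and one caption. The sizes 44 and 40
# are taken from the error message; the 24/20 split is an assumption.
max_ques_len, max_ans_len = 24, 20
history_slot = [0] * 44
caption = list(range(1, 41))   # 40 tokens

copied = copy_clamped(history_slot, caption, max_ques_len + max_ans_len)
print(copied)               # 40
print(history_slot[38:])    # [39, 40, 0, 0, 0, 0]
```

Unlike a hard-coded 40, taking the minimum over the actual lengths keeps working if the caption length in the feature files changes.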

@Karansheth

Karansheth commented Oct 2, 2020

Traceback (most recent call last):
  File "/content/drive/My Drive/Colab Notebooks/train.py", line 167, in <module>
    enc_out = encoder(batch)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/Colab Notebooks/encoders/lf.py", line 92, in forward
    hist_embed = self.hist_rnn(hist_embed, batch['hist_len'])
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/Colab Notebooks/utils/dynamic_rnn.py", line 34, in forward
    sorted_seq_input, lengths=sorted_len, batch_first=True)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/utils/rnn.py", line 234, in pack_padded_sequence
    lengths = torch.as_tensor(lengths, dtype=torch.int64)
RuntimeError: CUDA error: device-side assert triggered
I get this error while running train.py.

@patrick-tssn

Same problem here. Did you solve it?
