
an error occurs while training tdl #2

Open · cschy opened this issue Nov 7, 2022 · 9 comments

cschy commented Nov 7, 2022

```
(eqg) D:\Project\Educational-Question-Generation\tdl>python train.py
Traceback (most recent call last):
  File "train.py", line 15, in <module>
    from transformers import BertTokenizerFast as BertTokenizer, BertModel, AdamW, get_linear_schedule_with_warmup
  File "D:\Anaconda3\envs\eqg\lib\site-packages\transformers\__init__.py", line 21, in <module>
    from .configuration_albert import ALBERT_PRETRAINED_CONFIG_ARCHIVE_MAP, AlbertConfig
  File "D:\Anaconda3\envs\eqg\lib\site-packages\transformers\configuration_albert.py", line 18, in <module>
    from .configuration_utils import PretrainedConfig
  File "D:\Anaconda3\envs\eqg\lib\site-packages\transformers\configuration_utils.py", line 24, in <module>
    from .file_utils import CONFIG_NAME, cached_path, hf_bucket_url, is_remote_url
  File "D:\Anaconda3\envs\eqg\lib\site-packages\transformers\file_utils.py", line 35, in <module>
    logger = logging.get_logger(__name__)  # pylint: disable=invalid-name
AttributeError: module 'transformers.utils.logging' has no attribute 'get_logger'
```

cschy (Author) commented Nov 7, 2022

I found that logging.py is empty. I don't know why, but it works when I use `pip install transformers==3.1.0` instead of `cd transformers & pip install .`

zhaozj89 (Owner) commented Nov 7, 2022

Thanks for catching this. `pip install transformers==3.1.0` may not work because we modified this version of transformers a bit. I have updated the logging.py file. Let me know if it works.
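For reference, the helper missing from the empty transformers/utils/logging.py is roughly the following (a simplified sketch of the 3.1.0 file; the real one also sets up a default handler and verbosity controls):

```python
# Simplified sketch of get_logger from transformers/utils/logging.py (v3.1.0).
import logging
from typing import Optional

def _get_library_name() -> str:
    # resolves to "transformers" when this module lives inside the package
    return __name__.split(".")[0]

def get_logger(name: Optional[str] = None) -> logging.Logger:
    """Return a logger with the given name, defaulting to the library root."""
    if name is None:
        name = _get_library_name()
    return logging.getLogger(name)
```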

cschy (Author) commented Nov 7, 2022

> Thanks for catching this. `pip install transformers==3.1.0` may not work because we modified this version of transformers a bit. I have updated the logging.py file. Let me know if it works.

The files under all the transformers subfolders (i.e. benchmark, commands, data, and utils) are empty too. Do these other empty files affect the training of the model?

zhaozj89 (Owner) commented Nov 7, 2022

> Thanks for catching this. `pip install transformers==3.1.0` may not work because we modified this version of transformers a bit. I have updated the logging.py file. Let me know if it works.
>
> The files under all the transformers subfolders (i.e. benchmark, commands, data, and utils) are empty too.

Sorry for the confusion. It might have been a network problem on my end, and I did not check after pushing. I have updated the transformers folder.

cschy (Author) commented Nov 8, 2022

> Thanks for catching this. `pip install transformers==3.1.0` may not work because we modified this version of transformers a bit. I have updated the logging.py file. Let me know if it works.
>
> The files under all the transformers subfolders (i.e. benchmark, commands, data, and utils) are empty too.
>
> Sorry for the confusion. It might have been a network problem on my end, and I did not check after pushing. I have updated the transformers folder.

Thank you very much! Here's another problem:
```
Traceback (most recent call last):
  File "tdl/train.py", line 334, in <module>
    trainer.fit(model, data_module)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/states.py", line 48, in wrapped_fn
    result = fn(self, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1073, in fit
    results = self.accelerator_backend.train(model)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/accelerators/gpu_backend.py", line 51, in train
    results = self.trainer.run_pretrain_routine(model)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1224, in run_pretrain_routine
    self._run_sanity_check(ref_model, model)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1257, in _run_sanity_check
    eval_results = self._evaluate(model, self.val_dataloaders, max_batches, False)
  File "/opt/conda/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 305, in _evaluate
    for batch_idx, batch in enumerate(dataloader):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/opt/conda/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "tdl/train.py", line 135, in __getitem__
    encoding = self.tokenizer.encode_plus(
  File "/opt/conda/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 2027, in encode_plus
    return self._encode_plus(
  File "/opt/conda/lib/python3.8/site-packages/transformers/tokenization_utils_fast.py", line 440, in _encode_plus
    batched_output = self._batch_encode_plus(
  File "/opt/conda/lib/python3.8/site-packages/transformers/tokenization_utils_fast.py", line 372, in _batch_encode_plus
    encodings = self._tokenizer.encode(
  File "/opt/conda/lib/python3.8/site-packages/tokenizers/implementations/base_tokenizer.py", line 212, in encode
    return self._tokenizer.encode(sequence, pair, is_pretokenized, add_special_tokens)
ValueError: TextInputSequence must be str
```

So what caused this problem?

zhaozj89 (Owner) commented Nov 9, 2022

Sorry for the late reply. Did you use the uploaded transformers or `pip install transformers==3.1.0`? Essentially, the problem is that the tokenizer is not getting the correct input. This may be due to a wrong path/format of the data or a change in the transformers API. As mentioned previously, you need to use the uploaded transformers, as we modified it a bit. If you are using it, would you mind installing it in editable mode (`pip install -e .`) and debugging a bit?
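For example, a minimal probe along these lines would show what `encode_plus` actually receives (hypothetical: the exact way a sample is fetched inside `FairytaleQADataset.__getitem__` may differ):

```python
# Hypothetical probe for FairytaleQADataset.__getitem__ in tdl/train.py,
# placed just before the encode_plus call. Seeing a dict or None here,
# rather than a str, would explain "TextInputSequence must be str".
sample = self.data[index]  # assumption: however the dataset row is fetched
print(type(sample), repr(sample)[:200])
```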

cschy (Author) commented Nov 9, 2022

> Sorry for the late reply. Did you use the uploaded transformers or `pip install transformers==3.1.0`? Essentially, the problem is that the tokenizer is not getting the correct input. This may be due to a wrong path/format of the data or a change in the transformers API. As mentioned previously, you need to use the uploaded transformers, as we modified it a bit. If you are using it, would you mind installing it in editable mode (`pip install -e .`) and debugging a bit?

I did use the uploaded transformers. I am trying to debug it now. Thank you very much for your help!

cschy (Author) commented Nov 10, 2022

I found that this happens because the first argument passed to `self.tokenizer.encode_plus` (in tdl/train.py, class `FairytaleQADataset`, method `__getitem__`, at line 135) is a dict while a str is needed, so I changed `section` to `section['section']`, which is the same problem I mentioned in the email. Is it because your Python version does the conversion implicitly?
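Concretely, the change looks like this (a sketch; the other `encode_plus` arguments are assumed, not copied from the repo):

```python
# In FairytaleQADataset.__getitem__ (tdl/train.py, around line 135).
# `section` is a dict row from the dataset; encode_plus needs a plain str.
encoding = self.tokenizer.encode_plus(
    section["section"],       # was: section -> ValueError: TextInputSequence must be str
    max_length=512,           # assumed hyperparameters, not from the repo
    padding="max_length",
    truncation=True,
    return_tensors="pt",
)
```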
File "train.py", line 250, in validation_step
self.log("val_loss", loss, prog_bar=True, logger=True)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 778, in getattr
raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'FairytaleTDL' object has no attribute 'log'

I will try upgrading pytorch-lightning to 0.10.0, following the solution referred to here.
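In case it helps others: `self.log` was only added to `LightningModule` in pytorch-lightning 0.10.0; older versions report metrics by returning a dict from the step. A sketch of both styles (`shared_step` is a placeholder, not the repo's actual method):

```python
# Sketch of the two reporting styles in pytorch-lightning.
def validation_step(self, batch, batch_idx):
    loss = self.shared_step(batch)   # placeholder forward/loss computation
    # pytorch-lightning >= 0.10.0: self.log exists on LightningModule
    self.log("val_loss", loss, prog_bar=True, logger=True)
    return loss

def validation_step_pre_010(self, batch, batch_idx):
    loss = self.shared_step(batch)   # placeholder forward/loss computation
    # before 0.10.0 there is no self.log; return a metrics dict instead
    return {"val_loss": loss, "progress_bar": {"val_loss": loss}, "log": {"val_loss": loss}}
```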

zhaozj89 (Owner) commented
I used Python 3.6 if I remember correctly. Sorry for the confusion. The code may need some debugging to make it work, but I did not expect it to have so many problems. You are welcome to post new problems, and I am happy to give useful input as much as I can.
