Dataset download problem during training: should data_dir be the path of the downloaded ie_instruction? #27

Open
vv521 opened this issue Sep 7, 2023 · 2 comments

Comments

@vv521

vv521 commented Sep 7, 2023

Traceback (most recent call last):
File "src/run_uie.py", line 560, in <module>
main()
File "src/run_uie.py", line 296, in main
raw_datasets = load_dataset(
File "/root/miniconda3/envs/instruct-uie/lib/python3.8/site-packages/datasets/load.py", line 1694, in load_dataset
builder_instance.download_and_prepare(
File "/root/miniconda3/envs/instruct-uie/lib/python3.8/site-packages/datasets/builder.py", line 595, in download_and_prepare
self._download_and_prepare(
File "/root/miniconda3/envs/instruct-uie/lib/python3.8/site-packages/datasets/builder.py", line 683, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "/root/miniconda3/envs/instruct-uie/lib/python3.8/site-packages/datasets/builder.py", line 1075, in _prepare_split
for key, record in utils.tqdm(
File "/root/miniconda3/envs/instruct-uie/lib/python3.8/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/root/.cache/huggingface/modules/datasets_modules/datasets/uie_dataset/f3e8d02f5ffb4e66435bbe181a28a4403a3ee701bbd8a110780c5e90beb86581/uie_dataset.py", line 661, in _generate_examples
assert os.path.exists(ds_path)
AssertionError
[2023-09-06 10:52:42,767] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 1069
[2023-09-06 10:52:42,767] [ERROR] [launch.py:324:sigkill_handler] ['/root/miniconda3/envs/instruct-uie/bin/python', '-u', 'src/run_uie.py', '--local_rank=0', '--do_train', '--do_predict', '--predict_with_generate', '--model_name_or_path', '/root/InstructUIE-master/model_cache/t5-base', '--data_dir', '/root/InstructUIE-master/data/IE_INSTRUCTIONS/RE/ADE_corpus/train.json', '--task_config_dir', '/root/InstructUIE-master/configs/multi_task_configs', '--instruction_file', '/root/InstructUIE-master/configs/instruction_config.json', '--instruction_strategy', 'single', '--output_dir', 'output/t5-re-single', '--input_record_file', 'flan-t5.record', '--per_device_train_batch_size', '8', '--per_device_eval_batch_size', '16', '--gradient_accumulation_steps', '8', '--learning_rate', '5e-03', '--num_train_epochs', '5', '--deepspeed', 'configs/ds_configs/stage0.config', '--run_name', 't5-base-mult-mi-experiment', '--max_source_length', '512', '--max_target_length', '50', '--generation_max_length', '50', '--max_num_instances_per_task', '10000', '--max_num_instances_per_eval_task', '200', '--add_task_name', 'False', '--add_dataset_name', 'False', '--num_examples', '0', '--overwrite_output_dir', '--overwrite_cache', '--lr_scheduler_type', 'constant', '--warmup_steps', '0', '--logging_strategy', 'steps', '--logging_steps', '500', '--evaluation_strategy', 'no', '--save_strategy', 'steps', '--save_steps', '2000'] exits with return code = 1

vv521 changed the title from "Dataset download problem during training, data_dir 10x speed" to "Dataset download problem during training: should data_dir be the path of the downloaded ie_instruction?" Sep 7, 2023
@BeyonderXX
Owner

Our datasets are all downloaded in advance. This step only preprocesses them into the cache, so nothing is being downloaded here.
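
Since the loader only reads local files, the AssertionError on `os.path.exists(ds_path)` suggests that a path built from `--data_dir` does not exist on disk. Below is a minimal pre-flight check, not the project's own code. It assumes the loader expects a layout of `<data_dir>/<task>/<dataset>/<split>.json`, which this thread does not confirm. The function name `find_missing_files`, the `datasets` mapping, and the default split names are hypothetical.

```python
import os


def find_missing_files(data_dir, datasets, splits=("train", "dev", "test")):
    """Return the expected dataset files that are not present under data_dir.

    ASSUMPTION: files live at <data_dir>/<task>/<dataset>/<split>.json.
    `datasets` maps a task name (e.g. "RE") to a list of dataset names.
    The default split names are a guess; adjust them to the actual data.
    """
    missing = []
    for task, names in datasets.items():
        for name in names:
            for split in splits:
                path = os.path.join(data_dir, task, name, f"{split}.json")
                if not os.path.exists(path):
                    missing.append(path)
    return missing


if __name__ == "__main__":
    # Point data_dir at the IE_INSTRUCTIONS directory, not at a single JSON file.
    for path in find_missing_files(
        "/root/InstructUIE-master/data/IE_INSTRUCTIONS",
        {"RE": ["ADE_corpus"]},
    ):
        print("missing:", path)
```

Under that assumption, passing a file such as `.../RE/ADE_corpus/train.json` as `data_dir` makes every constructed path invalid, which would be consistent with the assertion failing in the log above.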

@vv521
Author

vv521 commented Sep 7, 2023

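Thank you for the clarification. I have one more question: during training, if I keep only the RE and NER datasets in IE_INSTRUCTION, the run fails, but with all of the datasets in place it runs fine. Why is that? I look forward to your reply.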
