Issues: modelscope/data-juicer
Issues list
Running python tools/process_data.py --config configs/demo/process.yaml raises an error
question
Further information is requested
#602
opened Feb 28, 2025 by
wqdta
Program keeps reporting OOM in Ray mode
question
Further information is requested
#601
opened Feb 28, 2025 by
charonkk
3 tasks done
How to prepare my own data format before using operators such as image_caption_mapper
question
Further information is requested
#600
opened Feb 28, 2025 by
Crazy-JY
3 tasks done
After setting up the environment, running python tools/process_data.py --config configs/demo/process.yaml raises an error
question
Further information is requested
#580
opened Feb 18, 2025 by
ctgushiwei
process_data.py pre-start is too slow (the data processing script takes too long to start)
dj:efficiency
regarding efficiency issues and enhancements
question
Further information is requested
#578
opened Feb 18, 2025 by
hhhhsc701
3 tasks done
Installation process could be optimized. (CMake error during installation)
enhancement
New feature or request
environment
related to third-party dependency, DJ-pypi, DJ-docker, etc.
#576
opened Feb 14, 2025 by
zhenqincn
2 tasks done
When launched in Ray mode, does data spill to disk when memory runs out?
question
Further information is requested
#574
opened Feb 11, 2025 by
javapythonphp
3 tasks done
Support other LLMs & APIs for the OP generate_qa_from_text_mapper
dj:op
issues/PRs about some specific OPs
enhancement
New feature or request
#535
opened Jan 9, 2025 by
yxdyc
2 tasks done
[BUG]: inappropriate arguments for map_batches in ray mode
bug
Something isn't working
dj:dist
issues/PRs about distributed data processing
#533
opened Jan 8, 2025 by
HYLcool
Can the cleaning statistics be viewed after creating the config file and performing the cleaning?
question
Further information is requested
#499
opened Nov 27, 2024 by
Tendo33
3 tasks done
Guidance on Monitoring Task Execution with Ray Executor in Data Juicer
dj:dist
issues/PRs about distributed data processing
question
Further information is requested
#496
opened Nov 24, 2024 by
Fatima-0SA
3 tasks done
Update of Jupyter Notebooks
bug
Something isn't working
documentation
Improvements or additions to documentation
#476
opened Nov 6, 2024 by
HYLcool
[Bug]: perplexity_filter operator runs out of memory (OOM)
bug
Something isn't working
#474
opened Nov 5, 2024 by
weiaicunzai
3 tasks done
[Feat]: Unified LLM Calling Management
enhancement
New feature or request
#451
opened Oct 16, 2024 by
drcege
2 tasks done
[Feat]: Automatic Version Matching During Installation
enhancement
New feature or request
#450
opened Oct 16, 2024 by
drcege
2 tasks done
[Feat]: Enhance Unit Test Coverage for Python and CUDA Compatibility
enhancement
New feature or request
#449
opened Oct 16, 2024 by
drcege
2 tasks done
Require fps filter and mapper for videos
dj:op
issues/PRs about some specific OPs
enhancement
New feature or request
#433
opened Sep 23, 2024 by
BeachWang
[Feat] Support explicit FusedOP that allows for the configuration and application of multiple operators in smaller, manageable batches
dj:op
issues/PRs about some specific OPs
enhancement
New feature or request
#413
opened Sep 2, 2024 by
yxdyc
2 tasks done
Guidance for OP with multiple data fields to be processed
enhancement
New feature or request
#411
opened Sep 2, 2024 by
yxdyc
2 tasks done
[Feat]: Add Ray actor support
dj:dist
issues/PRs about distributed data processing
enhancement
New feature or request
stale-issue
#371
opened Jul 29, 2024 by
drcege
support panda's student captioner model in our captioning mapper
dj:multimodal
issues/PRs about multimodal data processing
dj:op
issues/PRs about some specific OPs
enhancement
New feature or request
stale-issue
#251
opened Mar 14, 2024 by
yxdyc