Issues: IBM/data-prep-kit
Issues list
[Feature] Update RAG examples for release 0.2.1
enhancement
New feature or request
#636
opened Sep 27, 2024 by
sujee
2 tasks done
Create the Ray version of the html2parquet transform
current-priority
enhancement
New feature or request
#635
opened Sep 27, 2024 by
shahrokhDaijavad
1 of 2 tasks
[Feature] Extend DPK capabilities to cover the full life cycle for data acquisition and data processing
enhancement
New feature or request
#633
opened Sep 27, 2024 by
touma-I
2 tasks done
[Feature] separate examples into another repo
enhancement
New feature or request
#628
opened Sep 25, 2024 by
sujee
1 of 2 tasks
[Bug] Fail to build transform spark image when using a release tag x.y.z (without any .devN suffix)
bug
Something isn't working
#625
opened Sep 25, 2024 by
touma-I
1 of 2 tasks
[Feature] Add nodes affinity / Toleration to KFP/Ray nodes
enhancement
New feature or request
#620
opened Sep 24, 2024 by
roytman
2 tasks done
Support for Python 3.12
current-priority
enhancement
New feature or request
#618
opened Sep 24, 2024 by
shahrokhDaijavad
1 of 2 tasks
[Feature] Capability to specify the paths where multiple output tables will be saved
enhancement
New feature or request
#609
opened Sep 21, 2024 by
cmadam
2 tasks done
[Feature] Capability to distribute a large binary object (e.g. a table) to all the transform instances during initialization
enhancement
New feature or request
#608
opened Sep 21, 2024 by
cmadam
2 tasks done
[Bug] header_cleanser intermittently failing ci/cd when building python venv
bug
Something isn't working
#607
opened Sep 20, 2024 by
daw3rd
1 of 2 tasks
[Feature] Base spark image build is very slow and impacting ci/cd
enhancement
New feature or request
#606
opened Sep 20, 2024 by
daw3rd
1 of 2 tasks
[Bug] pdf2parquet must calculate hash and size on the file
bug
Something isn't working
#605
opened Sep 20, 2024 by
sujee
1 of 2 tasks
[Feature] Enable pure python transforms in new spark runtime.
enhancement
New feature or request
#586
opened Sep 12, 2024 by
daw3rd
1 of 17 tasks
[Bug] Testing RAG notebook with latest release of pdf2Parquet, eDedup and DocID
bug
Something isn't working
#583
opened Sep 10, 2024 by
touma-I
1 of 2 tasks
[Bug] issues running ray transformations on Google Colab
bug
Something isn't working
#582
opened Sep 10, 2024 by
sujee
1 of 2 tasks
[Feature] Need better documentation of fuzzy dedupe
enhancement
New feature or request
#578
opened Sep 6, 2024 by
sujee
2 tasks done
[Feature] need an example of using doc_quality plugin with installed pypi packages
enhancement
New feature or request
#575
opened Sep 6, 2024 by
sujee
1 of 2 tasks
[Bug] Intermittent doc_id test-src failures in ci/cd.
bug
Something isn't working
#574
opened Sep 5, 2024 by
daw3rd
2 tasks done
[Bug] improve performance of pdf2parquet
enhancement
New feature or request
#573
opened Sep 5, 2024 by
sujee
1 of 2 tasks
[Bug] test/publish-image targets are disabled for pii_redactor/ray due to OSError
bug
Something isn't working
#571
opened Sep 4, 2024 by
daw3rd
1 of 2 tasks
[Feature] Remove or merge older examples from examples/notebooks/archive
enhancement
New feature or request
#568
opened Sep 4, 2024 by
daw3rd
2 tasks done
[Feature] Allow selected columns to be ignored in non-launcher tests of transforms that generate parquet files.
enhancement
New feature or request
#564
opened Sep 3, 2024 by
daw3rd
2 tasks done
[Feature] HTML to Markdown (based on HTML2Parquet trafilatura code)
enhancement
New feature or request
#559
opened Aug 30, 2024 by
touma-I
2 tasks done
[Bug] header_cleanser fails in running in openshift
bug
Something isn't working
#557
opened Aug 30, 2024 by
dtsuzuku-ibm
1 of 2 tasks
[Feature] Publish data-prep-kit core and transforms NIGHTLY into pypi
enhancement
New feature or request
#554
opened Aug 29, 2024 by
sujee
1 of 2 tasks