Releases: iterative/datachain
0.3.9
What's Changed
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #354
- increase timeout of datachain tests in CI (Windows) by @mattseddon in #363
- remove LaionMeta model store registration from wds example by @mattseddon in #364
- slight positioning change to deny AI abstractions by @volkfox in #356
- unstructured example - remove misleading install instructions by @mattseddon in #366
- improve datachain subtract by @EdwardLi-coder in #352
- Fixing get_file_signals for custom types by @dtulga in #371
Full Changelog: 0.3.8...0.3.9
0.3.8
What's Changed
- remove blip2 image desc example by @mattseddon in #338
- Convert 'Union[str, Literal[...]]' type to string by @dreadatour in #345
- increase timeout of datachain tests in CI by @mattseddon in #347
- reduce down to a single claude example by @mattseddon in #346
- Add ability to set row size for flushing udf results to database by @dberenbaum in #342
- Revert float64 tests from #13 by @dreadatour in #341
- Fix empty 'save()' as query last statement by @dreadatour in #357
- Remove Catalog.merge_datasets() by @EdwardLi-coder in #350
- Adding Custom Type (De)Serialization to Signal Schema by @dtulga in #348
- DataChain.from_hf by @dberenbaum in #311
- Add device parameter to convert functions and update usage model de… by @ayasyrev in #351
New Contributors
Full Changelog: 0.3.7...0.3.8
0.3.7
What's Changed
- Convert custom columns types in dataset_select_paginated by @dreadatour in #339
Full Changelog: 0.3.6...0.3.7
0.3.6
What's Changed
- add retry locks to SQLiteDatabaseEngine execute_str by @mattseddon in #333
- Mutate cannot modify existing column by @EdwardLi-coder in #306
- Mutate can rename columns by @srini047 in #312
- Handle carriage return to support progress bar in logs by @amritghimire in #326
New Contributors
Full Changelog: 0.3.5...0.3.6
0.3.5
What's Changed
- Fix: Standardize union behavior between db implementations by @mattseddon in #304
- Adding schema param to from_records by @ilongin in #248
- Fix: Support all column types in SignalSchema.from_column_types by @dreadatour in #319
- Fix: use default delimiter to flatten columns by @shcheklein in #330
Deps
- Bump pdfplumber from 0.11.3 to 0.11.4 by @dependabot in #323
- remove nltk pin by @mattseddon in #332
- move msgpack to core dependencies by @mattseddon in #335
Misc
- Use free GitHub Actions workers whenever possible by @0x2b3bfa0 in #276
- fix test_union_different_column_order by @mattseddon in #324
New Contributors
- @0x2b3bfa0 made their first contribution in #276
Full Changelog: 0.3.4...0.3.5
0.3.4
0.3.3
What's Changed
- Optimize table copy and save step by @dreadatour in #278
- add benchmark for running an actual DataChain query by @skshetry in #188
- Adding In-Memory DataChain Option by @dtulga in #283
- remove erroneous skip_if_not_sqlite calls by @mattseddon in #302
- Added generator function to create dataset out of bucket listing by @ilongin in #260
- Move fashion_product_images tutorial to datachain-examples by @mnrozhkov in #307
- Split Studio tests in CI by @dreadatour in #308
- Bump mypy from 1.10.1 to 1.11.1 by @dependabot in #239
- have file_stem accept a full path by @mattseddon in #284
Full Changelog: 0.3.2...0.3.3
0.3.2
What's Changed
- add example script smoke tests by @mattseddon in #199
- test huggingface pipeline example by @mattseddon in #264
- use D replace "DataChain" by @EdwardLi-coder in #235
- fix(to_pandas): handle empty datachain in to_pandas (and show) by @shcheklein in #241
- feat(column): add regexp match by @shcheklein in #224
- handle nan and inf float values by @dberenbaum in #249
- added Colab links to Getting Started by @volkfox in #275
- update readme about Mistral by @EdwardLi-coder in #270
- readme update 2 by @dmpetrov in #267
- fix unstructured-text example by @mattseddon in #277
- Bump pdfplumber from 0.11.1 to 0.11.3 by @dependabot in #282
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #279
- fixing an issue with JSON fields named as Python reserved words and updating README by @volkfox in #287
- readme - json-pairs by @dmpetrov in #288
- remove notebooks that have been moved to datachain-examples by @mattseddon in #295
- chore: Deserialize only file signals in get_file_signals by @amritghimire in #305
- Implement database default values by @dreadatour in #296
New Contributors
- @EdwardLi-coder made their first contribution in #235
- @dependabot made their first contribution in #282
Full Changelog: 0.3.1...0.3.2
0.3.1
What's Changed
- Fix typo in filter method docstrings by @mnrozhkov in #250
- Skip if not SQLite Improvements by @dtulga in #254
- Autodetect Studio branch by @dreadatour in #253
- Autodetect Studio branch fix for 'main' branch by @dreadatour in #257
- Removing metastore argument from Client.parse_url() by @ilongin in #256
- Autodetect Studio branch fix for 'main' branch by @dreadatour in #258
- Parallel UDF optimizations by @dreadatour in #211
Full Changelog: 0.3.0...0.3.1