Releases: klarna-incubator/mleko
v4.3.0
v4.2.0
v4.1.0
v4.1.0 (2024-05-18)
✨ Features
- tuning: Add support for enqueuing trials in `OptunaTuner`. (9e0b6b2)
- data splitting: Add support for stratification on multiple features in the `RandomSplitter`. (d745434)
- transformer: Add `metadata` option for the `ExpressionTransformer` that allows for creation of meta features not tracked in the `DataSchema`. (f16ea8b)
- transformer: Add `ExpressionTransformer` for creating features using the `vaex` expression system. (c0faf74)
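For readers unfamiliar with stratifying on several features at once, the idea can be sketched in plain Python: rows are grouped by the tuple of their values across all stratification features, and each group is split separately so the joint distribution is preserved. This is an illustrative sketch only, not the `RandomSplitter` implementation; the function name `stratified_split` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(rows, stratify_on, test_fraction=0.2, seed=42):
    """Split rows into (train, test), stratifying on several features jointly.

    Hypothetical helper for illustration; not part of mleko.
    """
    # Group row indices by the combined value of all stratification features.
    groups = defaultdict(list)
    for idx, row in enumerate(rows):
        key = tuple(row[name] for name in stratify_on)
        groups[key].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    # Split each group on its own so every feature combination keeps
    # roughly the same train/test proportion.
    for indices in groups.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    train = [rows[i] for i in sorted(train_idx)]
    test = [rows[i] for i in sorted(test_idx)]
    return train, test
```

With two stratification features, each combination of values (for example country and label) contributes the same share of rows to the test set.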
v4.0.0
v4.0.0 (2024-05-09)
⛔️ BREAKING CHANGES
- exporter: Add `S3Exporter` that implements cached S3 exporting of files from the local disk. (d17b2d2)
- exporter: Add `BaseExporter` and `LocalExporter` implementations that support exporting data to disk, along with corresponding `Pipeline` steps. (6ce13cf)
✨ Features
- exporter: Add `LocalManifest` support for `LocalExporter` which simplifies caching logic and enables S3 manifest translations. (2199ff0)
- exporter: Add support for multiple data export using `LocalExporter`. (ff988b6)
- data source: Add support for reading manifest files from S3 buckets in `S3Ingester`. (9c68a9b)
- pipeline: Add `disable_cache` parameter to `Pipeline` execution. (da1e31a)
🐛 Bug Fixes
- data cleaning: Fix newline characters breaking CSV reading using Arrow. (3a7e594)
- tuning: Delete logging of storage URI to minimize risk of accidentally logging credentials. (054692d)
🛠️ Code Refactoring
- data source: Extract shared S3 logic to `utils`, which can then be used by `S3Exporter`. (97a7974)
v3.2.0
v3.1.0
v3.0.0
v2.2.0
v2.2.0 (2024-03-22)
✨ Features
- filter: Add `ImblearnResamplingFilter` which is a wrapper for `imblearn` over- and under-samplers. (77a3d7d)
- filter: Add `ExpressionFilter` and base class for simple DataFrame filtering using `vaex` expressions. (dc679ff)
- cache: Add `disable_cache` argument to all cached functions to completely bypass all caching functionality. (fbdfc5d)
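The `disable_cache` behavior described above can be illustrated with a small in-memory decorator: when the flag is set, the call skips both the cache lookup and the cache write. This is a simplified sketch under stated assumptions, not mleko's cache implementation; the `cached` decorator and the `load` function are hypothetical names.

```python
import functools


def cached(func):
    """Memoize `func` by its arguments unless disable_cache=True is passed.

    Hypothetical in-memory stand-in for illustration; not part of mleko.
    """
    store = {}

    @functools.wraps(func)
    def wrapper(*args, disable_cache=False, **kwargs):
        if disable_cache:
            # Bypass the cache entirely: no lookup and no write.
            return func(*args, **kwargs)
        key = (args, tuple(sorted(kwargs.items())))
        if key not in store:
            store[key] = func(*args, **kwargs)
        return store[key]

    return wrapper


calls = []  # records every real execution of `load`


@cached
def load(path):
    calls.append(path)
    return f"data from {path}"
```

Calling `load("a.csv")` twice executes the function once, while `load("a.csv", disable_cache=True)` always executes it again and leaves the stored result untouched.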
📝 Documentation
- Update `CHANGELOG.md` format to include missing categories. (d97b32c)
v2.1.0
v2.0.0
v2.0.0 (2024-02-07)
⛔️ BREAKING CHANGES
- pipeline: Refactor `PipelineStep` to use `TypedDict` for both inputs and outputs. (2eb623c)
🐛 Bug Fixes
- data cleaning: Rename empty column name to `_empty` to prevent `vaex` crashes. (da72b75)
- data cleaning: Cast boolean columns to `int8` during cleaning to reduce label encoding needs. (d94f7c9)
- Added reserved keyword column name replacement to prevent evaluation errors from `vaex`. (3969ffd)
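The column name fixes above can be pictured with a short sketch that renames an empty column name to `_empty` and rewrites Python reserved keywords so they are safe inside expression strings. Only the `_empty` replacement comes from the changelog; the underscore prefix for keywords and the function name `sanitize_column_names` are assumptions made for illustration, not mleko's actual scheme.

```python
import keyword


def sanitize_column_names(names):
    """Return column names that are safe to use in expression strings.

    Hypothetical helper for illustration; not part of mleko.
    """
    cleaned = []
    for name in names:
        if name == "":
            # Replacement value named in the changelog entry.
            name = "_empty"
        elif keyword.iskeyword(name):
            # Assumed scheme: prefix reserved words with an underscore.
            name = f"_{name}"
        cleaned.append(name)
    return cleaned
```

For example, a CSV header containing an empty name and a column called `class` would come out with both names rewritten and all other names unchanged.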
🛠️ Code Refactoring
- Improve error logging messages, and update codebase to new `black` format. (a29ad45)
- cache: Break out cache handler retrieval method. (aba9e41)