feat(arrow/compute/exprs): support expr.IntervalYearToMonthLiteral in literalToDatum#825
Open
zeroshade wants to merge 1 commit into
Open
feat(arrow/compute/exprs): support expr.IntervalYearToMonthLiteral in literalToDatum#825zeroshade wants to merge 1 commit into
zeroshade wants to merge 1 commit into
Conversation
… literalToDatum (apacheGH-815) literalToDatum handled interval year/month literals only through the *expr.ProtoLiteral path. The substrait-go library also surfaces these as the dedicated expr.IntervalYearToMonthLiteral type (returned by its intervalYearToMonthLiteralFromProto deserializer), which previously fell through to arrow.ErrNotImplemented. Add type-switch cases for both the value form (the canonical form returned by substrait-go's deserializer) and the pointer form (used in parts of substrait-go itself) so callers reaching either form get the same FixedSizeList<int32>[2] extension scalar produced by the existing ProtoLiteral path. Refactor the ProtoLiteral *types.IntervalYearToMonthType branch to share the new intervalYearToMonthDatum helper so the two paths cannot drift. Add an internal test that asserts the new value and pointer paths produce a Datum equal to the one produced by the existing ProtoLiteral baseline. Closes apache#815.
ee46f86 to
8b9590a
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Rationale for this change
Closes #815.
arrow/compute/exprs.literalToDatumhandled interval year/month literals only through the*expr.ProtoLiteralpath. The substrait-go library also surfaces these as the dedicatedexpr.IntervalYearToMonthLiteraltype — that is the valueintervalYearToMonthLiteralFromProtoreturns when deserializing protobuf — and any caller reachingliteralToDatumwith this type previously fell through toarrow.ErrNotImplemented.What changes are included in this PR?
expr.IntervalYearToMonthLiteral(value form, the canonical form returned by substrait-go's deserializer) and*expr.IntervalYearToMonthLiteral(pointer form, used in parts of substrait-go itself, e.g.expr/string_test.go:86). Both produce the sameFixedSizeList<int32>[2]extension scalar that the existing*expr.ProtoLiteralpath already produces, so callers reaching either form get a Datum that iscompute.Equalsto the one they would have gotten via the protobuf path.intervalYearToMonthDatum(mem, years, months). The existing*types.IntervalYearToMonthTypebranch inside the*expr.ProtoLiteralcase now calls this helper instead of inlining the builder logic, so the two paths cannot drift.mem memory.Allocatorparameter thatliteralToDatumalready receives, instead of hardcodingmemory.DefaultAllocator(which the original inlined branch did). All three call sites forwardmem.Are these changes tested?
Yes. New
TestLiteralToDatumIntervalYearToMonthinarrow/compute/exprs/exec_internal_test.goconstructs a*expr.ProtoLiteralbaseline and a parallelexpr.IntervalYearToMonthLiteral(value) and*expr.IntervalYearToMonthLiteral(pointer), then asserts the resulting Datums arecompute.Equals. Existingparquet/file/...,parquet/pqarrow/..., andarrow/compute/exprs/...test suites all continue to pass; full repogo build ./...is clean.The test deliberately uses
memory.DefaultAllocatorrather thanmemory.NewCheckedAllocatorwithAssertSize(t, 0).*scalar.Extensiondoes not implementRelease()(seearrow/scalar/scalar.go:397-466), so an extension scalar's underlying storage is never released even when the wrapping Datum is — adding a checked-allocator assertion would chase a pre-existing leak that is unrelated to this issue. (Filing that as a separate follow-up issue is straightforward and out of scope here.)Are there any user-facing changes?
Yes, but purely additive:
expr.IntervalYearToMonthLiteral(or pointer to one) toliteralToDatum,ExecuteScalarExpression, or related entry points used to receivearrow.ErrNotImplemented; they now receive a properly-formed extension Datum identical to what the*expr.ProtoLiteralpath produces.*expr.ProtoLiteralpath produces a byte-identical Datum to before).Out of scope (separate follow-up)
*scalar.Extensiondoes not implementRelease(), so extension scalars (including the ones produced by this code path, the existing*types.IntervalYearToMonthTypepath, the*types.IntervalDayTypepath, andintervalDay()/intervalYear()extension types in general) leak their storage throughcompute.ScalarDatum.Release(). Worth a separate issue.