[Feature] Row Type Foundation #1370

Open
wants to merge 1 commit into
base: main

XuQianJin-Stars
Contributor

This PR implements the foundation layer for complex data type support: #816

Core Abstractions:

  • DataGetters: Abstract interface for data access operations
  • DataSetters: Abstract interface for data setting operations
  • Enhanced InternalRow interface with DataGetters support

Memory Management:

  • AbstractPagedInputView: Abstract base for paged input views
  • ManagedPagedOutputView: Managed paged output view implementation
  • MemorySegmentInputView: Memory segment input view
  • MemorySegmentOutputView: Memory segment output view
  • Enhanced InputView/OutputView interfaces

Serialization Framework:

  • Serializer: Core serialization interface
  • DataInputDeserializer: Data input deserialization
  • DataOutputSerializer: Data output serialization
  • PagedTypeSerializer: Paged type serialization
  • SerializerSingleton: Serializer singleton management

Base Serializers:

  • BooleanSerializer, ByteSerializer, ShortSerializer
  • IntSerializer, LongSerializer, FloatSerializer, DoubleSerializer
  • BinarySerializer, BinaryStringSerializer
  • DecimalSerializer, TimestampLtzSerializer, TimestampNtzSerializer
  • NullableSerializer: Nullable value handling
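The NullableSerializer entry above is described only as "nullable value handling". A minimal sketch of one common way to do this is to write a one-byte null marker and delegate to an inner serializer only for non-null values. Everything here (`NullableSketch`, `SimpleSerializer`, `nullable`, `toBytes`, `fromBytes`) is a hypothetical name for illustration and is not taken from this PR; the PR's actual wire format may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration; not the PR's NullableSerializer.
public class NullableSketch {

    // Assumed minimal serializer contract, much smaller than a real one.
    public interface SimpleSerializer<T> {
        void serialize(T value, DataOutput out) throws IOException;
        T deserialize(DataInput in) throws IOException;
    }

    // A base serializer that cannot handle null on its own.
    public static final SimpleSerializer<Integer> INT = new SimpleSerializer<Integer>() {
        @Override
        public void serialize(Integer value, DataOutput out) throws IOException {
            out.writeInt(value);
        }

        @Override
        public Integer deserialize(DataInput in) throws IOException {
            return in.readInt();
        }
    };

    // Wraps any serializer: a leading boolean marks null, then the payload follows.
    public static <T> SimpleSerializer<T> nullable(final SimpleSerializer<T> inner) {
        return new SimpleSerializer<T>() {
            @Override
            public void serialize(T value, DataOutput out) throws IOException {
                out.writeBoolean(value == null);
                if (value != null) {
                    inner.serialize(value, out);
                }
            }

            @Override
            public T deserialize(DataInput in) throws IOException {
                return in.readBoolean() ? null : inner.deserialize(in);
            }
        };
    }

    // Helper: serialize a value into a fresh byte array.
    public static <T> byte[] toBytes(SimpleSerializer<T> serializer, T value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            serializer.serialize(value, new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper: deserialize a value from a byte array.
    public static <T> T fromBytes(SimpleSerializer<T> serializer, byte[] bytes) {
        try {
            return serializer.deserialize(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch a null costs one byte and a non-null int costs five (marker plus four payload bytes). Keeping null handling in a wrapper lets the base serializers stay free of null checks.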

Utility Classes:

  • VarLengthIntUtils: Variable length integer encoding/decoding
  • InstantiationUtil: Object instantiation utilities
  • InternalRowUtils: Internal row utilities
  • ArrayUtils: Array manipulation utilities
  • Pair: Generic pair implementation
  • MurmurHashUtils: Murmur hash implementation
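For readers unfamiliar with the variable length integer encoding mentioned for VarLengthIntUtils, here is a minimal sketch of one widely used scheme: 7 data bits per byte, with the high bit set when more bytes follow. This is an assumption about the general technique, not a description of the PR's implementation. The class and method names (`VarIntSketch`, `encode`, `decode`) are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration; not the PR's VarLengthIntUtils.
public class VarIntSketch {

    // Encodes an int as 1 to 5 bytes, least significant 7-bit group first.
    // The high bit of each byte signals that another byte follows.
    public static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7; // unsigned shift so negative values terminate
        }
        out.write(value);
        return out.toByteArray();
    }

    // Decodes a value produced by encode().
    public static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            if (shift > 28) {
                throw new IllegalArgumentException("varint is longer than 5 bytes");
            }
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result; // high bit clear: this was the last byte
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated varint");
    }
}
```

With this scheme, values below 128 take one byte and 300 takes two, while negative values always take five bytes because of the unsigned shift.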

Type System Extensions:

  • DataTypeChecks: Data type validation utilities
  • DataTypeDefaultVisitor: Default visitor implementation
  • DataTypeVisitor: Visitor pattern for data types
  • VarBinaryType, VarCharType, MultisetType: New type definitions
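The relationship between DataTypeVisitor and DataTypeDefaultVisitor listed above can be pictured with a small sketch of the visitor pattern, in which a default visitor routes every `visit` method to a single fallback. All types and methods here (`TypeVisitorSketch`, `Type`, `IntT`, `StringT`, `ArrayT`, `DefaultVisitor`, `describe`, `isFixedLength`) are hypothetical stand-ins and do not reflect the PR's actual class signatures.

```java
// Hypothetical illustration; not the PR's DataTypeVisitor API.
public class TypeVisitorSketch {

    public interface TypeVisitor<R> {
        R visitInt(IntT type);
        R visitString(StringT type);
        R visitArray(ArrayT type);
    }

    public interface Type {
        <R> R accept(TypeVisitor<R> visitor);
    }

    public static final class IntT implements Type {
        @Override
        public <R> R accept(TypeVisitor<R> visitor) {
            return visitor.visitInt(this);
        }
    }

    public static final class StringT implements Type {
        @Override
        public <R> R accept(TypeVisitor<R> visitor) {
            return visitor.visitString(this);
        }
    }

    public static final class ArrayT implements Type {
        public final Type element;

        public ArrayT(Type element) {
            this.element = element;
        }

        @Override
        public <R> R accept(TypeVisitor<R> visitor) {
            return visitor.visitArray(this);
        }
    }

    // Default visitor: every visit method falls back to defaultMethod,
    // so subclasses override only the cases they care about.
    public abstract static class DefaultVisitor<R> implements TypeVisitor<R> {
        protected abstract R defaultMethod(Type type);

        @Override
        public R visitInt(IntT type) {
            return defaultMethod(type);
        }

        @Override
        public R visitString(StringT type) {
            return defaultMethod(type);
        }

        @Override
        public R visitArray(ArrayT type) {
            return defaultMethod(type);
        }
    }

    // Full visitor: handles every type and recurses into nested element types.
    public static String describe(Type root) {
        return root.accept(new TypeVisitor<String>() {
            @Override
            public String visitInt(IntT type) {
                return "INT";
            }

            @Override
            public String visitString(StringT type) {
                return "STRING";
            }

            @Override
            public String visitArray(ArrayT type) {
                return "ARRAY<" + type.element.accept(this) + ">";
            }
        });
    }

    // Default-visitor usage: only the INT case is overridden.
    public static boolean isFixedLength(Type root) {
        return root.accept(new DefaultVisitor<Boolean>() {
            @Override
            protected Boolean defaultMethod(Type type) {
                return false;
            }

            @Override
            public Boolean visitInt(IntT type) {
                return true;
            }
        });
    }
}
```

The default visitor keeps checks such as `isFixedLength` short: adding a new type to the hierarchy requires one new fallback method in the default class instead of touching every visitor.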

Test Coverage:

  • Comprehensive test suite for all serializers
  • SerializerTestBase: Base test framework
  • SerializerTestInstance: Test instance management
  • Individual test classes for each serializer

This foundation provides the necessary abstractions and infrastructure for supporting Array, Map, and Row data types in subsequent PRs.

Purpose

Linked issue: close #1369

Brief change log

Tests

API and Format

Documentation

XuQianJin-Stars force-pushed the row-type-foundation branch 5 times, most recently from 3d1184e to de3209e on July 21, 2025 00:36
Development

Successfully merging this pull request may close these issues.

[Feature] Support for Row Type Foundation