[native_datafusion] [Spark SQL Tests] Plan structure differences cause test failures #3315

@andygrove

Summary

Several Spark SQL tests across three suites fail because native_datafusion produces different plan nodes and partitioning information than the tests expect.

Failing Tests

  • ParquetV2Suite, "Fallback Parquet V2 to V1": expects FileSourceScanExec or CometScanExec in the plan, but native_datafusion uses CometNativeScan
  • BroadcastJoinSuite, "broadcast join where streamed side's output partitioning is HashPartitioning" (x2): the plan reports UnknownPartitioning(8) where the test expects PartitioningCollection
  • FileStreamSinkSuite, "self-union, DSv1, read via DataStreamReader API" / "self-union, DSv1, read via table API": the streaming query test expects a specific plan structure

Root Cause

native_datafusion uses CometNativeScan instead of CometScanExec/FileSourceScanExec, and it reports UnknownPartitioning instead of preserving the original partitioning information. Tests that assert on specific plan node classes or on the reported output partitioning therefore fail.
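
To illustrate why a plan-inspecting test breaks, here is a minimal, self-contained sketch. It does not use Spark or Comet: `PlanNode`, `collect`, and `has_expected_scan` are hypothetical stand-ins for a physical plan tree and a test helper, and only the node names come from this issue. Widening the accepted set at the end is shown purely as an illustration, not as the fix chosen for these tests.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PlanNode:
    """Hypothetical stand-in for a physical plan node (not the real Spark API)."""
    node_name: str
    children: List["PlanNode"] = field(default_factory=list)

    def collect(self, predicate: Callable[["PlanNode"], bool]) -> List["PlanNode"]:
        """Return every node in this subtree (pre-order) that satisfies predicate."""
        found = [self] if predicate(self) else []
        for child in self.children:
            found.extend(child.collect(predicate))
        return found


# Scan node names the test accepts, as described in this issue.
EXPECTED_SCANS = {"FileSourceScanExec", "CometScanExec"}


def has_expected_scan(plan: PlanNode, accepted=EXPECTED_SCANS) -> bool:
    """Mimic a test assertion: does the plan contain any accepted scan node?"""
    return bool(plan.collect(lambda node: node.node_name in accepted))


# Toy plans: one with the scan node the test expects, one with the native scan.
v1_plan = PlanNode("ProjectExec", [PlanNode("CometScanExec")])
native_plan = PlanNode("ProjectExec", [PlanNode("CometNativeScan")])

print(has_expected_scan(v1_plan))      # True
print(has_expected_scan(native_plan))  # False: the node name is not in the accepted set

# Illustration only: accepting the native scan node makes the check pass.
print(has_expected_scan(native_plan, EXPECTED_SCANS | {"CometNativeScan"}))  # True
```

The same pattern applies to the partitioning checks: a test that asserts on the exact partitioning type fails as soon as the scan reports a different one, even when query results are unchanged.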

Related

Discovered in CI for #3307 (enable native_datafusion in auto scan mode).
