[Whisper MPS] BFloat16 and Float16 incompatibility on newer macOS versions - MPS datatype errors and segfaults #115

@seyeong-han

Description

Environment

  • macOS Version: upgraded from macOS-15.7.1-arm64-arm-64bit to macOS-26.1-arm64-arm-64bit
  • Architecture: Apple Silicon (arm64)
  • Model: Whisper Tiny (openai/whisper-tiny)
  • Backend: Metal (MPS)
  • ExecuTorch: Latest version with Metal backend support

Issue Summary

After upgrading macOS, Whisper models exported with BFloat16 or Float16 fail to run on the MPS backend, while Float32 works correctly. This forces users to fall back to Float32, sacrificing performance and memory efficiency.

Problem 1: BFloat16 - MPS Unsupported Datatype Error

Export command:

optimum-cli export executorch \
    --model "openai/whisper-tiny" \
    --task "automatic-speech-recognition" \
    --recipe "metal" \
    --dtype bfloat16 \
    --output_dir "$ARTIFACT_DIR"

Error:

I 00:00:00.991872 executorch:runner.cpp:221] Encoder output shape: [1, 1500, 384]
I 00:00:00.991883 executorch:runner.cpp:225] Encoder first value: 0.148673
/AppleInternal/Library/BuildRoots/4~B_wcugCOyFEmrl3129h8l5wJX874wFxy1jG_pok/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSNDArray/Kernels/MPSNDArrayScaledDotProductAttention.mm:291: failed assertion `Unsupported datatype'

Analysis:

  • Encoder completes successfully with BFloat16
  • Decoder fails during the scaled dot product attention operation
  • Newer macOS versions appear to enforce stricter MPS datatype requirements in MPSNDArrayScaledDotProductAttention

Problem 2: Float16 - Segmentation Fault

Export command:

optimum-cli export executorch \
    --model "openai/whisper-tiny" \
    --task "automatic-speech-recognition" \
    --recipe "metal" \
    --dtype float16 \
    --output_dir "$ARTIFACT_DIR"

Error:

I 00:00:00.225604 executorch:metal_backend.cpp:213] MetalBackend::init - File closed successfully
./metal/whisper/run.sh: line 48: 99306 Segmentation fault: 11

Root Cause:
The runner code in executorch/extension/asr/runner/runner.cpp (lines 180-196) only handles BFloat16 conversion but lacks Float16 conversion logic:

if (preprocessed_features->scalar_type() != expected_dtype) {
    if (expected_dtype == ::executorch::aten::ScalarType::BFloat16) {
        // Only handles BFloat16 conversion
        auto convert_result = ::executorch::extension::llm::convert_to_bfloat16(
            preprocessed_features);
        // ...
    }
    // Missing: No Float16 conversion logic!
}

Workaround

Currently using Float32 as a workaround:

optimum-cli export executorch \
    --model "openai/whisper-tiny" \
    --task "automatic-speech-recognition" \
    --recipe "metal" \
    --dtype float32 \
    --output_dir "$ARTIFACT_DIR"

cc @manuelcandales
