Test Case YAML Specification

Overview

The OpiDiff test framework uses YAML files to define test cases for operators and modules. Each file contains optional global definitions (symbolic dimensions and reusable presets) and a required list of tests. The format is designed to be expressive yet simple: you can describe values, construct modules or call operators, combine them into reusable presets, and generate many tests via templates.

Requirement terms

Fields are marked required or optional. If a field is optional and omitted, a sensible default is used. Unknown keys cause validation errors.

Top‑level structure

A test file is a YAML mapping with the following keys:

include (optional) – names of other YAML files to merge into the current document.
dims (optional) – mapping of symbolic dimension names to integer values.
presets (optional) – mapping of preset names to reusable node definitions.
tests (required) – list of test items.

Only these keys are recognised. Additional keys will cause errors.
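
As a rough sketch, a minimal file that uses all four top-level keys might look as follows. The file name shared_defs.yaml, the names B, D and x, and the choice of aten::relu are illustrative assumptions, not part of the specification.

```yaml
# Optional: merge definitions from another file (hypothetical file name).
include: shared_defs.yaml

# Optional: symbolic dimensions usable in shapes and ranges.
dims:
  B: 2
  D: 8

# Optional: reusable node definitions, referenced with {ref: ...}.
presets:
  x:
    type: tensor
    shape: [B, D]
    dtype: float32
    init: normal

# Required: the list of tests.
tests:
  - id: relu_basic
    op: aten::relu
    in:
      - {ref: x}
```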

Includes

include may be a single file name or a list of file names. The loader reads each file in order and merges their contents into the current document. Merging concatenates tests and combines dims and presets mappings. If two files define the same dimension or preset name, an error is raised. Cyclic includes are detected and rejected.
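
The merge rules above can be illustrated with two small files, shown here as two YAML documents separated by ---. The file names common.yaml and main.yaml and all dimension, preset and test names are hypothetical.

```yaml
# File 1: common.yaml (hypothetical). Only shared definitions, no tests.
dims:
  B: 2
presets:
  x:
    type: tensor
    shape: [B, 4]
    dtype: float32
    init: normal
tests: []
---
# File 2: main.yaml (hypothetical). Pulls in common.yaml.
include:
  - common.yaml
dims:
  D: 8          # a new name; redefining B here would be a duplicate-definition error
tests:
  - id: relu_on_shared_preset
    op: aten::relu
    in:
      - {ref: x}   # preset defined in common.yaml
```

After merging, main.yaml sees the dimensions B and D and the preset x, and its tests list is the concatenation of both files' tests.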

Dimensions

dims defines named integer constants that can be used in shapes and range specifications. Values must be non‑negative integers. Using an undefined symbol in a shape or range is an error.

Presets

presets allows you to define reusable nodes (value specifications). Each preset is a named node specification (see §Nodes) that can be referenced with a ref node. Presets can themselves reference other presets and dimensions. Names must be unique across the merged configuration.
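
A minimal sketch of a preset that builds on another preset and on a dimension. The names B, x and x_pair are illustrative assumptions.

```yaml
dims:
  B: 2

presets:
  # A tensor preset that uses the dimension symbol B.
  x:
    type: tensor
    shape: [B, 16]
    dtype: float32
    init: normal
  # A tuple preset that references the preset x twice.
  x_pair:
    type: tuple
    elems:
      - {ref: x}
      - {ref: x}

tests:
  - id: add_two_presets
    op: aten::add
    in:
      - {ref: x}
      - {ref: x}
```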

Tests

The tests array holds the actual test definitions. Each entry must conform to one of the following schemas:

a test case (see §Test cases), or
a pair test generated by a template comparison (§Template compare pairs).

Empty tests lists are allowed but not useful.

Nodes

A node describes a value used as an input, output or argument. Nodes form a tagged union of the following types; each node is a YAML mapping with a type field that chooses the variant. Unless stated otherwise, fields not listed are forbidden and will trigger validation errors.

Reference

A reference node reuses a preset or template variable.

type: ref
ref: <string | var-node>

ref – a string preset name defined in presets, or a template variable reference written explicitly as a var node, whose resolved value must be a preset name.
- {ref: preset_name} → valid (direct preset reference)
- {ref: {var: var_name}} → valid (template variable selecting a preset name)

Tensor

A tensor node specifies a multi‑dimensional torch tensor to be generated.

type: tensor
shape: [d0, d1, …]
kind: <"float"|"int"|"bool">  # or dtype: <torch dtype string>
init: <method>                # optional
low: <range-low>              # optional
high: <range-high>            # optional
mean: <float>                 # optional, default 0.0
std: <float>                  # optional, default 1.0
p: <float>                    # optional, default 0.5
requires_grad: <bool>         # optional, default false

shape (required) – a non‑empty list of integers or dimension symbols. Negative dimensions are not allowed.
kind/dtype (required) – either kind (float, int, or bool) or a dtype string must be provided.
init – controls sampling; supported methods are normal, uniform, randint, zeros, ones and bernoulli.
low and high – inclusive range bounds for numeric sampling; both must be specified together if either is given.
mean, std – mean and standard deviation for normal sampling.
p – probability of 1 when init is bernoulli.
requires_grad – whether to set .requires_grad on the resulting tensor.

Note: Template variable nodes ({var: ...}) are not permitted inside shape.
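
The following sketch combines several tensor options in one test. The test id and the choice of aten::where (condition, self, other) are assumptions made for illustration.

```yaml
tests:
  - id: where_with_sampled_inputs
    op: aten::where
    in:
      # Boolean mask, each element true with probability 0.25.
      - {type: tensor, shape: [4, 8], kind: bool, init: bernoulli, p: 0.25}
      # Floats sampled uniformly between low and high (both given together).
      - {type: tensor, shape: [4, 8], dtype: float32, init: uniform, low: -1.0, high: 1.0}
      # Normally distributed floats with gradient tracking enabled.
      - {type: tensor, shape: [4, 8], dtype: float32, init: normal, mean: 0.0, std: 2.0, requires_grad: true}
```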

Scalar

A scalar node represents a Python scalar value.

type: scalar
kind: <"float"|"int"|"bool">  # or dtype: <string>
value: <number|bool>               # optional
low: <range-low>                   # optional
high: <range-high>                 # optional
p: <float>                         # optional when kind is bool

value – if provided, uses this literal value.
low and high – both required when sampling; low < high for numeric kinds.
p – probability of true for boolean scalars; must be between 0 and 1.

Export note: during export, scalar values are converted to tensors and normalized to rank-1 shape (1,). This applies to scalar nodes used in both positional inputs and kwargs. Some operators expect a true Python scalar (or rank-0 tensor) and may not accept a rank-1 tensor for scalar parameters; in those cases prefer const (for literal scalars) or scalar_tensor (for tensor-valued scalar parameters).
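
For illustration, the three ways of specifying a scalar (literal, sampled range, boolean probability) could be written as presets like these. The preset names are hypothetical.

```yaml
presets:
  # Literal value: always 0.5.
  half: {type: scalar, kind: float, value: 0.5}
  # Sampled integer; low and high are both required and low < high.
  small_int: {type: scalar, kind: int, low: 1, high: 4}
  # Boolean that is true with probability 0.5.
  coin: {type: scalar, kind: bool, p: 0.5}
```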

Scalar‑tensor

Same as a scalar node but returns a rank‑0 torch tensor.

When to use: use scalar_tensor when a backend requires tensor inputs for scalar-like parameters (common for CoreML). Unlike scalar, which is normalized to a rank-1 tensor of shape (1,) during export, scalar_tensor produces a true rank-0 tensor of shape () that many backends handle more reliably for scalar-like inputs.

type: scalar_tensor
kind/dtype: …
value/init/low/high/mean/std/p: …  # same as for tensor

All sampling parameters mirror those of the tensor node.
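
A short sketch of two scalar_tensor presets, one with a fixed value and one sampled. The preset names and values are assumptions chosen for illustration.

```yaml
presets:
  # Rank-0 float32 tensor holding the literal 0.001.
  eps_t: {type: scalar_tensor, dtype: float32, value: 0.001}
  # Rank-0 float tensor sampled uniformly between 0.5 and 2.0.
  scale_t: {type: scalar_tensor, kind: float, init: uniform, low: 0.5, high: 2.0}
```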

Int list

A static list of integers.

type: int_list
elems: [i0, i1, …]

The list must be non‑empty.
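
As an example, an int_list can supply the dimension order for a permutation. The test id and the use of aten::permute are illustrative assumptions.

```yaml
tests:
  - id: permute_3d
    op: aten::permute
    in:
      - {type: tensor, shape: [2, 3, 4], dtype: float32, init: normal}
      # New dimension order, given as a static list of integers.
      - {type: int_list, elems: [2, 0, 1]}
```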

List

A homogeneous list with a specified length.

type: list
len: <int or symbol>
elem: <node>

len – a non‑negative integer or dimension symbol; defines how many elements to generate.
elem – node describing each element.
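
A sketch of a list node feeding an operator that takes a sequence of tensors. The dimension K, the test id and the choice of aten::cat are assumptions.

```yaml
dims:
  K: 3

tests:
  - id: cat_k_tensors
    op: aten::cat
    in:
      # K tensors, each generated from the same elem specification.
      - type: list
        len: K
        elem: {type: tensor, shape: [2, 4], dtype: float32, init: normal}
    kwargs:
      dim: {type: const, value: 0}
```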

Tuple

A fixed‑length tuple of heterogeneous nodes.

type: tuple
elems: [node0, node1, …]

Used to describe multi‑output operators or modules.
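
For instance, a tuple can describe the two outputs of an operator such as aten::topk (values and indices). The test id and the operator are illustrative choices; as noted under Test cases, the out field is currently not validated.

```yaml
tests:
  - id: topk_values_and_indices
    op: aten::topk
    in:
      - {type: tensor, shape: [4, 10], dtype: float32, init: normal}
      - {type: const, value: 3}   # k
    out:
      type: tuple
      elems:
        - {type: tensor, shape: [4, 3], dtype: float32}   # values
        - {type: tensor, shape: [4, 3], dtype: int64}     # indices
```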

Optional

Represents an optional value.

type: optional
p_none: <float>  # optional, default 0.0
elem: <node>

With probability p_none, the value is null; otherwise the value is generated from elem. p_none must lie in [0,1].
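
A minimal sketch in which an optional bias is omitted half of the time. The test id and the use of aten::linear (input, weight, optional bias) are assumptions made for illustration.

```yaml
tests:
  - id: linear_optional_bias
    op: aten::linear
    in:
      - {type: tensor, shape: [2, 16], dtype: float32, init: normal}   # input
      - {type: tensor, shape: [8, 16], dtype: float32, init: normal}   # weight
      # Bias: null with probability 0.5, otherwise a tensor of shape [8].
      - type: optional
        p_none: 0.5
        elem: {type: tensor, shape: [8], dtype: float32, init: normal}
```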

Constant

A literal value passed unchanged to the operator or module.

type: const
value: <JSON-value>

value can be a number, string, boolean, list, or mapping.
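
As a sketch, a const node can pass a string keyword argument through unchanged. The test id and the choice of aten::gelu with its approximate keyword are illustrative assumptions.

```yaml
tests:
  - id: gelu_tanh
    op: aten::gelu
    in:
      - {type: tensor, shape: [4], dtype: float32, init: normal}
    kwargs:
      # Passed to the operator as the literal string "tanh".
      approximate: {type: const, value: tanh}
```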

Constant tensor

A tensor with a fixed shape whose contents may be provided explicitly.

type: const_tensor
shape: [d0, …]
kind/dtype: …
value: <JSON-value>           # optional
init/low/high/mean/std/p: …   # optional

When value is present, it must match the declared shape. Otherwise the tensor is sampled using the given parameters.
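
The two modes can be sketched as follows: one preset with explicit contents and one that is sampled. The preset names are hypothetical.

```yaml
presets:
  # Explicit contents; the nested list matches the declared shape [2, 2].
  eye2:
    type: const_tensor
    shape: [2, 2]
    dtype: float32
    value: [[1.0, 0.0], [0.0, 1.0]]
  # No value given, so the contents are sampled from the given parameters.
  rand2:
    type: const_tensor
    shape: [2, 2]
    dtype: float32
    init: uniform
    low: 0.0
    high: 1.0
```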

Variable

Template variable reference.

type: var
name: <string>

Used inside template definitions. The name must be a non‑empty string. The examples in this document write var nodes in the compact form {var: <name>}.

Union of nodes

Nodes may be nested arbitrarily: lists of tuples, optionals inside lists, etc. A node can appear wherever a value is expected: in presets, in a test’s in, kwargs or out, in constructor arguments and keyword arguments, or within template definitions.

Module and constructor specifications

When the op field is not a simple operator name, it can be a structured node describing how to construct a module or arbitrary object.

Module

Constructs a module and calls its forward method.

type: module
path: <import-path | file-path>
args: [arg0, …]      # optional
kwargs: {kw0: val0, …}  # optional

path – class to construct. Supported formats:
- Import path (fully qualified name), e.g.:
  - torch.nn.Linear
- File path (load from a Python source file), e.g.:
  - file:examples/toy_wrappers.py::ToyLogitsLN
  The file: form loads the module from the given .py file and looks up the attribute after ::.
args – positional constructor arguments; may contain nodes or literals.
kwargs – keyword constructor arguments.

After construction, the module is invoked with the test’s in and kwargs inputs.

Construct

Constructs an arbitrary object for use as an argument or keyword value.

type: construct
path: <import-path | file-path>
args: [ … ]        # required
kwargs: { … }      # optional

Nested construct nodes can be used to build complex argument structures. The resulting object is not called.
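
A sketch of construct nodes used as constructor arguments of a module. The test id and the choice of torch.nn.Sequential, torch.nn.Linear and torch.nn.ReLU are illustrative assumptions. Note that args is given explicitly for each construct node, since it is marked required above.

```yaml
tests:
  - id: sequential_linear_relu
    op:
      type: module
      path: torch.nn.Sequential
      args:
        # Each construct node builds an object that is passed as an argument;
        # the constructed objects themselves are not called by the loader.
        - {type: construct, path: torch.nn.Linear, args: [4, 8]}
        - {type: construct, path: torch.nn.ReLU, args: []}
    in:
      - {type: tensor, shape: [2, 4], dtype: float32, init: normal}
```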

Test cases

A test case describes how to call an operator or module and optionally how to interpret its output.

tests:
  - id: <string>             # optional
    impl: <string>           # optional
    op: <operator name | module/template_module/template_compare_pair node>   # required
    in: [input0, …]          # required
    kwargs: {kw: node, …}    # optional
    out: <node>              # optional
    device: <device>         # optional, defaults to cpu
    cast_input0_to_complex: <bool>  # optional, default false

id – user‑chosen identifier. If omitted, the loader generates one during template expansion.
impl – names the implementation to use (rarely needed for simple operator tests). For template compare pairs the side definitions specify their own impl.
op – what to run: a string naming an ATen operator, a module node, a template_module or a template_compare_pair.
in – list of positional inputs. Each entry can be any node type.
kwargs – mapping of keyword argument names to nodes. Each value is generated and passed as a keyword argument.
out – node describing the expected output(s). This field is currently not validated.
device – target device (cpu, gpu, cuda or mps). Defaults to cpu.
cast_input0_to_complex – backend-specific flag. When true, the backend reconstructs a complex tensor for the first input from a packed real/imag representation (a real tensor with trailing dimension [..., 2]) before invoking the operator. This is commonly needed for FFT-family operators on backends that transport complex tensors as real/imag pairs. It applies only to input0.

Unknown keys cause validation errors and may produce helpful suggestions.

Complex tensor handling (backend note)

Some backends (notably CoreML) do not reliably support complex-typed tensors as model inputs. To improve compatibility, the framework may transport complex tensors in a packed real/imag representation: a real tensor with an extra trailing dimension of size 2 ([..., 2]) corresponding to real and imaginary parts.

For operators that semantically require complex inputs (e.g., FFT-family ops), the backend may need to reconstruct a complex tensor from this packed form before invoking the operator. This reconstruction is controlled by backend-specific mechanisms such as cast_input0_to_complex (see the Test case field description).
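
A hedged sketch of a test that uses the packed representation. The test id and the operator name aten::fft_fft are assumptions; whether a given backend needs the flag depends on how it transports complex tensors.

```yaml
tests:
  - id: fft_packed_input
    op: aten::fft_fft
    # Ask the backend to rebuild a complex tensor from input0 before the call.
    cast_input0_to_complex: true
    in:
      # Packed real/imag input: 8 complex values stored as a real [8, 2] tensor.
      - {type: tensor, shape: [8, 2], dtype: float32, init: normal}
```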

Examples

Simple operator test

tests:
  - id: add_vectors
    op: aten::add
    in:
      - {type: tensor, shape: [4], dtype: float32, init: normal}
      - {type: tensor, shape: [4], dtype: float32, init: normal}

Module test

presets:
  x:
    type: tensor
    shape: [2, 16]
    dtype: float32
    init: normal

tests:
  - id: linear_forward
    op:
      type: module
      path: torch.nn.Linear
      args: [16, 8]
    in:
      - {ref: x}

Template modules

A template module generates multiple tests by varying constructor parameters.

type: template_module
path: <python.module.ClassName>
vars: { var_name: [value0, value1, …], … }
cases: [ {var_name: value, …}, … ]    # optional
args: [ … ]        # optional
kwargs: { … }      # optional

vars – declares variables and their possible values. The loader forms the Cartesian product of all variables unless cases is provided.
cases – explicit list of variable assignments. When present it overrides the Cartesian product; each mapping must mention only declared variables.
args, kwargs – constructor arguments for the module. Within these you may use var nodes to insert the current variable value.
path – class to construct, as in a module node.

Note (vars + dims): a template variable value may be a string that matches a key in dims. When substituted via {var: ...}, such a value is resolved to the corresponding integer from dims. This is useful for module constructor arguments (e.g., vars: {feature: [N]}) even though tensor shapes cannot use {var: ...}.

In YAML, dimension symbols must be written as bare scalars (e.g. N, D), not quoted (e.g. "N"). Therefore, template variables should be defined using bare symbols:

vars:
  feature: [N]        # valid
vars:
  feature: ["N"]      # invalid

During expansion the loader substitutes each combination of variables into path, args, kwargs, the test id and any ref/var nodes in in and kwargs. The template_module node then becomes a normal module node.

Example

dims:
  B: 2
  N: 4
tests:
  - id: mod_linear_template
    op:
      type: template_module
      path: torch.nn.Linear
      vars:
        in_features: [N]
        out_features: [3, 5, 7]
        bias: [true, false]
      args: [{var: in_features}, {var: out_features}]
      kwargs: {bias: {var: bias}}
    in:
      - {type: tensor, shape: [B, N], dtype: float32, init: normal}
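
In the example above the Cartesian product has 1 × 3 × 2 = 6 assignments. Assuming that a cases list assigning every declared variable is equivalent to the corresponding Cartesian product, the same six tests could be spelled out explicitly as sketched below. The order in which the loader emits the tests is not specified here, and the test id is hypothetical.

```yaml
dims:
  B: 2
  N: 4
tests:
  - id: mod_linear_template_explicit
    op:
      type: template_module
      path: torch.nn.Linear
      vars:
        in_features: [N]
        out_features: [3, 5, 7]
        bias: [true, false]
      # One entry per combination of the declared values.
      cases:
        - {in_features: N, out_features: 3, bias: true}
        - {in_features: N, out_features: 3, bias: false}
        - {in_features: N, out_features: 5, bias: true}
        - {in_features: N, out_features: 5, bias: false}
        - {in_features: N, out_features: 7, bias: true}
        - {in_features: N, out_features: 7, bias: false}
      args: [{var: in_features}, {var: out_features}]
      kwargs: {bias: {var: bias}}
    in:
      - {type: tensor, shape: [B, N], dtype: float32, init: normal}
```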

Example using cases

dims:
  B: 2

presets:
  x_feat4: {type: tensor, shape: [B, 4], dtype: float32, init: normal}
  x_feat8: {type: tensor, shape: [B, 8], dtype: float32, init: normal}

tests:
  - id: mod_linear_cases_match_input_features
    op:
      type: template_module
      path: torch.nn.Linear
      vars:
        in_features: [4, 8]
        input_preset: [x_feat4, x_feat8]
        out_features: [8, 16, 32]
        bias: [true, false]
      cases:
        - {in_features: 4, input_preset: x_feat4}
        - {in_features: 8, input_preset: x_feat8}
      args: [{var: in_features}, {var: out_features}]
      kwargs: {bias: {var: bias}}
    in:
      - {ref: {var: input_preset}}

Template compare pairs

A template compare pair defines two module implementations to run under the same inputs, facilitating side‑by‑side comparisons on the same backend(s).

type: template_compare_pair
vars: { var_name: [value0, …], … }
cases: [ {var_name: value, …}, … ]    # optional
common:
  args: [ … ]        # optional
  kwargs: { … }      # optional
a:
  impl: <string>
  type: module
  path: <python.module.ClassName>
  args: [ … ]        # optional
  kwargs: { … }      # optional
b:
  impl: <string>
  type: module
  path: <python.module.ClassName>
  args: [ … ]        # optional
  kwargs: { … }      # optional

vars and cases work like those in template_module.
common – default constructor arguments applied to both a and b when not overridden.
a, b – compare sides. Each must have an impl (implementation label) and a module definition (with optional args and kwargs).

During expansion the loader produces two tests for each variable assignment: one for side a and one for side b. The id of each test is suffixed with __a or __b and the variable assignment, and the two are grouped into a pair test that the runner uses to compare outputs.

Example

dims:
  N: 4

presets:
  x:
    type: tensor
    shape: [N, N]
    dtype: float32
    init: normal

tests:
  - id: linear_vs_linear
    op:
      type: template_compare_pair
      vars:
        in_features: [N]
        out_features: [8, 16]
      a:
        impl: impl1
        type: module
        path: torch.nn.Linear
        args: [{var: in_features}, {var: out_features}]
      b:
        impl: impl2
        type: module
        path: file:my_model.py::MyLinear
        args: [{var: in_features}, {var: out_features}]
    in:
      - {ref: x}

This generates two pair tests, one for each out_features value. Each pair contains two test cases (a and b) that construct a linear layer from the same constructor arguments but are tagged with different implementations (impl1 for torch.nn.Linear, impl2 for MyLinear). The framework runs both with the same input and compares their outputs.
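
The common block is not used in the example above. A sketch of how it might factor out shared constructor arguments follows, assuming that var nodes are allowed inside common in the same way as inside a and b. The test id and the impl labels reference and candidate are hypothetical.

```yaml
tests:
  - id: linear_pair_with_common
    op:
      type: template_compare_pair
      vars:
        out_features: [8, 16]
      # Constructor arguments applied to both sides unless a side overrides them.
      common:
        args: [4, {var: out_features}]
        kwargs: {bias: false}
      a:
        impl: reference
        type: module
        path: torch.nn.Linear
      b:
        impl: candidate
        type: module
        path: file:my_model.py::MyLinear
    in:
      - {type: tensor, shape: [2, 4], dtype: float32, init: normal}
```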

Range and symbol resolution

Numeric fields such as len, low, high, mean, std, p, p_none and dimension sizes may be given as numbers or strings. If a string matches a key in dims, the corresponding integer value is used. Numeric strings (e.g. "10") are converted to numbers. Unknown symbols produce errors. Range values such as low, high and mean can be negative; len and dimension sizes must still be non‑negative, and p and p_none must lie in [0, 1].

Dimension symbol resolution is performed in tensor shapes, range/list numeric fields (e.g. len, low, high, …), and when inserting template variables via {var: ...}.

Literal strings in arbitrary module/construct args/kwargs are not dim-resolved. To pass a dimension value into a constructor, route it through vars and use {var: ...}.
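
The resolution rules can be sketched in one test: N is resolved in a shape, in a range field, and through a template variable. The dimension name, the test id and the choice of torch.nn.Linear are illustrative assumptions.

```yaml
dims:
  N: 4

tests:
  - id: linear_symbol_resolution
    op:
      type: template_module
      path: torch.nn.Linear
      vars:
        in_features: [N]          # bare symbol, resolved to 4 on substitution
      # Writing args: [N, 8] directly would pass the literal string "N",
      # because plain constructor arguments are not dim-resolved.
      args: [{var: in_features}, 8]
    in:
      # N is resolved both in the shape and in the high bound.
      - {type: tensor, shape: [2, N], dtype: float32, init: uniform, low: 0, high: N}
```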

Error handling and validation

The loader validates the YAML file thoroughly:

Unknown fields or misspellings cause descriptive errors.
Referencing an unknown preset or variable is an error.
Duplicate definitions across included files are rejected.
Shapes must be non‑empty and list lengths non‑negative.
Range parameters must define valid intervals.
Template cases may only reference declared variables.
Dimension values must be non‑negative integers.

Error messages include the test id and operator path to aid debugging.
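
To make the rules concrete, the following deliberately invalid file violates several of them at once. The comments name the rule each line breaks; the exact error messages, and whether the loader reports all problems or only the first, are not specified here.

```yaml
dims:
  N: -1                     # invalid: dimension values must be non-negative integers

presets:
  x:
    type: tensor
    shape: []               # invalid: shapes must be non-empty
    dtype: float32

tests:
  - id: broken_example
    op: aten::add
    inputs: []              # invalid: unknown key (the field is called in)
    in:
      - {ref: y}            # invalid: no preset named y is defined
      - {type: scalar, kind: float, low: 2.0, high: 1.0}   # invalid: low must be < high
```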

Summary

This specification defines a declarative YAML format for writing input–output tests over PyTorch operators and modules. By combining symbolic dimensions, reusable presets, rich node types and template expansion, you can concisely generate large suites of tests. The strict validation rules catch mistakes early. Use the provided examples as patterns for constructing your own tests, and consult the implementation for further details.
