Header of zarr3 array with bytes codec without configuration cannot be parsed #8233

Closed
amotta opened this issue Nov 27, 2024 · 0 comments · Fixed by #8282

amotta commented Nov 27, 2024

I used TensorStore and its zarr3 driver to create and write out a sharded, three-dimensional array of unsigned 8-bit integers. This resulted in the following zarr.json metadata file (shown is the output of cat zarr.json | python -m json.tool):

{
    "chunk_grid": {
        "configuration": {
            "chunk_shape": [
                1024,
                1024,
                1024
            ]
        },
        "name": "regular"
    },
    "chunk_key_encoding": {
        "name": "default"
    },
    "codecs": [
        {
            "configuration": {
                "chunk_shape": [
                    32,
                    32,
                    32
                ],
                "codecs": [
                    {
                        "name": "bytes"
                    }
                ],
                "index_codecs": [
                    {
                        "configuration": {
                            "endian": "little"
                        },
                        "name": "bytes"
                    },
                    {
                        "name": "crc32c"
                    }
                ]
            },
            "name": "sharding_indexed"
        }
    ],
    "data_type": "uint8",
    "fill_value": 0,
    "node_type": "array",
    "shape": [
        5400,
        2000,
        4000
    ],
    "zarr_format": 3
}

Note that the inner "codecs" contains a "bytes" codec without "configuration". webKnossos does not currently handle this: the dataset with the problematic layer can be imported, but no data can be loaded and errors are reported on the console.

According to the Zarr v3 specification, the "endian" configuration value is optional for byte-sized values. This is already handled by webKnossos. However, it's unclear to me whether this means that the entire "configuration" may be omitted. The specification does seem to imply that the "configuration" is optional:

The codec object may also contain a configuration object which consists of the parameter names and values as defined by the corresponding codec specification.
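
To make the question concrete, here is a minimal sketch of lenient parsing, assuming the reading above is right and a missing "configuration" can be treated like an empty one. The function name parse_bytes_codec and the ITEM_SIZE table are hypothetical illustrations, not webKnossos or TensorStore code.

```python
from typing import Optional

# Hypothetical lookup of item sizes in bytes for a few Zarr v3 data types.
ITEM_SIZE = {
    "bool": 1, "int8": 1, "uint8": 1,
    "int16": 2, "uint16": 2,
    "int32": 4, "uint32": 4, "float32": 4,
    "int64": 8, "uint64": 8, "float64": 8,
}


def parse_bytes_codec(codec: dict, data_type: str) -> Optional[str]:
    """Return the endianness of a "bytes" codec, or None if it does not apply.

    Assumption: an absent "configuration" is equivalent to an empty one.
    """
    if codec.get("name") != "bytes":
        raise ValueError("not a bytes codec")

    # Treat a missing (or null) configuration as an empty object.
    config = codec.get("configuration") or {}
    endian = config.get("endian")

    if endian is None:
        # Endianness only matters for values wider than one byte.
        if ITEM_SIZE[data_type] > 1:
            raise ValueError(f"'endian' is required for data type {data_type}")
        return None

    if endian not in ("little", "big"):
        raise ValueError(f"invalid endian value: {endian!r}")
    return endian
```

With this approach, the inner {"name": "bytes"} entry for uint8 data parses without error, while the index codec with "endian": "little" behaves as before.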

As a workaround, I have added an empty "configuration" to the problematic bytes codec. This way, reading from the Zarr3 layer works.
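
For anyone who wants to apply the same workaround to several arrays, it can be scripted roughly as follows. This is only a sketch: the helper names patch_codecs and patch_zarr_json are made up for illustration, and the script assumes that nested codec lists only appear under the "codecs" and "index_codecs" keys of a codec's configuration, as in the metadata above.

```python
import json
from pathlib import Path


def patch_codecs(codecs: list) -> int:
    """Add an empty "configuration" to every "bytes" codec that lacks one.

    Modifies the list in place and returns the number of patched entries.
    """
    patched = 0
    for codec in codecs:
        if codec.get("name") == "bytes" and "configuration" not in codec:
            codec["configuration"] = {}
            patched += 1
        # Descend into nested codec lists (e.g. inside "sharding_indexed").
        config = codec.get("configuration", {})
        for key in ("codecs", "index_codecs"):
            patched += patch_codecs(config.get(key, []))
    return patched


def patch_zarr_json(path) -> int:
    """Patch a zarr.json file in place; returns the number of patched codecs."""
    path = Path(path)
    metadata = json.loads(path.read_text())
    patched = patch_codecs(metadata.get("codecs", []))
    if patched:
        path.write_text(json.dumps(metadata, indent=4))
    return patched
```

Calling patch_zarr_json on the zarr.json shown above should patch exactly one entry, the inner "bytes" codec, and leave the index codecs untouched.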
