unable to write_image into a zipStore? #363

Open
BioinfoTongLI opened this issue Mar 21, 2024 · 2 comments

@BioinfoTongLI

Hello there,

We've started working with HCS data, and saving the intermediate results as raw zarr is painful to handle.
So we would like to use this ZipStore strategy for our internal project.
The code looks like this:

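    import zipfile
    from pathlib import Path

    import numpy as np
    import zarr
    from numcodecs import Zlib
    from ome_zarr.scale import Scaler
    from ome_zarr.writer import write_image

    # `config` (metadata dict) and `corrected_stack` (list of images) are defined elsewhere
    store = zarr.ZipStore(Path("test.zip"), compression=zipfile.ZIP_DEFLATED, mode='w')
    storage_options = {'compressor': Zlib(level=0)}
    root_group = zarr.group(store=store)
    # Pass all metadata to the root group
    for k, v in config.items():
        root_group.attrs[k] = v
    write_image(
        image=np.expand_dims(np.vstack(corrected_stack), 0),
        group=root_group,
        scaler=Scaler(),
        axes=[d["name"] for d in config['multiscales'][0]['axes']],
        storage_options=storage_options,
    )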

However, I couldn't reproduce the working code mentioned in the issue above, and I am seeing this:

  /opt/conda/lib/python3.9/zipfile.py:1514: UserWarning: Duplicate name: '.zattrs'
    return self._open_to_write(zinfo, force_zip64=force_zip64)
  Traceback (most recent call last):
    File "/opt/conda/lib/python3.9/site-packages/zarr/storage.py", line 1421, in __setitem__
      self.map[key] = value
    File "/opt/conda/lib/python3.9/site-packages/fsspec/mapping.py", line 162, in __setitem__
      self.fs.mkdirs(self.fs._parent(key), exist_ok=True)
    File "/opt/conda/lib/python3.9/site-packages/fsspec/spec.py", line 1444, in mkdirs
      return self.makedirs(path, exist_ok=exist_ok)
    File "/opt/conda/lib/python3.9/site-packages/fsspec/implementations/local.py", line 54, in makedirs
      os.makedirs(path, exist_ok=exist_ok)
    File "/opt/conda/lib/python3.9/os.py", line 225, in makedirs
      mkdir(name, mode)
  FileExistsError: [Errno 17] File exists: 'test.zip'
  
  The above exception was the direct cause of the following exception:
  
  Traceback (most recent call last):
    File "/lustre/scratch126/cellgen/team283/tl10/hcs_analysis/bin/BaSiC_transforming.py", line 81, in <module>
      fire.Fire({
    File "/opt/conda/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
      component_trace = _Fire(component, args, parsed_flag_args, context, name)
    File "/opt/conda/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
      component, remaining_args = _CallAndUpdateTrace(
    File "/opt/conda/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
      component = fn(*varargs, **kwargs)
    File "/lustre/scratch126/cellgen/team283/tl10/hcs_analysis/bin/BaSiC_transforming.py", line 70, in main
      write_image(
    File "/opt/conda/lib/python3.9/site-packages/ome_zarr/writer.py", line 477, in write_image
      dask_delayed_jobs = _write_dask_image(
    File "/opt/conda/lib/python3.9/site-packages/ome_zarr/writer.py", line 580, in _write_dask_image
      da.to_zarr(
    File "/opt/conda/lib/python3.9/site-packages/dask/array/core.py", line 3721, in to_zarr
      z = zarr.create(
    File "/opt/conda/lib/python3.9/site-packages/zarr/creation.py", line 178, in create
      init_array(store, shape=shape, chunks=chunks, dtype=dtype, compressor=compressor,
    File "/opt/conda/lib/python3.9/site-packages/zarr/storage.py", line 428, in init_array
      _require_parent_group(path, store=store, chunk_store=chunk_store,
    File "/opt/conda/lib/python3.9/site-packages/zarr/storage.py", line 298, in _require_parent_group
      _init_group_metadata(store, path=p, chunk_store=chunk_store)
    File "/opt/conda/lib/python3.9/site-packages/zarr/storage.py", line 712, in _init_group_metadata
      store[key] = store._metadata_class.encode_group_metadata(meta)  # type: ignore
    File "/opt/conda/lib/python3.9/site-packages/zarr/storage.py", line 1424, in __setitem__
      raise KeyError(key) from e
  KeyError: '.zgroup'

Has anything changed since that issue?
Thanks!
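
A note on the first line of the output: the `Duplicate name: '.zattrs'` warning is raised by Python's zipfile module whenever a member name is written a second time into an archive opened for writing. A zip archive cannot overwrite a member in place, so the second write is appended as a new entry. The sketch below reproduces this with the standard library only. `write_twice` is a hypothetical helper name, and it is an assumption (not confirmed in this thread) that each `root_group.attrs[k] = v` assignment in the loop above causes one such rewrite of `.zattrs`.

```python
import io
import warnings
import zipfile


def write_twice(name: str = ".zattrs"):
    """Write the same member name twice into an in-memory zip archive.

    Returns the archive's member names, the bytes read back for `name`,
    and the number of "Duplicate name" warnings that were raised.
    """
    buffer = io.BytesIO()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        with zipfile.ZipFile(buffer, mode="w") as archive:
            archive.writestr(name, b'{"a": 1}')
            # Second write of the same name: appended as a new entry, not overwritten
            archive.writestr(name, b'{"a": 1, "b": 2}')

    # Reopen for reading to inspect what actually ended up in the archive
    with zipfile.ZipFile(buffer) as archive:
        names = archive.namelist()
        latest = archive.read(name)

    duplicates = [w for w in caught if "Duplicate name" in str(w.message)]
    return names, latest, len(duplicates)
```

Calling `write_twice()` leaves two `.zattrs` members in the archive, and a read by name returns the later one. If the assumption above holds, setting all attributes in one call (for example `root_group.attrs.update(config)`) would likely reduce the number of duplicates. The warning appears to be separate from the `KeyError: '.zgroup'` failure further down the traceback.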

@snibbor

snibbor commented Aug 25, 2024

I get the same issue when the image or stack of images is a dask array. For the write_multiscale method, I narrowed down the error to this part of the code:

  for path, data in enumerate(pyramid):
      options = _resolve_storage_options(storage_options, path)

      # ensure that the chunk dimensions match the image dimensions
      # (which might have been changed for versions 0.1 or 0.2)
      # if chunks are explicitly set in the storage options
      chunks_opt = options.pop("chunks", chunks)
      # switch to this code in 0.5
      # chunks_opt = options.pop("chunks", None)
      if chunks_opt is not None:
          chunks_opt = _retuple(chunks_opt, data.shape)

      if isinstance(data, da.Array):
          if chunks_opt is not None:
              data = da.array(data).rechunk(chunks=chunks_opt)
              options["chunks"] = chunks_opt
          da_delayed = da.to_zarr(
              arr=data,
              url=group.store,
              component=str(Path(group.path, str(path))),
              storage_options=options,
              compressor=options.get("compressor", zarr.storage.default_compressor),
              dimension_separator=group._store._dimension_separator,
              compute=compute,
          )

For the url parameter of the da.to_zarr method, group.store is a ZipStore, which is not supported as far as I understand.
https://docs.dask.org/en/stable/generated/dask.array.to_zarr.html
url: Zarr Array or str or MutableMapping

I am new to zarr, so I am not sure how ZipStore can be used if the input images are dask arrays. It may be easier to save the OME-Zarr to a directory and then zip it up afterwards, although this is not ideal.
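
The "write to a directory, then zip it" workaround could look roughly like the sketch below, which uses only the standard library. `zip_directory_store` is a hypothetical helper name, and the sketch assumes the OME-Zarr was already written to a plain directory. Member names are stored relative to the directory root with forward slashes (`.zgroup`, `0/.zarray`, ...), which is the layout a zarr v2 `ZipStore` is expected to read. This has not been verified against ome-zarr-py.

```python
import os
import zipfile
from pathlib import Path


def zip_directory_store(store_dir: str, zip_path: str) -> list:
    """Pack every file under `store_dir` into `zip_path`.

    Returns the member names written. `zip_path` should live outside
    `store_dir`, otherwise the archive would try to include itself.
    """
    root = Path(store_dir)
    names = []
    # ZIP_STORED (no extra compression): chunks are usually compressed by zarr already
    with zipfile.ZipFile(zip_path, mode="w", compression=zipfile.ZIP_STORED,
                         allowZip64=True) as archive:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # deterministic traversal order
            for filename in sorted(filenames):
                path = Path(dirpath) / filename
                # Keys relative to the store root, e.g. ".zattrs" or "0/0.0.0"
                name = path.relative_to(root).as_posix()
                archive.write(path, arcname=name)
                names.append(name)
    return names
```

A call such as `zip_directory_store("image.zarr", "image.zarr.zip")` (both paths are placeholders) produces an archive that could then be opened read-only, for example with `zarr.ZipStore("image.zarr.zip", mode="r")` in zarr v2. The cost is that the data is written to disk twice.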

Were you able to find any other solutions?

@will-moore
Member

Thanks for the feedback and apologies for the lack of response to date...

I see that ZipStore is well supported in zarr-python v3 but we'll need some API changes in ome-zarr-py to allow users to have more control over store creation.
