
dealing with absolute paths #23

Open
martinschorb opened this issue May 11, 2020 · 4 comments

@martinschorb (Contributor) commented May 11, 2020

Hi,

What can we do to make pybdv applicable to the following scenario:

- 2 large datasets in some common storage location (maybe even separate group shares)
- no write access to those
- I want to register them to each other
- I need to create matching BDV XML files pointing to the data, but with modified AffineTransform and potentially other attributes.
- So basically, I need some means of creating additional valid BDV XML files without having write access to the data directory.

My idea is to just use `write_xml_metadata` and point it at the data container. However, this cannot be done with relative paths if the user does not have write access there, and so far the relative path is hardcoded in this function. I will give it a try by finding the common path and then using a relative directory listing; however, this will fail under Windows when different shares are mounted as different drives...
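For illustration, this is the cross-drive failure mode (a minimal sketch using the stdlib `ntpath` module, which implements Windows path semantics and is importable on any OS; the example paths are made up):

```python
import ntpath  # Windows path rules, available on all platforms

# A relative path cannot cross drive letters, so relpath raises ValueError:
try:
    ntpath.relpath(r"D:\groupshare\data\platy1.h5", start=r"C:\Users\me\registration")
except ValueError as err:
    print(err)  # path is on mount 'D:', start on mount 'C:'
```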

Any ideas how to solve that? S3 storage for this data?

@martinschorb (Contributor, Author)

OK,

I now tried relative paths, but this does not work.
Any relative path I give to `write_xml_metadata` is reduced to the pure basename:

`../../../../Amira/platy1.h5` becomes `platy1.h5` in the XML.

Bad...

@constantinpape (Owner)

> My idea is to just use `write_xml_metadata` and point it at the data container. However, this cannot be done with relative paths if the user does not have write access there, and so far the relative path is hardcoded in this function. I will give it a try by finding the common path and then using a relative directory listing; however, this will fail under Windows when different shares are mounted as different drives...

I think you need to change the data path "by hand" with `ElementTree` after you have used `write_xml_metadata`.

Something like this should work (not tested, and it will overwrite the existing XML file):

```python
import xml.etree.ElementTree as ET

from pybdv.metadata import indent_xml


def write_data_path(xml_file, new_file_path):
    # rewrite the data path stored in an existing bdv xml file
    root = ET.parse(xml_file).getroot()
    node = root.find('SequenceDescription').find('ImageLoader').find('hdf5')
    node.text = new_file_path
    node.attrib['type'] = 'absolute'
    indent_xml(root)  # this is part of pybdv.metadata
    tree = ET.ElementTree(root)
    tree.write(xml_file)
```
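For example (hypothetical paths, for illustration only), after writing the metadata next to your registration project you could repoint it at the read-only share:

```python
# hypothetical paths, for illustration only
write_data_path('registration/platy1.xml', '/g/groupshare/data/platy1.h5')
```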

> Any ideas how to solve that? S3 storage for this data?

Yes, eventually that sounds like a good option.

@martinschorb (Contributor, Author)

What would speak against having the `os.path.basename(data_path)` call in line 248 of `write_xml_metadata` controlled by a parameter that indicates whether the input value already is a relative path? That would essentially do the same thing, but within pybdv.

In the long run, with large h5/n5 containers located somewhere on the file system or in S3 buckets, having data and XML in different locations will be the rule rather than the exception.
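A minimal sketch of the proposed behaviour (the `path_type` parameter and the helper name are hypothetical, not part of the current pybdv API):

```python
import os


def _resolve_data_path(data_path, path_type='basename'):
    # hypothetical helper for write_xml_metadata: control how the data
    # path ends up in the xml instead of always reducing it to the basename
    if path_type == 'basename':
        return os.path.basename(data_path)  # current hardcoded behaviour
    if path_type == 'relative':
        return data_path  # trust the relative path the caller passed in
    if path_type == 'absolute':
        return os.path.abspath(data_path)
    raise ValueError("path_type must be 'basename', 'relative' or 'absolute'")
```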

@constantinpape (Owner)

> What would speak against having the `os.path.basename(data_path)` call in line 248 of `write_xml_metadata` controlled by a parameter that indicates whether the input value already is a relative path? That would essentially do the same thing, but within pybdv.
>
> In the long run, with large h5/n5 containers located somewhere on the file system or in S3 buckets, having data and XML in different locations will be the rule rather than the exception.

Yes, I agree that in the long term this should be solved a bit differently, and what you propose sounds alright.
I need to think about how to integrate the S3 capabilities with pybdv a bit more, though, so I would not change anything here right now.

So for now, you can just use something like the code snippet above to fix it, and I will revisit this when I have time to think about S3 integration.
