Add option to specify samples and metafile location #6
Description
It would be useful for the read_bgen function to have arguments for custom metafile and sample paths. These should be passed down to the corresponding bgen_reader.read_bgen arguments.
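A rough sketch of what the pass-through could look like. The keyword names metafile_filepath and samples_filepath are assumed to match the bgen_reader.read_bgen signature and should be checked against the installed version. build_reader_kwargs and read_bgen_with_paths are hypothetical names used only for illustration, not existing sgkit-bgen functions.

```python
from pathlib import Path
from typing import Any, Dict, Optional, Union

PathType = Union[str, Path]


def build_reader_kwargs(
    metafile_filepath: Optional[PathType] = None,
    samples_filepath: Optional[PathType] = None,
) -> Dict[str, Any]:
    """Collect only the paths the caller actually supplied (hypothetical helper)."""
    kwargs: Dict[str, Any] = {}
    if metafile_filepath is not None:
        kwargs["metafile_filepath"] = Path(metafile_filepath)
    if samples_filepath is not None:
        kwargs["samples_filepath"] = Path(samples_filepath)
    return kwargs


def read_bgen_with_paths(
    path: PathType,
    metafile_filepath: Optional[PathType] = None,
    samples_filepath: Optional[PathType] = None,
):
    """Forward custom paths to bgen_reader.read_bgen (hypothetical wrapper)."""
    # Imported lazily so the helper above works without bgen_reader installed.
    import bgen_reader

    # Assumption: bgen_reader.read_bgen accepts these keyword names.
    return bgen_reader.read_bgen(
        Path(path), **build_reader_kwargs(metafile_filepath, samples_filepath)
    )
```

Leaving unset paths out of the call, rather than passing None explicitly, keeps whatever defaults bgen_reader applies when the caller has no preference.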
My pressing need for this is that bgen_reader relies on filesystem operations that gcsfuse doesn't support. Example:
import os.path as osp
from sgkit_bgen.bgen_reader import read_bgen

path = osp.expanduser('~/data/rs-ukb/raw-data/gt-imputation/ukb_imp_chr21_v3.bgen')
read_bgen(path)
# OSError: [Errno 5] Input/output error
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
in
----> 1 ds = load(21)
      2 ds

in load(contig)
      2     path = osp.join(input_path, f'ukb_imp_chr{contig}_v3.bgen')
      3     print(path)
----> 4     ds = read_bgen(path)
      5     return ds

~/repos/sgkit-bgen/sgkit_bgen/bgen_reader.py in read_bgen(path, chunks, lock, persist)
    175     """
    176
--> 177     bgen_reader = BgenReader(path, persist)
    178
    179     variant_contig, variant_contig_names = encode_array(bgen_reader.contig.compute())

~/repos/sgkit-bgen/sgkit_bgen/bgen_reader.py in __init__(self, path, persist, dtype)
     46         self.path = Path(path)
     47
---> 48         self.metafile_filepath = _infer_metafile_filepath(Path(self.path))
     49         if not self.metafile_filepath.exists():
     50             create_metafile(path, self.metafile_filepath, verbose=False)

~/miniconda3/envs/ukb-analysis/lib/python3.7/site-packages/bgen_reader/_reader.py in _infer_metafile_filepath(bgen_filepath)
    148         return BGEN_READER_CACHE_HOME / "metafile" / path_to_filename(metafile)
    149     else:
--> 150         if is_file_writable(metafile):
    151             return metafile
    152

~/miniconda3/envs/ukb-analysis/lib/python3.7/site-packages/bgen_reader/_file.py in is_file_writable(filepath)
     41 def is_file_writable(filepath: Path):
     42     try:
---> 43         _touch(filepath)
     44     except PermissionError:
     45         return False

~/miniconda3/envs/ukb-analysis/lib/python3.7/site-packages/bgen_reader/_file.py in _touch(filepath, mode, dir_fd, **kwargs)
     86             f.fileno() if os.utime in os.supports_fd else filepath,
     87             dir_fd=None if os.supports_fd else dir_fd,
---> 88             **kwargs,
     89         )

OSError: [Errno 5] Input/output error
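Going by the traceback, is_file_writable only catches PermissionError, so the Errno 5 raised while touching the file on gcsfuse propagates all the way up. A more tolerant probe might look like the sketch below. can_touch is a hypothetical name, not part of bgen_reader, and the sketch uses Path.touch rather than bgen_reader's own _touch helper.

```python
from pathlib import Path


def can_touch(filepath: Path) -> bool:
    """Return True if filepath can be created or touched (hypothetical probe).

    Note the side effect: on success the file exists afterwards.
    """
    try:
        filepath.touch()
    except OSError:
        # PermissionError is a subclass of OSError, so this also covers
        # I/O errors such as the Errno 5 shown in the traceback.
        return False
    return True
```

With a check like this, the caller could pick a different metafile location when the probe fails instead of crashing.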
I imagine I can get around this temporarily by having the metafiles written to a local directory instead, but I'm not sure how we should do this in a distributed environment with remote storage.
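As a sketch of that temporary workaround, the metafile location could be derived from the BGEN path but placed under a local directory. local_metafile_path, the cache directory, and the ".metafile" suffix are all assumptions made for illustration. The hash of the full path is there so that files with the same name in different directories don't collide.

```python
import hashlib
from pathlib import Path
from typing import Union


def local_metafile_path(
    bgen_path: Union[str, Path], cache_dir: Union[str, Path]
) -> Path:
    """Map a (possibly remote-mounted) BGEN path to a local metafile path."""
    bgen_path = Path(bgen_path).expanduser()
    # Hash the full path so same-named files in different directories differ.
    digest = hashlib.sha256(str(bgen_path).encode("utf-8")).hexdigest()[:16]
    cache_dir = Path(cache_dir).expanduser()
    cache_dir.mkdir(parents=True, exist_ok=True)
    # ".metafile" is an arbitrary suffix chosen for this sketch.
    return cache_dir / f"{bgen_path.name}.{digest}.metafile"
```

In a distributed setting each worker would build its own local copy under this scheme, so the metafile cost is paid once per machine. Whether that is acceptable, or whether the metafile should live next to the data on remote storage, is still the open question.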