This repository was archived by the owner on Oct 15, 2020. It is now read-only.

Add option to specify samples and metafile location #6

Open
@eric-czech


It would be useful for the read_bgen function to accept arguments for custom metafile and sample file paths. These should be passed through to the corresponding arguments of bgen_reader.read_bgen.

The pressing need I have for this is that bgen_reader requires filesystem operations that gcsfuse doesn't support. Example:

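import os.path as osp
from sgkit_bgen.bgen_reader import read_bgen

# The BGEN file lives on a gcsfuse-mounted bucket
path = osp.expanduser('~/data/rs-ukb/raw-data/gt-imputation/ukb_imp_chr21_v3.bgen')
read_bgen(path)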
# OSError: [Errno 5] Input/output error
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
 in 
----> 1 ds = load(21)
      2 ds

in load(contig)
2 path = osp.join(input_path, f'ukb_imp_chr{contig}_v3.bgen')
3 print(path)
----> 4 ds = read_bgen(path)
5 return ds

~/repos/sgkit-bgen/sgkit_bgen/bgen_reader.py in read_bgen(path, chunks, lock, persist)
175 """
176
--> 177 bgen_reader = BgenReader(path, persist)
178
179 variant_contig, variant_contig_names = encode_array(bgen_reader.contig.compute())

~/repos/sgkit-bgen/sgkit_bgen/bgen_reader.py in __init__(self, path, persist, dtype)
46 self.path = Path(path)
47
---> 48 self.metafile_filepath = _infer_metafile_filepath(Path(self.path))
49 if not self.metafile_filepath.exists():
50 create_metafile(path, self.metafile_filepath, verbose=False)

~/miniconda3/envs/ukb-analysis/lib/python3.7/site-packages/bgen_reader/_reader.py in _infer_metafile_filepath(bgen_filepath)
148 return BGEN_READER_CACHE_HOME / "metafile" / path_to_filename(metafile)
149 else:
--> 150 if is_file_writable(metafile):
151 return metafile
152

~/miniconda3/envs/ukb-analysis/lib/python3.7/site-packages/bgen_reader/_file.py in is_file_writable(filepath)
41 def is_file_writable(filepath: Path):
42 try:
---> 43 _touch(filepath)
44 except PermissionError:
45 return False

~/miniconda3/envs/ukb-analysis/lib/python3.7/site-packages/bgen_reader/_file.py in _touch(filepath, mode, dir_fd, **kwargs)
86 f.fileno() if os.utime in os.supports_fd else filepath,
87 dir_fd=None if os.supports_fd else dir_fd,
---> 88 **kwargs,
89 )

OSError: [Errno 5] Input/output error
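
In the excerpt above, is_file_writable handles PermissionError, but the gcsfuse mount raises a plain OSError ([Errno 5]), which propagates. Here is a minimal sketch of a more tolerant check, assuming it is acceptable to treat any OSError as "not writable". The name `is_file_writable_tolerant` is hypothetical and not part of bgen_reader:

```python
from pathlib import Path


def is_file_writable_tolerant(filepath: Path) -> bool:
    """Return True only if `filepath` can be created or touched.

    Hypothetical helper (not bgen_reader's API). PermissionError and
    FileNotFoundError are subclasses of OSError, so they are covered
    along with the I/O error seen on the gcsfuse mount.
    """
    try:
        # Path.touch creates the file if it is missing and updates its mtime.
        Path(filepath).touch()
    except OSError:
        return False
    return True
```

With a check like this, a caller could fall back to a cache location on local disk instead of failing. Note that, like a plain touch, it creates the file as a side effect when the path is writable.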

I imagine I can get around this temporarily by having the metafiles written to a local directory instead, but I'm not sure how we should do this in a distributed environment with remote storage.
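
The local-directory workaround could look roughly like the sketch below. It only derives a metafile path on local disk for a given BGEN path. The helper name `local_metafile_path`, the cache directory, and the file naming scheme are all assumptions made for illustration:

```python
import hashlib
from pathlib import Path


def local_metafile_path(bgen_path, cache_dir):
    """Map a BGEN path to a metafile path inside a local cache directory.

    Hypothetical helper. A short hash of the full BGEN path is included so
    that files with the same name in different directories do not collide.
    """
    bgen_path = Path(bgen_path)
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha1(str(bgen_path).encode("utf-8")).hexdigest()[:12]
    return cache_dir / f"{bgen_path.name}.{digest}.metafile"


# Assumed usage, relying on the metafile argument of bgen_reader.read_bgen
# mentioned above (parameter name assumed to be `metafile_filepath`):
#
#   from bgen_reader import read_bgen
#   cache = Path.home() / ".cache" / "bgen-metafiles"
#   bgen = read_bgen(path, metafile_filepath=local_metafile_path(path, cache))
```

This does not answer the distributed case: each worker would build its own metafile unless the cache directory is shared or the metafiles are copied to the workers ahead of time.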
