A way to serialize/deserialize ndindex #86
Comments
What form of deserialization do you have in mind. Currently, the |
My starting point is a string (collected as user input), so pickle isn't going to help (also no way to pickle/unpickle on the browser side, which after all is running js and not python). To give you a concrete example, if a user opens a 4D dataset, they would then enter a 2D slab specification like:
which will then get passed to the Python backend server via a

def parseSlice(sliceStr):
    return tuple((slice(*(int(i) if i else None for i in part.strip().split(':'))) if ':' in part else int(part.strip())) for part in sliceStr.split(','))

which you can then use like

>>> parseSlice("13, 10:1000, 0, :")
(13, slice(10, 1000, None), 0, slice(None, None, None))

but I can't imagine this is a robust approach. I think I would need an actual formal parser of some flavor. |
I see. Parsing strings is not something I had considered, but it could be added. Probably the best way would be to do something akin to |
yes, this exactly! Now if I could just use So I think a starting point for the parser could be

import ast
# minimize parsed ops
from numpy import s_
ast.parse(f's_[{sliceStr}]')

where Of course then we have to figure out what to do with the resulting abstract syntax tree... |
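To make "what to do with the resulting abstract syntax tree" concrete, here is a small sketch that only inspects the tree. The helper name `subscript_node_kinds` is made up for illustration. Note that `ast.parse` never evaluates the expression, so `s_` does not have to be importable for this step.

```python
import ast

def subscript_node_kinds(slice_str):
    """Return the AST node class names found inside s_[...] (illustrative helper)."""
    tree = ast.parse(f"s_[{slice_str}]", mode="eval")
    sub = tree.body.slice
    # Python 3.9+: a comma-separated subscript is an ast.Tuple with .elts.
    # Python 3.8: it is an ast.ExtSlice with .dims when any element is a slice.
    items = getattr(sub, "elts", None) or getattr(sub, "dims", None) or [sub]
    return [type(item).__name__ for item in items]

print(subscript_node_kinds("13, 10:1000, 0, :"))
```

On Python 3.9 and later this should print `['Constant', 'Slice', 'Constant', 'Slice']`; on 3.8 the integer entries are wrapped in `Index` nodes instead, which is the version difference a real parser has to account for.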
Yes, exactly. If you look at how ast.literal_eval works, it's pretty simple. https://github.com/python/cpython/blob/17b5be0c0a3f74141014e06a660f1b5ddb002fec/Lib/ast.py#L54. It just recursively walks the ast and checks for known OK nodes. I think we can just copy it, and add Support masks is harder, because usually people construct masks out of the array itself as an expression, like |
For my purposes? No. In fact, for my application it would be best if anything other than a comma separated sequence of literal index or literal slice caused an exception when you try to parse it. But that's just my own narrow-ish use case. In terms of the general purpose |
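A minimal sketch of the strict variant described above, in the spirit of `ast.literal_eval`: walk the tree and reject anything that is not a literal integer or a literal slice. This is not the ndindex implementation; the names `parse_index`, `_convert_item`, and `_convert_int` are hypothetical, and the code assumes Python 3.8 or later (where integer literals parse to `ast.Constant`).

```python
import ast

def parse_index(index_str):
    """Parse e.g. "13, 10:1000, 0, :" into a tuple of ints and slices."""
    tree = ast.parse(f"_[{index_str}]", mode="eval")
    body = tree.body
    # Reject inputs that escape the brackets, such as "0] + x[0".
    if not (isinstance(body, ast.Subscript) and isinstance(body.value, ast.Name)):
        raise ValueError("input is not a single index expression")
    node = body.slice
    if type(node).__name__ == "Index":      # Python 3.8 wrapper node
        node = node.value
    if type(node).__name__ == "ExtSlice":   # Python 3.8 mixed int/slice tuple
        items = node.dims
    elif isinstance(node, ast.Tuple):
        items = node.elts
    else:
        items = [node]
    return tuple(_convert_item(item) for item in items)

def _convert_item(node):
    if type(node).__name__ == "Index":      # Python 3.8 wrapper node
        node = node.value
    if isinstance(node, ast.Slice):
        parts = (node.lower, node.upper, node.step)
        return slice(*(None if p is None else _convert_int(p) for p in parts))
    return _convert_int(node)

def _convert_int(node):
    # Accept a plain integer literal (bools are deliberately excluded).
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    # Accept a negated integer literal such as -1.
    if (isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub)
            and isinstance(node.operand, ast.Constant)
            and type(node.operand.value) is int):
        return -node.operand.value
    raise ValueError(f"unsupported index expression: {ast.dump(node)}")
```

With this sketch, `parse_index("13, 10:1000, 0, :")` returns `(13, slice(10, 1000, None), 0, slice(None, None, None))`, while inputs such as `"foo()"` raise `ValueError`. Malformed syntax still surfaces as `SyntaxError` from `ast.parse`.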
Here's a simplified parser implementation based on

import ast
from numpy import s_

def astParseSlice(sliceStr):
    for node in ast.walk(ast.parse(f's_[{sliceStr}]')):
        if isinstance(node, ast.ExtSlice):
            return (
                x for x in (
                    iors.value.n if isinstance(iors, ast.Index)
                    else slice(
                        iors.lower.n if iors.lower is not None else None,
                        iors.upper.n if iors.upper is not None else None,
                        iors.step.n if iors.step is not None else None,
                    ) if isinstance(iors, ast.Slice)
                    else None
                    for iors in ast.iter_child_nodes(node)
                ) if x is not None
            )

Basic idea:
But for a real implementation, I agree that copying and modifying ast.literal_eval makes the most sense. The code for

Edit: Looking into it further, my implementation above fails for a myriad of inputs, everything from |
ndindex is presently designed around the idea of "raw" indices, that don't have any knowledge of the array being indexed. So a boolean mask like |
I think we can copy literal_eval and
The function doesn't need to worry about input validation because the ndindex classes already do that. |
Nice, that makes it quite clear how to accomplish this. I'll take a stab at a PR |
Thanks @telamonian! |
Sorry, I got a little distracted by some debugging stuff I promised to add to the "how to develop a jupyter extension" jupytercon tutorial. I'll probably make the PR for this over the weekend.
Edit: I've started work |
Okay, so I ran into problems with the implementation, and after doing some digging it turns out that the From https://docs.python.org/3/library/ast.html#node-classes:
So, eg the section for handling I can definitely come up with something that will work for 3.7, but making a string->index parser that will work with all of 3.6-3.9 will take some extra work. I'll look into it further |
turns out to have been much simpler than I initially thought to implement a One catch: cpy39 will probably require some more work and explicit handling, since it makes changes to the grammar of @asmeurer Currently the PR contains a |
good news: it turns out the code that I already wrote essentially gets support for the new grammar in cpy39 "for free". ast in cpy39 replaced all uses of @asmeurer I could still use some pointers about how to integrate |
Hi all. This seems like a really neat project, and one I'll probably get a lot of use out of. Question: would a way to deserialize/serialize an ndindex be something that's within the scope of this project?
Here's my use case -> jupyterlab/jupyterlab-hdf5#4. Short version:
h5py, which also uses numpy-style slices to specify hyperslabs
I'd be more than willing to pitch in and help with this in the form of code/PRs, but I'll probably need some help figuring out how to handle deserialization.
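For the serialize half of the question, going from an index back to a string is the simpler direction. A rough sketch, assuming the index is a plain tuple of Python ints and `slice` objects rather than ndindex types; `format_index` and `_format_item` are hypothetical names:

```python
def format_index(index):
    """Render a tuple of ints and slices as a NumPy-style index string."""
    if not isinstance(index, tuple):
        index = (index,)
    return ", ".join(_format_item(item) for item in index)

def _format_item(item):
    if isinstance(item, slice):
        # Omitted bounds are rendered as empty strings, as in "10:" or "::2".
        start, stop, step = (
            "" if part is None else str(part)
            for part in (item.start, item.stop, item.step)
        )
        text = f"{start}:{stop}"
        return text if item.step is None else f"{text}:{step}"
    if type(item) is int:
        return str(item)
    raise TypeError(f"cannot serialize index element: {item!r}")
```

For example, `format_index((13, slice(10, 1000), 0, slice(None)))` gives `"13, 10:1000, 0, :"`, a string the browser side can send to the backend as-is.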