Improved serializers #42

Open
2 tasks
lojack5 opened this issue Jan 17, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

Owner

lojack5 commented Jan 17, 2023

These all fall into the same category, and will be needed to handle some more complicated data files. I'm clumping them together because implementing this for one serializer type will most likely lay the groundwork for doing it for the others.

We need improved versions of various serializers:

  • Some sort of "delayed write" / "coupled" type. For example, in Oblivion records, the size attribute shows up 12 bytes before the data it's describing. If we could mark the size as delayed, then rewind the stream and write it afterwards, allowing the data handler to update the size, that'd be great. Bonus points if the size could optionally be hidden as an attribute somehow.
    • I'd like this as generic as possible, so char, unicode, and array could grab this size to use / update easily.
  • "Unbounded" length arrays. Basically, unpack until a given number of bytes has been consumed, rather than until a particular number of items has been read. Again, Oblivion motivated: there the subrecord counts aren't recorded. Rather, subrecords just fill up a specified number of bytes.
    • In general this opens up probably three things that could be specified to the array type:
      1. How the count is determined: static length, unpack a value, read an attribute, or just unpack until a given number of bytes is used.
      2. Whether the "given number of bytes to unpack" should be verified or not in cases where the count is determined a different way.
      3. What type of object is stored inside.
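To make the two ideas above concrete, here is a minimal sketch using only `struct` and `io`, independent of this library. Everything in it is an assumption for illustration: the function names, the 4-byte size field, the 8 filler bytes between the size and the data, and the `uint16 length + body` subrecord layout are hypothetical and not the actual Oblivion format.

```python
import io
import struct

SIZE_FMT = '<I'  # assumed: 4-byte little-endian size field


def write_with_delayed_size(stream, header_rest: bytes, payload: bytes) -> None:
    """Write a placeholder size, then the data, then rewind and patch the size."""
    size_pos = stream.tell()
    stream.write(struct.pack(SIZE_FMT, 0))  # placeholder, patched below
    stream.write(header_rest)               # other header bytes between size and data
    start = stream.tell()
    stream.write(payload)
    end = stream.tell()
    stream.seek(size_pos)                   # rewind to the size field
    stream.write(struct.pack(SIZE_FMT, end - start))
    stream.seek(end)                        # restore position for later writes


def read_until_exhausted(stream, byte_budget: int) -> list:
    """Unpack hypothetical subrecords until byte_budget bytes are consumed."""
    items = []
    end = stream.tell() + byte_budget
    while stream.tell() < end:
        (length,) = struct.unpack('<H', stream.read(2))
        items.append(stream.read(length))
    if stream.tell() != end:
        # Optional verification: the items must fill the budget exactly.
        raise ValueError('subrecords overran the declared size')
    return items


def pack_subrecord(body: bytes) -> bytes:
    """Hypothetical subrecord: uint16 length followed by the body."""
    return struct.pack('<H', len(body)) + body


# Round trip: the size is only known after the payload has been written.
buf = io.BytesIO()
payload = pack_subrecord(b'abc') + pack_subrecord(b'hello')
write_with_delayed_size(buf, b'\x00' * 8, payload)

buf.seek(0)
(size,) = struct.unpack(SIZE_FMT, buf.read(4))
buf.read(8)  # skip the filler header bytes
items = read_until_exhausted(buf, size)
```

The reader never sees an item count; it only stops when the byte budget taken from the patched size field is used up, which is the "unpack until a given number of bytes is used" case from the list.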

For the "read an attribute" method in array, a more general solution would be optimal, something that char and unicode can use as well. This will probably bleed into union deciders too, since ATM they only allow a limited set of serializers as decider results, which can't change on the fly very well. Just spitballing here, but maybe something along the lines of:

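from operator import attrgetter
from typing import Annotated

# Proposed syntax only: passing a callable to char is the idea being floated.
get_name_length = attrgetter('name_size')

class MyStruct(Structured):
    ...
    name: Annotated[bytes, char(get_name_length)]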
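One way a serializer could consume such a getter is to accept either a static int or a callable, and resolve it against the partially unpacked instance at unpack time. The sketch below is only an illustration of that idea: `resolve_length`, `unpack_char`, and the `Record` stand-in are hypothetical names, not part of this library.

```python
import struct
from operator import attrgetter


def resolve_length(spec, instance) -> int:
    """Return spec if it is a static int, else call it with the instance."""
    return spec if isinstance(spec, int) else spec(instance)


class Record:
    """Stand-in for a partially unpacked struct (not the library's Structured)."""

    def __init__(self, name_size: int) -> None:
        self.name_size = name_size


def unpack_char(spec, instance, data: bytes, offset: int = 0) -> bytes:
    """Unpack a fixed-width bytes field whose width is resolved at unpack time."""
    n = resolve_length(spec, instance)
    (value,) = struct.unpack_from(f'{n}s', data, offset)
    return value


get_name_length = attrgetter('name_size')
rec = Record(name_size=5)
name = unpack_char(get_name_length, rec, b'helloworld')  # width read from rec
fixed = unpack_char(3, rec, b'helloworld')               # static width
```

The same resolver could in principle be shared by char, unicode, and array, since all three only need "an int, or something that yields an int from the instance".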

Most IDEs, type-checkers, and linters throw errors about expressions in type hints (which makes sense, especially with stringized hints), so this may require pushing more information into Annotated on the user's end. Not ideal. And __class_getitem__ doesn't support keyword arguments, so dealing with optional arguments there is a pain, especially if we want to support optional ordering, like:

get_length = attrgetter('item_count')
array[Count[get_length], Size[uint32], float32]
array[Count[get_length], float32]
array[float32, Count[get_length], Size[uint32]]
array[float32, Count[get_length], Size[get_size]]
...
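A rough sketch of how order-independent arguments could be handled at runtime: wrap each optional argument in a marker type and sort the subscript tuple by marker class instead of by position. All classes here (`_Marker`, `Count`, `Size`, `array`, and the `float32` / `uint32` placeholders) are hypothetical stand-ins written for this example, and the sketch does nothing about the type-checker complaints mentioned above.

```python
from operator import attrgetter


class _Marker:
    """Base for subscriptable markers: Count[x] returns an instance wrapping x."""

    def __init__(self, value):
        self.value = value

    @classmethod
    def __class_getitem__(cls, value):
        return cls(value)


class Count(_Marker):
    """Marks how the item count is determined."""


class Size(_Marker):
    """Marks how the byte size is determined."""


class float32:
    """Placeholder item type for this sketch."""


class uint32:
    """Placeholder size type for this sketch."""


class array:
    @classmethod
    def __class_getitem__(cls, args):
        # A single subscript arrives bare; several arrive as a tuple.
        if not isinstance(args, tuple):
            args = (args,)
        spec = {'count': None, 'size': None, 'item': None}
        for arg in args:
            if isinstance(arg, Count):
                key = 'count'
            elif isinstance(arg, Size):
                key = 'size'
            else:
                key = 'item'  # anything unmarked is treated as the item type
            if spec[key] is not None:
                raise TypeError(f'duplicate {key} argument')
            spec[key] = arg.value if isinstance(arg, _Marker) else arg
        if spec['item'] is None:
            raise TypeError('array requires an item type')
        return spec


get_length = attrgetter('item_count')
a = array[Count[get_length], Size[uint32], float32]
b = array[float32, Count[get_length], Size[uint32]]
c = array[Count[get_length], float32]
```

Here `a` and `b` resolve to the same specification regardless of argument order, and `c` simply leaves the size unset. A real implementation would return a specialized serializer class rather than a dict.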
lojack5 added the enhancement label Jan 17, 2023