Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate typeid from "the entire struct" instead of just type-name. #139

Open
wc-duck opened this issue Oct 6, 2022 · 9 comments
Open

Comments

@wc-duck
Copy link
Owner

wc-duck commented Oct 6, 2022

Today the typeid of a type is only generated from the types name and nothing else (see https://github.com/wc-duck/datalibrary/blob/master/src/dl_typelib_read_txt.cpp#L1125)

This will lead to problems when, in the future, type upgrades/conversions need to be supported as 2 types would be considered the same even if their members is not the same!

Typeid should be generated from name + member-names + member-types to generate an unique typeid even if the name is the same.

Question: are 2 types the same if name mismatch but the actual struct is equal?

@Tisten
Copy link
Contributor

Tisten commented Oct 6, 2022

Answer: Since dl_reflect_get_type_info() returns the type name among other things it ought to be a conflict to have two "identical" structs with mismatching names. So yes, I think the name should be part of the type hash.

@wc-duck
Copy link
Owner Author

wc-duck commented Oct 9, 2022

One thing here, it might be an idea to add a "version" field to the type, just an integer that can be used to identify different versions of the same type in libs and they could all be stored in the same typelib.
Only generating headers for "the latest" type and always defaulting to the latest when packing would keep the old behavior but still keep types for backwards compat around!

@wc-duck
Copy link
Owner Author

wc-duck commented Oct 9, 2022

.... just need to have a think on how to handle versions for when subtypes upgrade their version!

@wc-duck
Copy link
Owner Author

wc-duck commented Oct 9, 2022

Maybe it is as simple as being able to specify version/revision on members as well?

@Tisten
Copy link
Contributor

Tisten commented Oct 9, 2022

Not sure I follow your thought here. If every type have a version, then that implies that every subtype have a version. So there is no need to have a version per member, since there already is one from the member's type, right? Or what is the purpose of these per-member versions?

Using versions to solve the "data upgrade path" is interesting though, since a "double upgrade" where at least one upgrade step is complex then need to run "both migrations" and in the right order to get correct data. And then each data upgrade might need to be ordered, instead of having a version number per type.

As an example, say we have 2 types:
struct Rot{ float quat[4]; };
struct Entity{ float x; float y; float z; Rot rot; };

Now let's say we want to upgrade these two types by storing rotations as euler, and by keeping the rotation of an entity as quats, but embed the floats into the struct instead of having a substruct, so the new structs are:
struct Rot{ float euler[3]; };
struct Entity{ float x; float y; float z; float qx; float qy; float qz; float qw; };

In this case it will matter which order the data upgrades are made in.

I remember double upgrades being handled by a save system we both worked on in the past, but a little differently than with per type version numbers. And it was also solved in the machinery which I used last, but not at all by manual version numbers there. They had a centralized type schema where all active types were known, and thus could order migrations of the types more explicitly.

@wc-duck
Copy link
Owner Author

wc-duck commented Oct 9, 2022

I'm would guess that you would have to be able to specify what type a member uses with a version as well, reason being.

Typelibs are stored in text and have no notion of what type you use except for name, so if you have a member specified as having type "my_type", if "my_type" is updated there is no data in there on what type you actually mean.

If we only worked with binary typelibs the problem wouldn't be there as the version would be part of the typeid, but not in text.

IMHO a version number is a very low-tech solution... and low-tech is usually easy to get right and reason about. I might be a bit concerned about the ease of "getting it wrong"... but I think that i manageable.

If/when I get the time I'll write up a proposal.

@Tisten
Copy link
Contributor

Tisten commented Oct 9, 2022

I like low tech when it is easy to verify with code. But I don't trust myself, so if you ask me the "set the right number" and I could set it off by one without a compile error or mandatory static code analysis saying that I'm wrong, then I find low tech which rely on humans to be hard to reason about. Anything which is both low tech enough to understand by explanation or reading code, and which uses validation or automation (minimum extra annotation/configuration by humans) gets my vote.

When you say If we only worked with binary typelibs do you then imply that, even with this newly proposed typeid from the entire struct, we could end up with an ambiguous interpretation of how the data was saved just because the type library is declared using text? Or is the version number meant to add "an extra layer of safety" when purely using text based reflection?

I think that "per member version number" could be said to exist in Cap'n'proto, see https://capnproto.org/language.html with the @0 etc. Have you seen anywhere else where this version number design is used sucessfully? Or do you have any experience from other data libraries where the lack of that extra version number have caused practical issues?

@wc-duck
Copy link
Owner Author

wc-duck commented Oct 9, 2022

I see that version is part of the data that typeid is generated from, so data in binary form is always saved with a specific type.

Version here would be really close to what capnproto has, but only for the full struct. The problem I see with that implementation is that you can't change type of a member if I haven't missed anything. Also I don't see how that schema would support removing a member?

But it do lead to less code than what I had in mind.

I think I need to write down how I see all of these separate systems fit together.

@wc-duck
Copy link
Owner Author

wc-duck commented Oct 9, 2022

... ie how I see the workflow for modifying a type and the steps that would involve!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants