Implement Typed Documents and TypeRegistry #282

Open: wants to merge 65 commits into base: decaf

Contributor

@jterapin jterapin commented Mar 6, 2025

Description: Implementation for Typed Documents and TypeRegistry. Currently only supports JSON documents.

It is highly likely that we will have to revisit this implementation in the future, since typed documents and the type registry are still evolving.

@jterapin jterapin changed the title Implement Typed Documents and TypeRegistry [WIP] Implement Typed Documents and TypeRegistry Mar 6, 2025
@jterapin jterapin changed the title [WIP] Implement Typed Documents and TypeRegistry Implement Typed Documents and TypeRegistry Apr 15, 2025
Contributor

@mullermp mullermp left a comment

Nice. Looking better. I still have a bunch of comments, though. In general, I think we can simplify further and be less aggressive about validation, being more permissive where possible.

# shape = Smithy::Schema::StructureShape.new
# data = Document::Data.new({ "name" => "example" }, shape: shape)
#
module Document

I was expecting document to be a class and have it be the delegator. What was the intention of making another data subclass?
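A minimal sketch of what "Document as a class and the delegator" might look like, assuming Ruby's standard `SimpleDelegator`. The `shape:` keyword mirrors the usage comment in the diff above; the `data` reader and the stand-in shape object are assumptions for illustration, not the PR's actual code.

```ruby
require 'delegate'

# Hypothetical sketch: Document wraps the raw value and delegates to it,
# so no separate Document::Data subclass is needed.
class Document < SimpleDelegator
  # Optional shape for typed documents (nil for untyped ones).
  attr_reader :shape

  # @param [Object] data the underlying value (Hash, Array, String, ...)
  # @param [Object, nil] shape assumed to be a structure shape when present
  def initialize(data, shape: nil)
    super(data)
    @shape = shape
  end

  # Returns the wrapped value itself.
  def data
    __getobj__
  end
end

shape = Object.new # stand-in for Smithy::Schema::StructureShape
doc = Document.new({ 'name' => 'example' }, shape: shape)
doc['name'] # delegated to the wrapped Hash
doc.keys    # also delegated
```

With this shape, callers can treat the document like the value it wraps while still being able to ask for its shape.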

# @param [Hash<String, Shapes::StructureShape>] registry
def initialize(registry = {})
@registry = registry
@shapes_by_type = register_shape_types(registry.values)

Is there a way to populate this from the code generated side? If we must iterate shapes, we may as well backtrack and populate both maps in one pass. That at least reduces generated code. Otherwise is shapes_by_type even necessary?


# @api private
# @return [Hash<String, Shapes::StructureShape>]
attr_accessor :registry

Both of these accessors shouldn't exist. Our public methods should hide this detail.
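The two registry comments above (populate both maps in one pass, and hide the maps behind public methods) could be combined roughly as follows. This is only a sketch: `StructureShape` here is a hypothetical stand-in with assumed `id` and `type` readers, and `register` and `shape_by_id` are illustrative names. Only `shape_by_type` appears in the diff.

```ruby
# Hypothetical stand-in for Shapes::StructureShape; `id` and `type` are assumed.
StructureShape = Struct.new(:id, :type)

class TypeRegistry
  # @param [Array<StructureShape>] shapes
  def initialize(shapes = [])
    @shapes_by_id = {}
    @shapes_by_type = {}
    # Single pass: each shape is added to both lookup tables at once.
    shapes.each { |shape| register(shape) }
  end

  # Adds one shape to both maps; generated code could call this directly.
  def register(shape)
    @shapes_by_id[shape.id] = shape
    @shapes_by_type[shape.type] = shape if shape.type
    self
  end

  # Lookups are the only public surface; the hashes are never exposed.
  def shape_by_id(id)
    @shapes_by_id[id]
  end

  def shape_by_type(type)
    @shapes_by_type[type]
  end
end
```

Because callers only see the lookup methods, the internal representation can change later without touching generated code.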

end

def typed_document?(values)
(values.is_a?(Smithy::Schema::Structure) && @type_registry.shape_by_type(values.class)) ||

Isn't this already checked? And wouldn't this always be true if it was a structure, because it would already be registered?

ref.shape.member(name) || find_member_ref_by_names(ref, name)
end

def find_member_ref_by_names(ref, name)

This seems inefficient. Similar to what we do in codecs, for structure and union, you will want to iterate the shape members and not the values, then you can check json name that way. You're doing a loop for every member, so it's n^2 performance.
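To illustrate the suggestion, here is a rough sketch that walks the shape's members once and looks each key up in the values hash, so the work is linear in the number of members rather than a scan per member. `MemberRef`, `Shape`, `json_name`, and `member_values` are hypothetical stand-ins, not the PR's or the codecs' actual types.

```ruby
# Hypothetical stand-ins for a member reference and a structure shape.
MemberRef = Struct.new(:name, :json_name)
Shape = Struct.new(:members) # Hash<String, MemberRef>

# Maps incoming document keys to member names in one pass over the members.
def member_values(shape, values)
  shape.members.each_with_object({}) do |(member_name, ref), out|
    # Prefer the JSON name when one is set, otherwise the member name.
    key = ref.json_name || member_name
    # Hash lookup is constant time, so no inner loop over values is needed.
    out[member_name] = values[key] if values.key?(key)
  end
end

shape = Shape.new({
  'fooBar' => MemberRef.new('fooBar', 'foo_bar'),
  'name' => MemberRef.new('name', nil)
})
member_values(shape, { 'foo_bar' => 1, 'name' => 'example', 'unknown' => true })
```

Keys in the input that match no member are simply ignored, which also fits the more permissive approach suggested in the review summary.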

end
end

def resolve_member_name(member_ref, opts)

Check out the location_name approach in my PR - we should use similar terms. You can easily handle this with || optionality.
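One way the `||` fallback could look, as a sketch only. The referenced PR is not shown here, so the `location_name` reader on the member ref and the `:use_location_name` option are assumptions made for illustration.

```ruby
# Hypothetical member reference with an optional location name.
MemberRef = Struct.new(:name, :location_name)

# Falls back to the member name whenever the option is off
# or no location name is set on the member.
def resolve_member_name(member_ref, opts = {})
  (opts[:use_location_name] && member_ref.location_name) || member_ref.name
end

ref = MemberRef.new('fooBar', 'foo_bar')
resolve_member_name(ref, use_location_name: true)
resolve_member_name(ref)
```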

describe Serializer do
let(:shapes) do
shapes = SchemaHelper.sample_shapes
shapes['smithy.ruby.tests#Structure']['members']['timestampDateTime'] = {

I would prefer if you move these definitions closer to the test (in the actual tests where they are needed) - it's easier to manage tests that way if they are discrete.

Contributor

@alextwoods alextwoods left a comment

Nice, it's generally looking good.
I think the functionality from the Document::Data class could be moved into Document as a class (unless there's some reason I'm missing). I also understand why the Document serializer and deserializer exist separately and require a type registry, but I would lean towards the public interface for serializing/deserializing documents living on the top-level class. It could still require a type registry to be provided and could use these classes under the hood to implement it (and they could then be api private).
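A rough sketch of that layout, using only Ruby's standard `json` library. The method names, the `type_registry:` keyword, and the nested private classes are assumptions for illustration. The registry is accepted but not consulted here, whereas a real implementation would use it to resolve typed documents.

```ruby
require 'json'

# Hypothetical sketch: the public serialize/deserialize entry points live on
# Document, while the worker classes stay private implementation details.
class Document
  attr_reader :data

  def initialize(data)
    @data = data
  end

  # Public entry point for building a Document from a JSON string.
  def self.deserialize(json, type_registry:)
    Deserializer.new(type_registry).deserialize(json)
  end

  # Public entry point for turning this Document into a JSON string.
  def serialize(type_registry:)
    Serializer.new(type_registry).serialize(self)
  end

  # @api private
  class Serializer
    def initialize(type_registry)
      @type_registry = type_registry # unused in this sketch
    end

    def serialize(document)
      JSON.generate(document.data)
    end
  end

  # @api private
  class Deserializer
    def initialize(type_registry)
      @type_registry = type_registry # unused in this sketch
    end

    def deserialize(json)
      Document.new(JSON.parse(json))
    end
  end

  # Keeps the helper classes out of the public constant namespace.
  private_constant :Serializer, :Deserializer
end

doc = Document.deserialize('{"name":"example"}', type_registry: {})
doc.serialize(type_registry: {})
```

Callers then only depend on `Document`, and the serializer classes can be reshaped freely as the typed document design evolves.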

Contributor

mullermp commented May 9, 2025

That's effectively what I was also saying but I agree.

3 participants