-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Signature normalization, ordering fragments #2282
Conversation
Sort document definitions array first by operation, followed by an alphabetized list of fragments. This change adds one more layer of "determinism" to how we normalize document signatures. This is critical for properly registering operations, as well as comparing incoming operations against the manifest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome! Great job with the tests. I left a few NBD comments and some suggestions.
// sortAST sorts most multi-child nodes alphabetically. Using this as part of | ||
// your signature calculation function may make it easier to tell the difference | ||
// between queries that are similar to each other, and if for some reason your | ||
// GraphQL client generates query strings with elements in nondeterministic | ||
// order, it can make sure the queries are treated as identical. | ||
export function sortAST(ast: DocumentNode): DocumentNode { | ||
return visit(ast, { | ||
const [operation] = ast.definitions.filter(isOperationDefinitionNode); | ||
const fragments = ast.definitions.filter(isFragmentDefinitionNode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fear that this is overly specific. Given the definition of DefinitionNode
and ExecutableDefinitionNode
:
export type DefinitionNode =
| ExecutableDefinitionNode
| TypeSystemDefinitionNode
| TypeSystemExtensionNode;
export type ExecutableDefinitionNode =
| OperationDefinitionNode
| FragmentDefinitionNode;
I don't know what TypeSystemDefinitionNode
or TypeSystemExtensionNode
are, but they are legal values in a DefinitionNode
. This also would be susceptible to be other new things added in the future. With that lack of knowledge, I think it would make sense to favor using "fragments and everything else" versus "fragments and operations". Thoughts?
const nonFragments = ast.definitions.filter((node) => ! isFragmentDefinitionNode(node))
const fragments = ast.definitions.filter(isFragmentDefinitionNode);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm operating under the assumption that the document will only have ExecutableDefinitionNodes
else it wouldn't be a valid operation signature. I wouldn't expect the other two TypeSystemXNode
s to turn up in an operation (they're used in SDL) but that may be an incorrect assumption.
Regardless of what I find, I'll add some documentation here to explain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is totally true for our use case. I don't think you can be positive that it covers all use cases for both us and consumers of apollo-graphql
.
The function is called sortAST
and accepts a DocumentNode
; I think you need to account for all legal values that might exist in a DocumentNode
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a piece off the above comment: this function is called sortAST
; so it might be used to sort any kid of AST. Restricting this to DocumentNode
s that only include ExecutableDefinitionNodes
doesn't feel right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think giving this a more specific name, like lexicographicSortOperations
(by analogy with lexicographicSortSchema
might be more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filtering out non-ExecutableDefinitionNode
s (as you currently do) might be ok, that is also what execute
does. Alternatively, maybe we could put them all at the end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support a different name! But I'm pretty okay with not exposing sortAST
as a public API, instead only exposing abstractions that implement it (like defaultEngineReportingSignature
in the adjacent index.ts
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved in 2ad998b
Following @martijnwalraven 's suggestion of parity with the graphql function, I've opted to rename (and effectively repurpose) this: lexicographicSortOperations
.
This is worth discussing - it's a breaking change but this package is new and may not be in use. Regardless, this requires a major version bump.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abernix I'm just now seeing your comments as I had a stale page up while responding to comments. I'm open to not exporting this function publicly.
// sortAST sorts most multi-child nodes alphabetically. Using this as part of | ||
// your signature calculation function may make it easier to tell the difference | ||
// between queries that are similar to each other, and if for some reason your | ||
// GraphQL client generates query strings with elements in nondeterministic | ||
// order, it can make sure the queries are treated as identical. | ||
export function sortAST(ast: DocumentNode): DocumentNode { | ||
return visit(ast, { | ||
const [operation] = ast.definitions.filter(isOperationDefinitionNode); | ||
const fragments = ast.definitions.filter(isFragmentDefinitionNode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think giving this a more specific name, like lexicographicSortOperations
(by analogy with lexicographicSortSchema
might be more appropriate.
// sortAST sorts most multi-child nodes alphabetically. Using this as part of | ||
// your signature calculation function may make it easier to tell the difference | ||
// between queries that are similar to each other, and if for some reason your | ||
// GraphQL client generates query strings with elements in nondeterministic | ||
// order, it can make sure the queries are treated as identical. | ||
export function sortAST(ast: DocumentNode): DocumentNode { | ||
return visit(ast, { | ||
const [operation] = ast.definitions.filter(isOperationDefinitionNode); | ||
const fragments = ast.definitions.filter(isFragmentDefinitionNode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filtering out non-ExecutableDefinitionNode
s (as you currently do) might be ok, that is also what execute
does. Alternatively, maybe we could put them all at the end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for opening this, @trevor-scheer. I think this looks great, but I've left a few comments within which might strengthen it a bit more for other possible use cases and also fully leverage the AST visitor from graphql-js
rather than doing traditional transformations.
We need to carefully pair this with a release of apollo-server-plugin-operation-registry
and apollo
CLI, but I think we can knock both of those out relatively quickly once this lands. We can land this on Monday and cut a release of apollo-graphql
pretty easily, but we need to think about the Apollo Engine portion of this change a bit since we do log these hashes — they are present in existing manifests — and I'd like to try to not invalidate those that already exist. I suspect some documents with fragments are currently executing okay under the right conditions!
// sortAST sorts most multi-child nodes alphabetically. Using this as part of | ||
// your signature calculation function may make it easier to tell the difference | ||
// between queries that are similar to each other, and if for some reason your | ||
// GraphQL client generates query strings with elements in nondeterministic | ||
// order, it can make sure the queries are treated as identical. | ||
export function sortAST(ast: DocumentNode): DocumentNode { | ||
return visit(ast, { | ||
const [operation] = ast.definitions.filter(isOperationDefinitionNode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think we should avoid doing direct filtering passes on the AST to account for Node
s (like FragmentDefinitionNode
s) which may not exist.
Also, while the typing of this sortAST
function currently accepts a DocumentNode
(right now), it should theoretically be able to handle sub-ASTs as well (GraphQL's visit
can actually handle more than just DocumentNode
), if we ever desired to do so.
I think remaining aligned with the original behavior of this sortAST
to be a direct implementation of a visitor may be still possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, totally agree! Are you recommending changes for the input type (DocumentNode
) or just pointing out that we may be able to make this more flexible in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Visitor comment resolved in d341e08.
Leaving this thread open for question above.
// sortAST sorts most multi-child nodes alphabetically. Using this as part of | ||
// your signature calculation function may make it easier to tell the difference | ||
// between queries that are similar to each other, and if for some reason your | ||
// GraphQL client generates query strings with elements in nondeterministic | ||
// order, it can make sure the queries are treated as identical. | ||
export function sortAST(ast: DocumentNode): DocumentNode { | ||
return visit(ast, { | ||
const [operation] = ast.definitions.filter(isOperationDefinitionNode); | ||
const fragments = ast.definitions.filter(isFragmentDefinitionNode); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support a different name! But I'm pretty okay with not exposing sortAST
as a public API, instead only exposing abstractions that implement it (like defaultEngineReportingSignature
in the adjacent index.ts
)
|
||
const astWithSortedDefinitions = { | ||
...ast, | ||
definitions: [operation, ...sortBy(fragments, 'name.value')], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that in order to achieve a more reliable normalized signature, we should first sort the definitions
on a DocumentNode
by their kind
property, in a similar way to how we sort SelectionSet
selections
: first by kind
, then name.value
. This would place FragmentNodes
above DocumentNodes
consistently.
I believe the only Node
without a name at that level should be an OperationNode
in which it's optional, except for when there are multiple OperationNode
s.
I tried to quickly piece together what I thought the sorting would be like in the most complex situation, and I think it'd be:
DirectiveDefinition
EnumTypeDefinitionNode
EnumTypeExtensionNode
ExecutableDefinitionNode
FragmentDefinitionNode
InputObjectTypeDefinitionNode
InputObjectTypeExtensionNode
InterfaceTypeDefinitionNode
InterfaceTypeExtensionNode
ObjectTypeDefinitionNode
ObjectTypeExtensionNode
OperationDefinitionNode
ScalarTypeDefinitionNode
ScalarTypeExtensionNode
SchemaDefinitionNode
SchemaExtensionNode
UnionTypeDefinitionNode
UnionTypeExtensionNode
It's worth noting that since we're only concerned with executable operations, based on current server implementations in which the presence of most of those Node
types would cause a validation error, I think the only sorting we'll end up with that matters is a fraction of those Nodes.
definitions: [operation, ...sortBy(fragments, 'name.value')], | ||
}; | ||
|
||
return visit(astWithSortedDefinitions, { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return visit(astWithSortedDefinitions, { | |
return visit(ast, { | |
Document( | |
node: DocumentNode | |
): DocumentNode { | |
return { | |
...node, | |
// Use sortBy because 'definitions' is not optional. | |
definitions: sortBy(node.definitions, 'kind', 'name.value'), | |
}; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is excellent. See: d341e08
I'd like your take on the iteratees that I've added.
operation
will give a nice grouping ofquery
mutation
subscription
. I don't see this adding any value to the normalization itself, but I imagine it will read much more nicely. Do we expect human eyes on these at any point? I don't know the answer to that.- The stringification for anonymous operations - after reading your explanation above about single anonymous operations only, considering this edge case may be overkill.
output: | ||
'mutation A{user{name...Bar...Baz}}' + | ||
'fragment Bar on User{asd}fragment Baz on User{jkl}', | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests are invaluable. Thank you so much for adding these!
I suspect we would see additional value (now or in the future) in a test which had multiple named operations in it. The spec states that if there is a single operation in a document, it need not be named, however if there are 2 or more, each must be named. I think it would be smart of us to try to incorporate tests which test that sorting behavior. Would that be something you'd entertain adding?
Currently, if a document contains multiple operations, the name of the operation for which single execution is desired should be passed in. The graphql/graphql-spec#375 RFC proposes another plan for the future, if you're interested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very helpful articulation of some behavior I was seeing during further implementation today.
I'll gladly add some tests specific to what you've mentioned (if I haven't already - will go back and take a look). I went a little crazy adding tests since I found TDD to be the fastest and best way to implement these changes :woo:!
different equivalent (same functions, but composed differently). This is strictly for consistency - if the composition changes, this should be reflected in the tests. The output of defaultEngineSignature is what we're most concerned with.
Update sorting logic to sort by: 1) node type (operation / fragment) 2) operation type (query / mutation / subscription) 3) operation name 4) a stringified version of the node (covers edge case of anonymous operations)
@abernix going to leave a special comment to remind us both about discussing backcompat. Happy weekend! 🎉 |
Closing this, see PR here for related work: |
The purpose of this PR is to add another level of normalization to documents when we parse and transform them.
What is missing?
Currently, query signature normalization breaks, depending on the order of operations and fragments with respect to each other within a document.
Before beginning work on this, I'd like to confirm that this is problematic in the way that I think it is. Two new test cases demonstrate an identical signature with varying test results.Related issue submitted by @justinanastos: apollographql/apollo-tooling#986
See first commit for example test cases
Implementation is relatively straightforward. We need to sort all
ExecutableDefinitionNode
s within the provided AST in a deterministic manner. This can be accomplished within thevisit
or (thanks @abernix!).TODO: