Stitching Directives #2227

Merged
merged 38 commits into from
Nov 30, 2020

yaacovCR
Collaborator

@yaacovCR yaacovCR commented Nov 17, 2020

See #2022

To do:
[x] 1. create helper functions for parsing mergeArgsExpr values that allow flexible declaration of keys
[x] 2. create helpers that transform the more generic args/key/keyArg arguments proposed by @gmac into the more generic mergeArgsExpr
[x] 3. create helpers that use @base and @computed directives to automatically expand use of a "whole" key into the required selections
[x] 4. create schema transformer for individual subschemas that validates arguments, as outlined below
[x] 5. create subschema transformer that performs same validation and also generates the transformed subschema

@changeset-bot

changeset-bot bot commented Nov 17, 2020

🦋 Changeset detected

Latest commit: b79c572

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 5 packages
Name Type
@graphql-tools/stitching-directives Patch
@graphql-tools/schema Patch
@graphql-tools/stitch Patch
@graphql-tools/utils Patch

@theguild-bot
Collaborator

theguild-bot commented Nov 17, 2020

The latest changes of this PR are not available as alpha, since there are no linked changesets for this PR.

@yaacovCR
Collaborator Author

graphql/graphql-js#2606 (comment)

We can use custom scalars as directive arguments, but there is no point in building validation into their parse functions, as directive arguments are never type-checked (!!!); see the comment linked above.

We can just look up the directive arguments on our own and validate them outside graphql. This is a larger deficiency within the current API, as referenced in the comment above, and we should not reinvent our own workaround at this point.

@yaacovCR
Collaborator Author

@gmac @Urigo @dotansimha @kamilkisiela @ardatan @tvvignesh

This is ready for a look.

To get validation working on the subschema end, you need to add two things to the makeExecutableSchema arguments when building the subschema: add the exported typeDefs SDL to the typeDefs argument, and pass the stitchingDirectivesValidator to the schemaTransform argument.

To use it on the gateway, pass the stitchingDirectivesTransform to the subschemaConfigTransformer argument within stitchSchemas.

Once this is working correctly, we can add these as default transforms...

May have time to set up a more elaborate example later over the weekend.

But it is ready to start playing around with.

I want to discuss the directive syntax. I implemented a simple default approach and the argsExpr, which allows the most complex option. @gmac's suggestions in the linked discussion are somewhere in between, and I am planning on adding those options as well.

But I want to broaden the discussion to more people to make sure that we come up with the right specification and not too many confusing options...

@yaacovCR
Collaborator Author

Also definitely open to improvements within the implementation! So, implementation and specifications are open for discussion, as is everything else in between. ;)

@yaacovCR
Collaborator Author

Note that the examples above used an argument of type _Key. That was just to save time, as the subschema can rely on the fact that the gateway will send the correct argument without any need for coercion. This could instead be specified within the subschema as a properly typed input object.

@tvvignesh
Contributor

tvvignesh commented Nov 20, 2020

@yaacovCR Thanks. I am just trying to get my head around this 😅 Is this still applicable for the latest updates you have made here?: #2022 (reply in thread)

There I see that something like @merge(args: "ids: [[$key.id]]") was used. Should I update that? Also not sure how I should update _Key as you mentioned above in this example.

I did see a test condition having this SDL (has args become argsExpr?):

scalar _Key
type Query {
   _user(key: _Key): [User] @merge(argsExpr: "key: [[$key]]")
}
type User @base(selectionSet: "{ id }") {
   id: ID
   name: String
}

I am trying to get my head around what this will look like for me, coming from Federation.

# Accounts - Federation

extend type Query {
	me: User
}

type User @key(fields: "id") {
	id: ID!
	username: String!
}


# Accounts - Stitching (As proposed in https://github.com/ardatan/graphql-tools/discussions/2022#discussioncomment-125866)

type Query {
	me: User
	_users(ids: [ID]): [User] @merge(args: "ids: [[$key.id]]")
}

type User {
	id: ID!
	username: String!
}

# Accounts - Latest Type Merging PR (Not sure)

scalar _Key
type Query {
	me: User
	_users(key: _Key): [User] @merge
}
type User @base(selectionSet: "{ id }") {
	id: ID!
	username: String!
}

@yaacovCR yaacovCR force-pushed the merge-directive-transformer branch 2 times, most recently from 8a7a028 to 15e24e7 Compare November 20, 2020 14:30
@tvvignesh
Contributor

@yaacovCR Just saw that you pushed some updates here: 15e24e7#diff-20430c9c75aa9f36d08da2f7bce0ac3e50f1be693465cd085186b981252a8394

I guess that might help. Will try going through it again.

@tvvignesh
Contributor

@yaacovCR Had a proper look at it. While this can be good to start with, to be honest I find the Federation SDL more self-explanatory (maybe due to the lack of complicated expressions). I am trying to get a similar experience in stitching (though I understand this is the first step, and I am totally fine with this as a start, since this complexity won't be exposed to the clients consuming the schema anyway).

There is only one thing which I feel is kind of complicated for me to digest in the entire SDL:

_productsByUpc(upcs: [String!]!): [Product] @merge(argsExpr: "upcs: [[$key.upc]]")

As a developer/user who is just starting with type merging, I don't understand what the directive @merge(argsExpr: "upcs: [[$key.upc]]") does. For example, these are the questions I have when I see it:

  1. What is argsExpr? While it might make sense for developers who have some context, I guess it might need a more explanatory name (or maybe I still have to go a bit deeper to understand); not sure.
  2. While I understand there is some sort of expression being used within argsExpr, I don't understand (unlike in Federation) what the intent of the expression is. For instance, in Federation I understand from the directive @key(fields: "upc") that the field is the key for the type. I am trying to get a similar understanding here.
  3. My next question was: what is $key here? I guess you have converted key: ({ upc }) => upc to this, if I am not wrong, but as a person going through the SDL, I don't understand where $key comes from.
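One way to read the expression, offered as a rough sketch and not as the PR's actual parser: $key stands for the key object the gateway derives from each partial Product it already holds, and the double brackets [[ ... ]] collect the chosen value from all pending keys into a single list argument. The helper expandArgsExpr and the Key type below are hypothetical names made up for this illustration (they are not graphql-tools exports), and the helper only understands the single shape `argName: [[$key.field]]`.

```typescript
// Hypothetical illustration only: a tiny interpreter for expressions of the
// exact shape `argName: [[$key.field]]`. The real parser is more general.
type Key = Record<string, unknown>;

function expandArgsExpr(argsExpr: string, keys: Key[]): Record<string, unknown[]> {
  const match = /^\s*(\w+)\s*:\s*\[\[\s*\$key\.(\w+)\s*\]\]\s*$/.exec(argsExpr);
  if (match === null) {
    throw new Error(`Unsupported expression in this sketch: ${argsExpr}`);
  }
  const argName = String(match[1]);
  const keyField = String(match[2]);
  // One list argument holding the chosen field from every pending key.
  return { [argName]: keys.map((key) => key[keyField]) };
}

// Keys selected from two partial Product objects the gateway already holds.
const productKeys: Key[] = [{ upc: "1" }, { upc: "2" }];
const batchedArgs = expandArgsExpr("upcs: [[$key.upc]]", productKeys);
console.log(JSON.stringify(batchedArgs)); // {"upcs":["1","2"]}
```

Read this way, the directive is the SDL counterpart of a static config that picks a key per object (key: ({ upc }) => upc) and then turns the batch of keys into arguments for _productsByUpc.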

@tvvignesh
Contributor

@yaacovCR Kindly ignore my suggestions if they are not valid. I guess @gmac or someone who has worked on Type Merging can give better feedback. But this was what I felt as a beginner with Type Merging. Thanks again. Will try it out now.

@tvvignesh
Contributor

tvvignesh commented Nov 20, 2020

Guys, some honest feedback again here. Kindly excuse me cause I know I am sharing this without helping you guys in any way.

I spent almost a day trying to understand Type Merging (not this PR but the TS implementation itself as documented here: https://www.graphql-tools.com/docs/stitch-type-merging) and I should honestly say that it's very difficult for beginners like me to understand what is happening, even when it is documented well. The rest of GraphQL Tools and the Guild stack is easy to understand. I am leaving this feedback here because, if this is to get mainstream usage in the community, it might need a good-quality video tutorial or blog posts or something like that, since it takes a very different approach to merging schemas compared with normal stitching. I can help, but the problem is that I am confused by it myself 😂

After reading the same page multiple times, I have the feeling of understanding it but not understanding it 😅 I hope you got what I mean.

Also, there are a few confusing parts. If you see this, selectionSet has { id } but to a beginner like me, I had to see it a lot of times to understand if it is the id of the Post or the User since both are being merged.

import { stitchSchemas } from '@graphql-tools/stitch';

const gatewaySchema = stitchSchemas({
  subschemas: [
    {
      schema: postsSchema,
      merge: {
        User: {
          fieldName: 'postUserById',
          selectionSet: '{ id }',
          args: (originalObject) => ({ id: originalObject.id }),
        }
      }
    },
    {
      schema: usersSchema,
      merge: {
        User: {
          fieldName: 'userById',
          selectionSet: '{ id }',
          args: (originalObject) => ({ id: originalObject.id }),
        }
      }
    },
  ],
  mergeTypes: true // << optional in v7
});

Also, I feel most of the confusion would be reduced if better naming convention is chosen. For eg. the name originalObject was confusing. Moreover, the keys might need a better name. For instance, fieldName can mean anything to be honest and I had to read this to understand properly: fieldName specifies a root field used to request the local type
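To make the roles of fieldName, selectionSet and args in the config above more concrete, here is a small self-contained sketch of how a follow-up query could be assembled from them. It is an illustration under assumptions, not the gateway's real delegation code: MergeConfig and buildMergeQuery are hypothetical names, and the query is built as plain text only to show which piece goes where. The selectionSet ('{ id }') is what guarantees that id is present on whichever object arrives first, so that args can read it.

```typescript
// Hypothetical, simplified model of one entry in the `merge` config above.
interface MergeConfig {
  fieldName: string;
  selectionSet: string;
  args: (originalObject: Record<string, unknown>) => Record<string, unknown>;
}

const postsUserConfig: MergeConfig = {
  fieldName: "postUserById",
  selectionSet: "{ id }",
  args: (originalObject) => ({ id: originalObject.id }),
};

// Build the text of the follow-up query for the fields that are still missing.
function buildMergeQuery(
  config: MergeConfig,
  originalObject: Record<string, unknown>,
  missingFields: string,
): string {
  const builtArgs = config.args(originalObject);
  const argList = Object.keys(builtArgs)
    .map((argName) => `${argName}: ${JSON.stringify(builtArgs[argName])}`)
    .join(", ");
  return `{ ${config.fieldName}(${argList}) ${missingFields} }`;
}

// The users service answered first, so its result is the original object.
const originalObject = { id: "1", email: "ada@example.com" };
console.log(buildMergeQuery(postsUserConfig, originalObject, "{ posts { id message } }"));
// { postUserById(id: "1") { posts { id message } } }
```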

The reason it feels probably difficult is cause a lot of things are stacking together:

  1. Confusing naming conventions
  2. No tutorials/guides/examples anywhere in the web to refer to since this is very new and I am not sure how many people use it
  3. Difficulty in understanding how to translate from Federation to Type Merging even though I have tried to write the SDL many times. Some parallel example showing both federation and type merging can help for people who transition from federation to type merging. @yaacovCR - you helped me a lot out here and I am using the tests you wrote to get things going so far, but still somehow it feels like I am not comfortable yet transitioning from Federation to Type Merging - Not sure why.
  4. And finally, difficulty understanding this PR (which is the cause of not understanding the JS implementation)

I don't know if any of these can be changed since it would lead to breaking changes I guess, but still I thought I should let you know. This is the final piece of the puzzle I am yet to solve with GraphQL and the Guild stack. Have handled the rest on my end.

PS: I will try again tomorrow and update this thread if I manage to understand it. And if I do manage to understand this, I will make a tutorial for this right away myself. Thanks.

@gmac
Contributor

gmac commented Nov 21, 2020

@tvvignesh thanks for the feedback, that’s all reasonable stuff. A lot of us on stitching started using it many years ago when it was Apollo Stitching, and it was an utter mess; however, that background makes type merging pretty intuitive. I totally hear you that starting with zero background makes stitching pretty opaque, though I don’t think it’s any worse than coming fresh to Federation (which I did earlier this year).

All that is to say, a few points:

  1. I don’t think the issue here is the naming; it’s actually quite good and explicit these days, so the lib probably won’t have those sorts of refactors.

  2. The issue you do clearly articulate here is a failure to communicate the stitching paradigm, or, what it “does”. If we can communicate that effectively, you may find the naming a lot more intelligible. I do have long-term ambitions to build a tutorial repo, but that doesn’t help you in the short term. The best thing I can suggest today is that the code examples are executable, so you should be able to copy and paste from the docs and pretty much run them.

Sorry that’s not super helpful, and your feedback is all very valid, so thanks for sharing.

@tvvignesh
Contributor

@gmac Thanks for your response. Actually, I did use stitching before, with Apollo but without microservices, where it was as simple as calling mergeSchemas and passing in all the GQL schemas. Back then I was just stitching schemas, not delegating operations with delegateToSchema. Later I moved to Federation with microservices and got everything working there, and now I am back to stitching 😅

I understand what you are saying. @yaacovCR is doing an amazing job actively maintaining the lib. I probably just have to get the basics right to work on this. Will try again. I plan to first get properly settled with the JS implementation, and once I understand it, I will come back and try out the directives as well.

@gmac
Contributor

gmac commented Nov 21, 2020

Ah, if you weren’t doing manual delegation before, then you weren’t using all of stitching’s features. If you have no crossover type associations, then everything in modern stitching is still as simple as it was before, and you really don’t need anything beyond the “combining multiple schemas” page. If you’ve moved up to crossover types in Federation, then yeah, merging features now apply to what you’re doing. The good news, though, is that if you understand how Federation works under the hood, then you may be surprised to realize how fundamentally simpler merging is; albeit, you have to write the config yourself, versus Federation, which sets up a mess of complexity for you to make extremely basic things work. At the end of the day, merging is kind of a two-step process:

  1. Build self-contained schemas that are independently valid. Foreign keys are just represented as local versions of a remote type with nothing but an id. Each version of a type across services should have one overlapping field that connects them, normally “id”.
  2. For types with the same name in different services, setup queries so that the gateway can fetch each version of the type and combine them. The selectionSet always picks the shared key from the type. That way whichever service provides an instance of the type first (ie, the originalObject), the shared key can be read from it and used to query for additional versions of the type from other services.
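The two steps above can be pictured with a minimal sketch, assuming made-up data: each service returns its own partial version of User, every version carries the shared id, and the gateway combines them. PartialUser, mergeVersions and the sample records are hypothetical names for illustration and not part of graphql-tools.

```typescript
// Hypothetical partial records, as two independent services might return them.
interface PartialUser {
  id: string;
  [field: string]: unknown;
}

const fromUsersService: PartialUser = { id: "1", email: "ada@example.com" };
const fromPostsService: PartialUser = { id: "1", posts: [{ id: "p1", message: "hello" }] };

// Combine partial versions of one type; every version must carry the shared key.
function mergeVersions(original: PartialUser, ...others: PartialUser[]): PartialUser {
  return others.reduce<PartialUser>((merged, next) => {
    if (next.id !== merged.id) {
      throw new Error(`Key mismatch: ${next.id} vs ${merged.id}`);
    }
    return { ...merged, ...next };
  }, original);
}

const mergedUser = mergeVersions(fromUsersService, fromPostsService);
console.log(JSON.stringify(Object.keys(mergedUser))); // ["id","email","posts"]
```

The key check is the important part of the sketch: without one overlapping field there is nothing to tell the gateway that two partial objects describe the same entity.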

@gmac
Contributor

gmac commented Nov 22, 2020

@tvvignesh – you inspired me to draw a picture to go with the "basic example" on the Type Merging docs page... does this help explain what's going on...?

[Diagram: stitching-flow]

@tvvignesh
Contributor

tvvignesh commented Nov 22, 2020

Thanks a ton @gmac. This really helps a lot. I used your diagram to jot down my inferences as I see them. Kindly correct me if I am wrong. A diagram like this, with an accompanying explainer, can help a lot in the docs.

Inferences (Marked against step nos. as in the diagram):

  1. The user makes the query where he is trying to get the basic user info by userID and also the list of posts made by the user. As we can see the query spans across multiple services with their own subschemas (Users and Posts)

  2. Based on the query, the gateway knows the portion of the query which has to be sent to the user service and the portion of the query which has to be sent to the posts service and hence sends it across.


Questions here:

  • I see the field id also being sent in the query. How does gateway know that it has to send it? Is this the job of selectionSet from the userSchema? To tell the gateway what additional fields to select when querying the user subschema?

  3. Once the query reaches the subservice, it executes the GraphQL query in isolation, based on the inputs provided. In this case, it just tries to retrieve the user info from the service and responds with the user information, including the selectionSet field id.

  4. Now, according to the requested query as seen in the first step, the field posts belongs to the Posts service. So, the gateway tries to use the information to query the post service.


Questions here:

  • How does the gateway know that it has to do the postUserById query to get the information from post service? Is this the role of fieldName? i.e. to connect User and Post

  • Again I see id being queried here along with the posts when sending to the post service. Is this the job of selectionSet: '{ id }' from the postsSchema? To say that id should also be selected when the User type is queried from postsSchema? (The example might need a renaming here to use userID and postID instead of id for both, just to give a clear idea of what is being selected, queried and so on. In this case, I guess you are still talking about the userID in the selectionSet of postsSchema.)

  • Does this query execution happen in parallel or serial with the query sent to user service? Cause, if I understand it right, the post service/subschema does not need anything else except user ID which is already provided in the query itself and thus need not wait for response from user service. If its parallel, it might need a corresponding indication in the diagram.

The field posts is of Post type as contributed by the Post service. And to resolve it, it needs to run a query. fieldName under the User type specified during stitching says that when the field posts is queried from the posts schema, execute the postUserById query and args: (originalObject) => ({ id: originalObject.id }) suggest the args to be passed to the query to execute the query successfully where originalObject is nothing but the User object here - does this suggest that the execution can happen only in series since userID can be retrieved only after user service returns the result, but can't the userID be picked up from the initial query itself rather than having to wait for the user service to respond?


  5. The posts service executes its part of the GraphQL query and returns the response with the selectionSet field id.

  6. The final response from both services is returned to the gateway, which merges the response objects, returning only the fields requested in the original query.


Some other questions:

  • What happens to the types returned by mutations? Cause if I understand right, it has to go to a specific service to perform the mutation, get the response in the gateway, and now, the gateway has to resolve it again if the response contains types from other services following similar cycle (yet to think how its going to look like)

The reason I guess Federation feels easier for me is that it uses __resolveReference for resolving the federated types, whose only job is to resolve the reference without worrying about anything else. So, it's also easier to code (for example, it's as simple as doing getPostByID or getUserByID).

But after seeing this diagram, I can get some real clarity into the execution phase of GQL. And as far as the stitching phase goes, I feel that some inline comments at every step to the stitchSchemas example can really be of huge help in understanding.

Sorry for the long text. Just wanted to take you all through my thoughts. Thanks again.

CC: @yaacovCR

@yaacovCR
Collaborator Author

The answers to your questions about selectionSet and fieldName are yes.

Access to subschemas is serial in the example above, as depicted. Although types are not owned by services in our model (as opposed to Federation), root fields (and regular fields) are. So the first subservice entry must occur initially, separate from the rest, but the rest can occur in parallel. This, I believe, is similar to Federation. You can, however, add a local root field on the gateway so that the initial entry happens on the gateway, with all remaining subservices in parallel. That, I believe, is unique to our model.

I am not sure why you think on the subservices level easier to code with resolveReference. In our model, each subschema is just a regular GraphQL server.

Thanks for joining the discussion!

@gmac
Contributor

gmac commented Nov 22, 2020

@tvvignesh I’m still not entirely happy with this diagram, and will be expanding it with a step in the current 3.5 position that explains how fieldName + originalObject are used to generate the query in step 4.

Then to your question of why originalObject is needed before performing subsequent queries in step 4: that makes the bold assumption that the inputs are the same. The posts service could very well require one or more fields that are not the same as the id in the initial query; the fact that the initial query has enough information to make the second query is just luck.

Then @yaacovCR - I’m surprised that you’re saying merger requests are parallel after the first original object. I was under the impression that it would be serial, and that steps 3.5, 4, and 5 would repeat sequentially for each subservice with originalObject growing each time as the aggregate of mergers... thus, a service could require an aggregate of keys from two previous services (versus federation which can only utilize keys from the origin service)

@yaacovCR
Collaborator Author

It should be parallel if no additional inputs are required.

Say you have resolved the user from an account service and now you want to grab additional details about the user from the user service, the post service, the billing service, and whatever other services you have, and all you need is the ID that you now have from the account service. Those should all be in parallel. If one of those services requires additional fields that were not resolvable from the account service, an additional round will be triggered.
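The round-based behaviour described here can be sketched with a toy planner. This is only a model of the idea under simplifying assumptions, not the library's delegation planner: ServicePlan, planRounds and the field names are hypothetical. Every service whose required fields are already available joins the current round (and could be requested in parallel); services that need fields produced by that round wait for the next one.

```typescript
// Hypothetical model: each service needs some fields before it can be queried,
// and contributes new fields once it has been queried.
interface ServicePlan {
  name: string;
  requires: string[];
  provides: string[];
}

// Group services into rounds; everything within one round can run in parallel.
function planRounds(initialFields: string[], services: ServicePlan[]): string[][] {
  const available = new Set<string>(initialFields);
  let pending = services.slice();
  const rounds: string[][] = [];
  while (pending.length > 0) {
    const ready = pending.filter((s) => s.requires.every((f) => available.has(f)));
    if (ready.length === 0) {
      throw new Error("Remaining services have unsatisfiable requirements");
    }
    rounds.push(ready.map((s) => s.name));
    ready.forEach((s) => s.provides.forEach((f) => available.add(f)));
    pending = pending.filter((s) => ready.indexOf(s) === -1);
  }
  return rounds;
}

// The account service resolved the original object, so only `id` is known at first.
const plannedRounds = planRounds(["id"], [
  { name: "users", requires: ["id"], provides: ["email"] },
  { name: "posts", requires: ["id"], provides: ["posts"] },
  { name: "billing", requires: ["id", "email"], provides: ["invoices"] },
]);
console.log(JSON.stringify(plannedRounds)); // [["users","posts"],["billing"]]
```

In this made-up setup, billing lands in a second round only because it needs email, which the account service did not supply.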

@gmac
Contributor

gmac commented Nov 22, 2020

Super smart! I love the algorithm design. Just a crazy idea to the point @tvvignesh raised earlier: could initial query arguments seed original object to potentially advance the process one step ahead, when possible? It’s not uncommon that the IDs will match. I suppose it does conflate an input field with an object field, which are definitely not the same, although it could be a manual option to say initial arguments work as the original object... just thinking out loud.

@yaacovCR
Collaborator Author

Yes, as above, should work transparently if you add a resolver on gateway itself that returns the stub.
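A minimal sketch of what such a gateway-level stub resolver might look like, assuming a root field named user that takes an id argument (both are assumptions for illustration; how the resolver map is wired into the gateway schema is not shown). The resolver returns only the key, so the stub can play the role of the original object without an initial subservice request.

```typescript
// Hypothetical gateway-local resolver map; `user` is an assumed root field name.
interface UserStub {
  id: string;
}

const gatewayResolvers = {
  Query: {
    // Returns only the key, without calling any subservice. The stub then
    // stands in for the original object during the merge step.
    user: (_root: unknown, args: { id: string }): UserStub => ({ id: args.id }),
  },
};

const userStub = gatewayResolvers.Query.user(undefined, { id: "42" });
console.log(JSON.stringify(userStub)); // {"id":"42"}
```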

@gmac
Contributor

gmac commented Nov 22, 2020 via email

@tvvignesh
Contributor

I am not sure why you think on the subservices level easier to code with resolveReference. In our model, each subschema is just a regular GraphQL server.

True. And that's exactly what I like about stitching. The only reason I feel __resolveReference was easy is that I did not have to do anything else in Federation except add the Federation directives and __resolveReference wherever I need the types to be available on another service. Even writing the function was easy, because it just takes a reference like an ID and I have to return the resolved type, which was fairly self-explanatory for me. Everything else is handled by Federation itself.
But this is true for stitching as well, I guess. It's just that, unlike Federation, which abstracts away implementation details like fieldName, selectionSet etc., stitching exposes all these implementation details to the user. That means coding the stitching logic itself, which I don't do in Federation. Maybe that's why it felt easier.

I’m still not entirely happy with this diagram, and will be expanding it with a step in the current 3.5 position that explains how fieldName + originalObject are used to generate the query in step 4.

@gmac No issues. This is definitely a great start. Diagrams like this definitely help in making people understand. Its just that since there are 2 phases in stitching - The phase where you stitch everything together based on various parameters, and then the execution phase itself - I am not sure if it will be easy for you to represent both in the same diagram.

@yaacovCR While I get your point regarding parallel queries to some extent, is there some way you would recommend to write queries to establish maximum parallelism with stitching? Or is it something which I should not worry about much and rely on the gateway to do the job for me? To be honest, this is not my concern at this point since I am just starting off, but thought I would understand this so that I can reduce the roundtrips made between the services and gateway.

Thanks again.

@gmac
Contributor

gmac commented Nov 22, 2020

@tvvignesh — you don’t need to work too hard to maximize parallelism. It mostly just works efficiently these days. The exchange Yaacov and I had above is getting into power-user optimizations. I wouldn’t worry about trying to short-cut request cycles until you know exactly how everything works and you are doing fine performance tuning on your stack.

Regarding “phases”, I’ll also add an expression of that into the diagram as well. I think you may have oversimplified a bit in your description. There’s really three phases: 1) the original object phase where an initial representation is resolved, 2) the merger phase where all merged objects are resolved in parallel based on the original object (this step is repeatable), and 3) the final return phase where the original object and all mergers are returned as the final response.

I appreciate that you’re trying to understand the stack top to bottom, although don’t overlook the value of just building on code samples either. We’re getting into a lot of minutiae here whose full internals you don’t actually need to know in order to use the library.

@@ -0,0 +1,32 @@
{
"name": "@graphql-tools/type-merging-directives",
Contributor

@gmac gmac Nov 22, 2020

Does this mean that “@computed” will also go through the extra package, then? I actually like this from the doc standpoint. The main type merging page can just focus on static config with links over to an SDL page explaining how each static config translates to SDL.

@gmac
Contributor

gmac commented Nov 23, 2020

@yaacovCR – is this revision of the diagram accurate to your eye? The idea is that steps 4-7 are a repeatable sequence for each type merger beyond the original object. If this looks sufficiently accurate, I'll write up some supporting narrative and include it in the merging docs.

[Diagram: stitching-flow, revised]

@tvvignesh – more insightful?

@yaacovCR
Collaborator Author

yaacovCR commented Nov 23, 2020 via email

@tvvignesh
Contributor

you don’t need to work too hard to maximize parallelism. It mostly just works efficiently these days. The exchange Yaacov and I had above is getting into power-user optimizations. I wouldn’t worry about trying to short-cut request cycles until you know exactly how everything works and you are doing fine performance tuning on your stack.

Sure. Thanks for the clarity.

more insightful?

Definitely. This is even more clear. This diagram definitely clarifies the basics very well. I also tried extending the schema adding some more types like this and tried using the same diagram to validate the same and that works well too:

# User service
type User {
   id: ID!
   email: String!
   contactInfo: [Contact]
}

# Contact service
type Contact {
  phone: String
  address: String
}

# Post service
type Post {
  id: ID!
  message: String!
  author: User!
  comments: [Comment]
}

type Comment {
  commentID: ID!
  comment: String
}

One more thing which I was trying to understand while doing this is how typemerging works with different types in the same microservice vs different types in different microservices. For eg. in the above SDL, both Post and Comment types lie in the same Post microservice but User and Contact types are in different microservices. Does typemerging treat both cases as the same or does it treat them differently?

Thanks.

…th unqualified and qualified keys

note that you can use whatever you want as the initial variable name that begins in this examples as $key
type-merging-directives => stitching-directives
cannot use schema.getImplementations yet
this works for complex keys formatted just like the selection set, the default

todo:
[ ] add keyField argument to pick a simple key out of the selection set
[ ] add key argument to build a custom complex key in any pattern
when computed fields are used, the key is supplemented with additional selection sets. but base is too non-descriptive....
@yaacovCR yaacovCR force-pushed the merge-directive-transformer branch from 7117b46 to 45ef617 Compare November 26, 2020 17:47
@yaacovCR
Collaborator Author

yaacovCR commented Nov 26, 2020

Ok, pretty much done.

For integration test, comparison with Federation, see: https://github.com/ardatan/graphql-tools/blob/45ef6175f65e9930038a4e9eb9583e1e96c9bd39/packages/stitch/tests/typeMergingWithDirectives.test.ts

Note the pretty straightforward API with use of keyField and keyArg arguments able to offer quite a great deal of customization. You can use multiple arguments like keyField/key, keyArg, additionalArgs, or you can set up your own single argsExpr in which all the arguments are rolled into one string with new operators.

IMPORTANT: Most users will probably require zero arguments, as they can just rely on the default settings, also demonstrated in that example, or just use the keyField argument for simple keys.

@tvvignesh
Contributor

@yaacovCR Just went through it. This is much cleaner and more descriptive than how it started off 👍

I don't see @graphql-tools/stitching-directives in npm. Not published yet right?

@yaacovCR
Collaborator Author

@ardatan @kamilkisiela I think canary release failed, did I break something?

@yaacovCR yaacovCR merged commit b48a91b into master Nov 30, 2020
@yaacovCR yaacovCR deleted the merge-directive-transformer branch November 30, 2020 13:37
5 participants