[SwiftBindings] Binding process doc #2779
base: feature/swift-bindings
Nice high-level description!
src/docs/process-binding.md
Outdated
1. Generate and consume the abi.json file
2. Extract and demangle the symbols from the binary
3. Iterate over every type and function and gnerate C# and/or supporting Swift code
3. Iterate over every type and function and gnerate C# and/or supporting Swift code
3. Iterate over every type and function and generate C# and/or supporting Swift code
src/docs/process-binding.md
Outdated
1. Generate and consume the abi.json file
2. Extract and demangle the symbols from the binary
3. Iterate over every type and function and gnerate C# and/or supporting Swift code
4. Generate type datatbase entries
4. Generate type datatbase entries
4. Generate type database entries
How do you envision the type database structure and usage? In the context of this PR, are the handlers supposed to populate the database as they process the entity?
I envision there likely being two type databases. The first is a bind-time database, which would need maximal information about the types: Swift module, Swift name, C# namespace, C# name, entity type, blittability. But this is something that will be in its own document. The database would be populated with new entries after the abi.json file has been parsed but before binding. Older entries can be read in before the abi.json file is read. The second database is run-time. There, we should do anything we can to minimize the size and start-up overhead. For the most part, the run-time database is there to support the generic programming model, but again, this would be its own document.
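As a rough illustration of the bind-time database described in this comment, here is a minimal sketch of one entry and a lookup table. Everything in it is an assumption made for illustration: the names `TypeRecord`, `EntityKind`, and `BindTimeTypeDatabase` do not come from the PR, the sample `Int` to `nint` entry is only an example, and the real tooling would presumably be written in C# rather than Swift.

```swift
// Hypothetical shape of a bind-time entry; the fields mirror the list in the
// comment above (Swift module, Swift name, C# namespace, C# name, entity type,
// blittability). None of these names come from the actual project.
enum EntityKind: String {
    case structType = "struct", classType = "class", enumType = "enum", protocolType = "protocol"
}

struct TypeRecord {
    let swiftModule: String
    let swiftName: String
    let csharpNamespace: String
    let csharpName: String
    let kind: EntityKind
    let isBlittable: Bool
}

struct BindTimeTypeDatabase {
    private var records: [String: TypeRecord] = [:]

    // Key on the qualified Swift name so an entity read from abi.json can be
    // looked up directly.
    private static func key(_ module: String, _ name: String) -> String {
        module + "." + name
    }

    // Older entries can be added before abi.json is read, new ones after parsing.
    mutating func add(_ record: TypeRecord) {
        records[Self.key(record.swiftModule, record.swiftName)] = record
    }

    func lookup(module: String, name: String) -> TypeRecord? {
        records[Self.key(module, name)]
    }
}

var database = BindTimeTypeDatabase()
database.add(TypeRecord(swiftModule: "Swift", swiftName: "Int",
                        csharpNamespace: "System", csharpName: "nint",
                        kind: .structType, isBlittable: true))
```

A run-time database would likely keep far fewer fields than this; the sketch only covers the bind-time side.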
src/docs/process-binding.md
Outdated
The complexities fall into several broad categories:
- Type and member naming
- Multiple types being defined in multiple languages concurrently
- Marhshaling handled differently based on the type of the parameter and the type of the function
- Marhshaling handled differently based on the type of the parameter and the type of the function
- Marshaling handled differently based on the type of the parameter and the type of the function
src/docs/process-binding.md
Outdated
In theory, the process of binding a Swift binary into C# should be as simple as:
1. Generate and consume the abi.json file
I would like to see the next level of detail here, similar to https://github.com/dotnet/runtimelab/tree/feature/swift-bindings/docs#functional-outline. It is not clear how dependencies are resolved.
src/docs/process-binding.md
Outdated
In theory, the process of binding a Swift binary into C# should be as simple as:
1. Generate and consume the abi.json file
2. Extract and demangle the symbols from the binary
Do we want to consider this step as optional? For simpler bindings, this may not be necessary. Also, users might not know how to retrieve it from a framework.
I'm not sure about the UX of making it optional. The user will run the code without this step and then get a failure?
I figured that since we need to know the path to the dylib to get the abi.json, we're already there. Since we will need metadata accessors, we need the demangling there.
All of this makes the process of generating code challenging.
The complexities fall into several broad categories:
I think it's important to reflect on the components of our tooling. While having everything in a single project might seem easier, splitting them into multiple projects would force cleaner integration and make testing more granular.
src/docs/process-binding.md
Outdated
- Implicit arguments
- Versioning based on the `@available` attribute.
For this reason I strongly recommend using code-generation tools that can work in a non-linear fashion. There are several ways to achieve this, but I would strongly recommend using the Dynamo framework from Binding Tools for Swift as it can handle both C# and Swift and generates non-linearly. In addition, it shouldn't be a stretch to generate code in parallel on type boundaries.
I would start by identifying limitations with the string-based emitter. Based on those, we can add a thin model layer if needed. I think that the emitter itself shouldn't handle marshalling; that should be done before emitting.
I am all for reusing code which already works. However, I think that two things would be useful first:
- Describing the limitations as Milos noted
- Describing what Dynamo is, and how it will solve those limitations
Will do.
Because of this, I think we should adopt a strategy and factory pattern for handlers at various levels.
The general pattern would work like this:
I like this approach with custom handlers :)
public func generateAClass(name: String) -> SomeClass { }
```
The process would look at this and identify this as a top-level function and will select a handler factory for it.
The handler will create a context for the object which would include a class for the top-level object to live in (C# doesn't have top-level functions) and a class to hold top-level pinvokes and a function generation context which would include a place to place function argument declarations, generic declarations, function argument pre-marshaling code, pinvoke argument declarations, pinvoke argument expressions, post-marshaling code, return type declaration, and a return expression.
We should be able to hold most of the information in declarations and the registrar.
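To make the function generation context from the quoted text more concrete, here is a minimal sketch of a context whose slots can be filled in any order and are only assembled into text at the end, which is roughly what non-linear generation means here. The type `FunctionContext`, its property names, and the pinvoke name `PInvoke_GenerateAClass` are all hypothetical; the emitted C# is simplified, and this is not how Dynamo or the existing string-based emitter is implemented.

```swift
// Hypothetical context: one slot per item named in the doc (argument
// declarations, pre-marshal code, pinvoke argument expressions,
// post-marshal code, return type, return expression).
struct FunctionContext {
    var csharpName = ""
    var returnType = "void"
    var argumentDeclarations: [String] = []
    var preMarshalCode: [String] = []
    var pinvokeArguments: [String] = []
    var postMarshalCode: [String] = []
    var returnExpression: String? = nil

    // Assemble the slots into a C# method only once everything is known.
    func render(pinvokeName: String) -> String {
        let parameters = argumentDeclarations.joined(separator: ", ")
        let call = pinvokeName + "(" + pinvokeArguments.joined(separator: ", ") + ")"
        var lines = ["public static \(returnType) \(csharpName)(\(parameters))", "{"]
        lines += preMarshalCode.map { "    " + $0 }
        lines.append(returnExpression == nil ? "    \(call);" : "    var result = \(call);")
        lines += postMarshalCode.map { "    " + $0 }
        if let expression = returnExpression {
            lines.append("    return \(expression);")
        }
        lines.append("}")
        return lines.joined(separator: "\n")
    }
}

// The return slot is filled before the arguments to show that order does not matter.
var context = FunctionContext()
context.csharpName = "GenerateAClass"
context.returnExpression = "result"   // simplified; the doc routes this through a registry
context.returnType = "SomeClass"
context.argumentDeclarations.append("string name")
context.pinvokeArguments.append("name")
let rendered = context.render(pinvokeName: "PInvoke_GenerateAClass")
print(rendered)
```

Whether these slots live in a dedicated context object or in declarations plus the registrar, as suggested in the comment above, is a design choice this sketch does not settle.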
src/docs/process-binding.md
Outdated
The handler will create a context for the object which would include a class for the top-level object to live in (C# doesn't have top-level functions) and a class to hold top-level pinvokes and a function generation context which would include a place to place function argument declarations, generic declarations, function argument pre-marshaling code, pinvoke argument declarations, pinvoke argument expressions, post-marshaling code, return type declaration, and a return expression.
The handler will execute a step to name the function and the associated pinvoke, including the entry point and library.
Then for each argument, it will gather information about each argument and from the function handler get a factory to build an argument handler for type `String`. This will in turn name the argument, generate the C# type and add it to the C# argument declaration. It will define the argument type for the pinvoke and add it to the C# pinvoke argument list. If needed, it will generate premarshal code and add it to the premarshal list and post marshal code, and finally an expression for calling the pinvoke.
Then for each argument, it will gather information about each argument and from the function handler get a factory to build an argument handler for type `String`. This will in turn name the argument, generate the C# type and add it to the C# argument declaration. It will define the argument type for the pinvoke and add it to the C# pinvoke argument list. If needed, it will generate premarshal code and add it to the premarshal list and post marshal code, and finally an expression for calling the pinvoke.
Then for each argument, it will gather the necessary information and from the function handler get a factory to build an argument handler for type `String`. This will in turn name the argument, generate the C# type and add it to the C# argument declaration. It will define the argument type for the pinvoke and add it to the C# pinvoke argument list. If needed, it will generate premarshal code and add it to the premarshal list and post marshal code, and finally an expression for calling the pinvoke.
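The per-argument step described here (ask a factory for a handler matching the argument's type, then let the handler supply the C# type, the pinvoke type, and any pre-marshal code) could look roughly like the sketch below. It is only an illustration of a strategy-plus-factory arrangement: `ArgumentHandler`, `ArgumentHandlerFactory`, and the C# wrapper type `SwiftString` in the emitted text are hypothetical names, not APIs from the PR.

```swift
// Strategy: each handler knows how one kind of argument is declared and marshaled.
protocol ArgumentHandler {
    var csharpType: String { get }
    var pinvokeType: String { get }
    func preMarshal(argument name: String) -> String?
    func pinvokeExpression(argument name: String) -> String
}

// Blittable types pass straight through with no marshaling code.
struct BlittableArgumentHandler: ArgumentHandler {
    let csharpType: String
    var pinvokeType: String { csharpType }
    func preMarshal(argument name: String) -> String? { nil }
    func pinvokeExpression(argument name: String) -> String { name }
}

// Strings need a conversion before the call; `SwiftString` is a hypothetical C# type.
struct StringArgumentHandler: ArgumentHandler {
    let csharpType = "string"
    let pinvokeType = "SwiftString"
    func preMarshal(argument name: String) -> String? {
        "var \(name)Swift = new SwiftString(\(name));"
    }
    func pinvokeExpression(argument name: String) -> String { "\(name)Swift" }
}

// Factory: selects a handler from the Swift type name found in abi.json.
struct ArgumentHandlerFactory {
    func makeHandler(forSwiftType swiftType: String) -> ArgumentHandler? {
        switch swiftType {
        case "String": return StringArgumentHandler()
        case "Int": return BlittableArgumentHandler(csharpType: "nint")
        case "Double": return BlittableArgumentHandler(csharpType: "double")
        default: return nil   // unknown types would need their own handlers
        }
    }
}

let factory = ArgumentHandlerFactory()
if let handler = factory.makeHandler(forSwiftType: "String") {
    print(handler.preMarshal(argument: "name") ?? "// no pre-marshal code needed")
}
```

In this arrangement the handler only produces fragments; placing them into the function's argument, premarshal, and pinvoke lists stays with the function handler.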
All of this makes the process of generating code challenging.
The complexities fall into several broad categories:
Do those complexities impose additional steps over the five you described, or do they just make the steps more complicated?
src/docs/process-binding.md
Outdated
2. Aggregate information about that entity
3. Select a factory to create a handler for that entity
4. The handler will generate a context object for handling that entity
5. Execute a series of steps through the handler that will do work apropriate for each step.
NIT, same on the line below :)
5. Execute a series of steps through the handler that will do work apropriate for each step.
5. Execute a series of steps through the handler that will do work apropriate for each step
src/docs/process-binding.md
Outdated
The handler will create a context for the object which would include a class for the top-level object to live in (C# doesn't have top-level functions) and a class to hold top-level pinvokes and a function generation context which would include a place to place function argument declarations, generic declarations, function argument pre-marshaling code, pinvoke argument declarations, pinvoke argument expressions, post-marshaling code, return type declaration, and a return expression.
The handler will execute a step to name the function and the associated pinvoke, including the entry point and library.
Then for each argument, it will gather information about each argument and from the function handler get a factory to build an argument handler for type `String`. This will in turn name the argument, generate the C# type and add it to the C# argument declaration. It will define the argument type for the pinvoke and add it to the C# pinvoke argument list. If needed, it will generate premarshal code and add it to the premarshal list and post marshal code, and finally an expression for calling the pinvoke.
I got a bit lost I think. So the `handler` doing all the job will be the `Top Level functions handler`? And then the `Top Level function handler` will call `function handler` to get a `factory` which will build an `argument handler`?
A similar process will be done for handling the return type and value. In this case, the pinvoke return type will be a `NativeHandle` and it will be used in conjunction with a registry to either retrieve an already existing C# object that is bound to that handle or it will build one through a factory.
After all this is done, the function handler will finish up by aggregating all the information, writing the C# method and writing the C# pinvoke.
In this example we do not need to generate Swift code, but what about cases where we will need to? Will it happen here as well?
It makes sense to do so. The reason is that in any case where marshaling is not 1:1 with C# capabilities, we will need to change the way that parameters are handled. I'll expound on this more in the docs because it will make it clear why non-linear code writing will make tasks much easier.
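For the return-value handling quoted earlier in this thread (a `NativeHandle` used with a registry that either returns the object already bound to that handle or builds one through a factory), a minimal sketch of the lookup-or-create behavior follows. The shape of `NativeHandle`, the `BoundObject` class, and the `Registry` API are assumptions for illustration only; the real run-time piece would live in C#, and would probably hold weak references so that bound objects can still be collected.

```swift
// Hypothetical stand-in for the handle returned by the pinvoke.
struct NativeHandle: Hashable {
    let rawValue: Int
}

// Hypothetical managed wrapper bound to one native object.
final class BoundObject {
    let handle: NativeHandle
    init(handle: NativeHandle) { self.handle = handle }
}

final class Registry {
    private var objects: [NativeHandle: BoundObject] = [:]

    // Return the wrapper already bound to this handle, or build and record one.
    func object(for handle: NativeHandle,
                orCreate make: (NativeHandle) -> BoundObject) -> BoundObject {
        if let existing = objects[handle] {
            return existing
        }
        let created = make(handle)
        objects[handle] = created
        return created
    }
}

let registry = Registry()
let first = registry.object(for: NativeHandle(rawValue: 1)) { BoundObject(handle: $0) }
let second = registry.object(for: NativeHandle(rawValue: 1)) { BoundObject(handle: $0) }
let other = registry.object(for: NativeHandle(rawValue: 2)) { BoundObject(handle: $0) }
```

The point of the lookup is identity: two calls that return the same native handle should hand back the same wrapper rather than two wrappers for one Swift object.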
A general concern I have is that we haven't really described when and how we are going to generate Swift, or how we are going to use and ship it. On top of that, trimming is still a major concern for me, especially the fact that we don't have a good way to trim generated Swift code. Other interops solve this problem by generating the native code (in this case Swift) as basically the last step of building an app, after it's known what interop is needed by the app. This means that the Swift generation would happen at app-build time and not during the projection tool runtime. There are probably other ways to solve it, but we haven't really discussed these yet. I think we should solve these:
@vitek-karas - I understand your concern about trimming and packaging. I think both of those things are beyond the scope of this particular document.
This is a document to describe the process of binding Swift entities in C# and propose an architecture for handling it cleanly.