Calling the Tokio runtime via CGO/FFI from multiple goroutines #5840
## Background
I have an unusual use case which I've struggled to draw concrete conclusions about, and whose answer I think will help elucidate Tokio internals more broadly. I'm working on an application where I'm calling Rust from Go via the FFI. We're integrating a library which makes async calls (https://github.com/Devolutions/IronRDP) and I'm attempting to get it to run via a Tokio runtime.
For the purposes of this discussion I will create a simplified example that illustrates what I'm working with and what different approaches I've tried. In Rust I have a struct like the following:
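A minimal sketch of its shape (simplified; `IronRdpClient` here stands in for the real IronRDP session type):

```rust
use tokio::net::TcpStream;
use tokio::runtime::Runtime;

// Simplified stand-in for the real IronRDP session type; reduced
// here to just the TCP stream we read from and write to.
pub struct IronRdpClient {
    tcp: TcpStream,
}

pub struct Client {
    iron_rdp_client: IronRdpClient,
    // The runtime created in init_client travels with the Client so
    // that later FFI calls can reuse it.
    tokio_rt: Runtime,
}
```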
At a high level, the system works as follows:

1. A `Client` is initialized by calling `init_client`, which creates a Tokio runtime (default settings via `Runtime::new()`), connects a `tokio::net::TcpStream`, and passes those into `Client::new(tcp, tokio_rt)`. That `Client` is then passed as a pointer back over the FFI for later use.
2. `loop_and_read(client: *mut Client)`, which loops to read from the `client.iron_rdp_client.tcp` stream asynchronously, occasionally calling back into Go.
3. `write(client: *mut Client, data: *mut u8, …)`, which asynchronously writes `data` to the `client.iron_rdp_client.tcp` stream.
With this high-level structure I've tried several different approaches, with varying results, and I'm seeking guidance as to which is ideal.
## Try 1: a new `Runtime` for each step

The first thing I tried slightly breaks the three-step description above, in that I created a new `Runtime` and called `block_on` for each of 1, 2, and 3. This led to an inscrutable `mio` error, which I eventually decided was probably related to trying to use a single `tokio::net::TcpStream` on multiple `Runtime`s, and which led me to the attempts described below.

## Try 2: `block_on` in each step

### Motivation
Once I decided to try everything on a single `Runtime`, my first "apparently working" attempt was to call `block_on` in each of the above steps. That looked something like:

#### 1. `init_client` with `block_on`

(Called before either 2 or 3 below)
Code
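A sketch of roughly what this looked like (simplified: the `addr` parameter and the `unwrap`s stand in for the real setup and error handling):

```rust
use std::ffi::{c_char, CStr};
use tokio::net::TcpStream;
use tokio::runtime::Runtime;

#[no_mangle]
pub extern "C" fn init_client(addr: *const c_char) -> *mut Client {
    let addr = unsafe { CStr::from_ptr(addr) }.to_str().unwrap();
    let tokio_rt = Runtime::new().unwrap();
    // Drive the async connect (and the rest of the async setup) to
    // completion on this CGO-locked thread.
    let tcp = tokio_rt.block_on(async {
        let tcp = TcpStream::connect(addr).await.unwrap();
        // Do some other async/await stuff...
        tcp
    });
    Box::into_raw(Box::new(Client {
        iron_rdp_client: IronRdpClient { tcp },
        tokio_rt,
    }))
}
```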
#### 2. `loop_and_read` with `block_on`

(Called concurrently via goroutine with 3 below)
Code
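A sketch (simplified: buffer handling and the callback into Go are elided):

```rust
use tokio::io::AsyncReadExt;

#[no_mangle]
pub extern "C" fn loop_and_read(client: *mut Client) -> i32 {
    let client = unsafe { &mut *client };
    // Clone a Handle so the runtime isn't borrowed while the async
    // block mutably borrows the stream.
    let handle = client.tokio_rt.handle().clone();
    handle.block_on(async {
        let mut buf = [0u8; 4096];
        loop {
            match client.iron_rdp_client.tcp.read(&mut buf).await {
                Ok(0) => break 0,   // connection closed cleanly
                Err(_) => break -1, // read failed
                Ok(_n) => {
                    // ...process the bytes, occasionally calling back
                    // into Go...
                }
            }
        }
    })
}
```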
#### 3. `write` with `block_on`

(Called occasionally from Go via goroutine, concurrent with 2 above)
Code
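A sketch (the `len` parameter is a hypothetical stand-in for the elided remainder of the signature):

```rust
use tokio::io::AsyncWriteExt;

#[no_mangle]
pub extern "C" fn write(client: *mut Client, data: *mut u8, len: usize) -> i32 {
    let client = unsafe { &mut *client };
    // View the Go-owned buffer as a byte slice for the duration of
    // this blocking call.
    let bytes = unsafe { std::slice::from_raw_parts(data, len) };
    let handle = client.tokio_rt.handle().clone();
    handle.block_on(async {
        match client.iron_rdp_client.tcp.write_all(bytes).await {
            Ok(()) => 0,
            Err(_) => -1,
        }
    })
}
```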
### Discussion

This approach seemed to be working on my machine; however, after further research I began doubting whether it would work consistently across platforms. My understanding is that `block_on` blocks whatever OS thread it's called from. This is fine for (1), which is never called concurrently with another `block_on`; however, (2) and (3) _are_ called concurrently (via two different goroutines). While goroutines can be, and often are, scheduled on different threads, there's no semantic guarantee that they are. If the Go scheduler happened to schedule a call to (3) on the thread that (2) was already blocking on, then (3) would never get a chance to execute.

However, now that I've written that out, I don't think I'm making sense. When a function is called from CGO, it gets "locked" to a particular thread, which is taken out of the scheduler pool (source). Ergo there's no way for that to happen, and I'm therefore just calling `handle().clone().block_on()` from different threads, which as far as I can tell is defined behavior (see the second example here).

All that said, in my exploration I still discovered something interesting which I'm seeking an answer for, so I will continue below.
## Try 3: `block_on` in (1), `spawn` in (2) and (3)

### Motivation

I noticed that the [`block_on` documentation](https://docs.rs/tokio/latest/tokio/runtime/struct.Runtime.html#method.block_on) states

> Runs the provided future, blocking the current thread until the future completes.

So in order to get better performance I decided to switch (2) and (3) to use `spawn`:
#### 1. `init_client` with `block_on`

See "Try 2", heading "1. `init_client` with `block_on`" above.

#### 2. `loop_and_read` with `spawn`

(Called concurrently via goroutine with 3 below)
Code
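A sketch, including the channel trick mentioned under Try 4's motivation: the task runs on the runtime's worker threads while the CGO-locked thread parks on a channel until the loop exits. The `SendPtr` wrapper is a simplified stand-in for moving the raw pointer into a `'static` task:

```rust
use std::sync::mpsc;
use tokio::io::AsyncReadExt;

// Wrapper so the raw Client pointer can move into a spawned task.
// SAFETY (assumed for this sketch): Go keeps the Client alive for
// the duration of the task, and no other task mutates the stream
// concurrently.
struct SendPtr(*mut Client);
unsafe impl Send for SendPtr {}

#[no_mangle]
pub extern "C" fn loop_and_read(client: *mut Client) -> i32 {
    let handle = unsafe { &(*client).tokio_rt }.handle().clone();
    let ptr = SendPtr(client);
    let (tx, rx) = mpsc::channel();
    handle.spawn(async move {
        let client = unsafe { &mut *ptr.0 };
        let mut buf = [0u8; 4096];
        let status = loop {
            match client.iron_rdp_client.tcp.read(&mut buf).await {
                Ok(0) => break 0,   // connection closed cleanly
                Err(_) => break -1, // read failed
                Ok(_n) => {
                    // ...process the bytes, occasionally calling back
                    // into Go...
                }
            }
        };
        // The channel trick: report the exit status back to the
        // blocked FFI thread.
        let _ = tx.send(status);
    });
    rx.recv().unwrap_or(-1)
}
```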
#### 3. `write` with `spawn`

(Called occasionally from Go via goroutine, concurrent with 2 above)
Code
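The corresponding write sketch (same `SendPtr` wrapper and channel trick; `len` is again a hypothetical stand-in for the elided part of the signature):

```rust
use std::sync::mpsc;
use tokio::io::AsyncWriteExt;

#[no_mangle]
pub extern "C" fn write(client: *mut Client, data: *mut u8, len: usize) -> i32 {
    // Copy out of the Go-owned buffer before handing off to a task
    // that may outlive this call frame.
    let bytes = unsafe { std::slice::from_raw_parts(data, len) }.to_vec();
    let handle = unsafe { &(*client).tokio_rt }.handle().clone();
    let ptr = SendPtr(client);
    let (tx, rx) = mpsc::channel();
    handle.spawn(async move {
        let client = unsafe { &mut *ptr.0 };
        let status = match client.iron_rdp_client.tcp.write_all(&bytes).await {
            Ok(()) => 0,
            Err(_) => -1,
        };
        let _ = tx.send(status);
    });
    // Park the CGO-locked thread until the write completes.
    rx.recv().unwrap_or(-1)
}
```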
### Discussion

This approach also appears to work for me, and is in a sense more in line with standard tokio usage examples. For example, the tutorial gives this example:
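(Reproduced from the tutorial's spawning chapter; `process` is a connection handler defined elsewhere in that chapter.)

```rust
use tokio::net::TcpListener;

#[tokio::main]
async fn main() {
    let listener = TcpListener::bind("127.0.0.1:6379").await.unwrap();

    loop {
        let (socket, _) = listener.accept().await.unwrap();
        // A new task is spawned for each inbound socket.
        tokio::spawn(async move {
            process(socket).await;
        });
    }
}
```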
Given that `#[tokio::main]` is syntactic sugar for creating a default `Runtime` and calling `block_on`, my approach here in Try 3 has the same `block_on` --> `spawn` sequence of events. The primary difference is that in the examples, `spawn` is always called downstream of a higher-level call to `block_on`, whereas in my case, `block_on` is called, then completes, and only then, later, is `spawn` called.

## Try 4: `spawn` in each step

### Motivation
This success got me thinking -- if `spawn` is more performant (and I already have this little channel trick worked out), why not just use `spawn` for all of these steps?

#### 1. `init_client` with `spawn`

(Called before either 2 or 3 below)
Code
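A sketch of the failing variant: the connect and the rest of the setup move into a spawned task, and the FFI thread waits on a channel for the connected stream (same simplified `addr` parameter as before):

```rust
use std::ffi::{c_char, CStr};
use std::sync::mpsc;
use tokio::net::TcpStream;
use tokio::runtime::Runtime;

#[no_mangle]
pub extern "C" fn init_client(addr: *const c_char) -> *mut Client {
    let addr = unsafe { CStr::from_ptr(addr) }.to_str().unwrap().to_owned();
    let tokio_rt = Runtime::new().unwrap();
    let (tx, rx) = mpsc::channel();
    tokio_rt.spawn(async move {
        let tcp = TcpStream::connect(&addr).await.unwrap();
        // Do some other async/await stuff -- this is where my logging
        // showed the task getting stuck.
        let _ = tx.send(tcp);
    });
    // Wait for the spawned task to hand back the connected stream.
    let tcp = rx.recv().unwrap();
    Box::into_raw(Box::new(Client {
        iron_rdp_client: IronRdpClient { tcp },
        tokio_rt,
    }))
}
```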
#### 2. `loop_and_read` with `spawn`

See "Try 3", heading "2. `loop_and_read` with `spawn`" above.

#### 3. `write` with `spawn`

See "Try 3", heading "3. `write` with `spawn`" above.

### Discussion
In this case, the system fails. By adding some logging, I found that I would essentially get "stuck" in `init_client`, somewhere in the `// Do some other async/await stuff`. I really don't know what's going on -- why would `spawn` allow it to execute some distance, including across a few `await` boundaries, before getting stuck?

## Discussion and Questions
So it appears to me that the "Try 2" approach should work, and isn't doing anything particularly undefined. However, it's relatively inefficient compared to what's possible using `spawn`. "Try 3" uses `spawn` and appears to work, but it does so in an unconventional way for which I couldn't find any documentation, so I'm concerned that it may cause issues down the road. And "Try 4" may offer a hint as to whether "Try 3" makes ultimate sense.

The main questions that pop out at me from these results are: is calling `block_on` doing something special? For example, is it initializing some aspect of the `Runtime`'s internals that allows later `spawn`s to execute successfully? And if so, do those later `spawn`s somehow rely on being further down the callstack of the `block_on` (like in the examples)? Or is `block_on` having been called previously and then returned from enough to do the trick (like in my "Try 3")?
Replies:

- One thing to be careful about with FFI is that you can end up with several copies of Tokio if you link in several different Rust projects using Tokio. They will have different globals for the Tokio state, so they won't work together.