Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tokio: remove ref of scheduler in owned task #5992

Closed

Conversation

wathenjiang
Copy link
Contributor

@wathenjiang wathenjiang commented Sep 9, 2023

Motivation

I belive we can remove the Arc::clone() call by removing the scheduler handle reference in all tasks.

For those tasks are owned, we could always use the scheduler in thread local CONTEXT.

Solution

By removing ref(Arc) of scheduler in owned tasks, we could got a better performance, espically when we call tokio::spawn very frequently in multiple threads.

And we keep ref of scheduler in unowned tasks, which do not depend on the sheduler in thread local CONTEXT.

The cargo build can be more quickly by removing the generic parameter S.

Here are the benchmark results of spawn2.

The current version:

Gnuplot not found, using plotters backend
spawn_tasks_current_thread
                        time:   [673.18 ns 680.79 ns 687.52 ns]
                        change: [-6.2656% -2.7726% +1.1065%] (p = 0.15 > 0.05)
                        No change in performance detected.

spawn_tasks_current_thread_parallel
                        time:   [521.01 ns 521.75 ns 522.35 ns]
                        change: [-1.4901% -0.6725% +0.1920%] (p = 0.12 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe

spawn_tasks             time:   [966.28 ns 976.72 ns 985.98 ns]
                        change: [+0.1574% +1.6134% +2.9634%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild

spawn_tasks_parallel    time:   [893.21 ns 898.49 ns 903.49 ns]
                        change: [+0.3002% +1.3720% +2.4327%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

This PR version:

# cargo bench --bench spawn2
    Finished bench [optimized] target(s) in 0.07s
     Running spawn2.rs (/root/rustproject/fork_tokio/target/release/deps/spawn2-b48a6368b3e3dabc)
Gnuplot not found, using plotters backend
spawn_tasks_current_thread
                        time:   [650.39 ns 662.56 ns 672.59 ns]
                        change: [-6.8428% -3.3240% +0.2419%] (p = 0.08 > 0.05)
                        No change in performance detected.

spawn_tasks_current_thread_parallel
                        time:   [526.50 ns 527.29 ns 528.10 ns]
                        change: [-1.9903% -1.1432% -0.2422%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 22 outliers among 100 measurements (22.00%)
  6 (6.00%) low severe
  11 (11.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe

spawn_tasks             time:   [902.32 ns 911.90 ns 921.01 ns]
                        change: [-1.1438% +0.5670% +2.4469%] (p = 0.54 > 0.05)
                        No change in performance detected.

spawn_tasks_parallel    time:   [819.78 ns 826.17 ns 832.07 ns]
                        change: [-2.1597% -0.5893% +1.1337%] (p = 0.50 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) low mild
  2 (2.00%) high severe

@github-actions github-actions bot added R-loom-current-thread Run loom current-thread tests on this PR R-loom-multi-thread Run loom multi-thread tests on this PR R-loom-multi-thread-alt Run loom multi-thread alt tests on this PR labels Sep 9, 2023
@wathenjiang wathenjiang force-pushed the remove_ref_of_scheduler branch from fac89f8 to 0126a7c Compare September 10, 2023 01:34
@wathenjiang wathenjiang force-pushed the remove_ref_of_scheduler branch from a274d68 to 4326c7a Compare September 10, 2023 01:52
@Darksonn Darksonn added A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime labels Sep 10, 2023
Copy link
Contributor

@Darksonn Darksonn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please ask us before you spend a lot of time implementing something. I don't think this PR is correct, and I don't think it can be fixed. Unless there's something I'm missing, we will have to close this.

Comment on lines 279 to 288
unsafe fn schedule<S: Schedule>(ptr: NonNull<Header>) {
use crate::runtime::task::{Notified, Task};

let scheduler = Header::get_scheduler::<S>(ptr);
scheduler
.as_ref()
.schedule(Notified(Task::from_raw(ptr.cast())));
let scheduler = Header::get_scheduler::<S>(ptr).as_ref();
let notify = Notified(Task::from_raw(ptr.cast()));
match scheduler {
Some(scheduler) => scheduler.schedule(notify),
None => crate::task::schedule::schedule(notify),
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks incorrect. Consider this situation: An mpsc channel has the sender in runtime A, and the receiver in runtime B. Then, we send a message on the channel from runtime A. This causes the sender to call wake on the waker stored by the receiver, which corresponds to a task from runtime B, but we are on the thread for runtime A. That call eventually hits this function, and then the None branch of the match. This causes the task from runtime B to get scheduled on runtime A.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a test failure due to something similar:

thread 'runtime::tests::task_combinations::test_combinations' panicked at 'there is no reactor running, must be called from the context of a Tokio 1.x runtime', tokio/src/task/schedule.rs:19:19
stack backtrace:
   0: rust_begin_unwind
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:67:14
   2: core::panicking::panic_display
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:150:5
   3: tokio::task::schedule::schedule
             at ./src/task/schedule.rs:19:19
   4: tokio::runtime::task::raw::schedule
             at ./src/runtime/task/raw.rs:286:17
   5: tokio::runtime::task::raw::RawTask::schedule
             at ./src/runtime/task/raw.rs:205:18
   6: tokio::runtime::task::harness::<impl tokio::runtime::task::raw::RawTask>::wake_by_ref
             at ./src/runtime/task/harness.rs:103:17
   7: tokio::runtime::task::waker::wake_by_ref
             at ./src/runtime/task/waker.rs:93:5
   8: core::task::wake::Waker::wake_by_ref
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/task/wake.rs:296:18
   9: core::ops::function::FnOnce::call_once
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/ops/function.rs:250:5
  10: tokio::sync::oneshot::Task::with_task::{{closure}}
             at ./src/sync/oneshot.rs:417:13
  11: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with
             at ./src/loom/std/unsafe_cell.rs:11:9
  12: tokio::sync::oneshot::Task::with_task
             at ./src/sync/oneshot.rs:415:9
  13: tokio::sync::oneshot::Inner<T>::complete
             at ./src/sync/oneshot.rs:1127:17
  14: tokio::sync::oneshot::Sender<T>::send
             at ./src/sync/oneshot.rs:608:13
  15: tokio::runtime::tests::task_combinations::test_combination
             at ./src/runtime/tests/task_combinations.rs:396:13
  16: tokio::runtime::tests::task_combinations::test_combinations
             at ./src/runtime/tests/task_combinations.rs:119:37
  17: tokio::runtime::tests::task_combinations::test_combinations::{{closure}}
             at ./src/runtime/tests/task_combinations.rs:68:24
  18: core::ops::function::FnOnce::call_once
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/ops/function.rs:250:5
  19: core::ops::function::FnOnce::call_once
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Here, the test calls oneshot::Sender::send while it is not inside the runtime context. This panics because the thread-local is not available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got stuck on this this afternoon: The threads without context are hard to talk to the thread with context. This means the reference of scheduler have to put in something like channel, or in the tasks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I belive the message pass between sync threads and tokio runtime threads is a key capability, so the reference of scheduler in every task may have to be kept. To remove this reference, we may have to write a new tokio::spawn method like tokio::spawn_without_...., that is not a good way.

@wathenjiang
Copy link
Contributor Author

I believe this PR cannot resolve the above problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime R-loom-current-thread Run loom current-thread tests on this PR R-loom-multi-thread Run loom multi-thread tests on this PR R-loom-multi-thread-alt Run loom multi-thread alt tests on this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants