Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reactor stalled for 43ms in read_iobuf_exactly #10423

Open
ballard26 opened this issue Apr 28, 2023 · 3 comments
Open

Reactor stalled for 43ms in read_iobuf_exactly #10423

ballard26 opened this issue Apr 28, 2023 · 3 comments
Assignees
Labels
kind/bug Something isn't working reactor-stall

Comments

@ballard26
Copy link
Contributor

ballard26 commented Apr 28, 2023

In: https://ci-artifacts.dev.vectorized.cloud:443/vtools/7317/0187c1bf-8e3c-401e-bda2-49c5fa727564/vbuild/output/ducktape-release-clang-amd64.tgz

Backtrace:

WARNING:root:decoded: tmp/ducktape/results/2023-04-27--001/PartitionBalancerScaleTest/test_partition_balancer_with_many_partitions/type=big_partitions/293/RedpandaService-0-139902321644048/ip-172-31-2-158/redpanda.log:2230:Reactor stalled for 43 ms on shard 0. Backtrace:
[Backtrace #78]
{tmp/libexec/redpanda} 0x636072f: void seastar::backtrace<seastar::backtrace_buffer::append_backtrace_oneline()::'lambda'(seastar::frame)>(seastar::backtrace_buffer::append_backtrace_oneline()::'lambda'(seastar::frame)&&) at /v/build/v_deps_build/seastar-prefix/src/seastar/include/seastar/util/backtrace.hh:60
 (inlined by) seastar::backtrace_buffer::append_backtrace_oneline() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:797
 (inlined by) seastar::print_with_backtrace(seastar::backtrace_buffer&, bool) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:816
{tmp/libexec/redpanda} 0x6361b42: seastar::internal::cpu_stall_detector::generate_trace() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:1386
 (inlined by) seastar::internal::cpu_stall_detector::maybe_report() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:1123
 (inlined by) seastar::internal::cpu_stall_detector::on_signal() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:1143
 (inlined by) seastar::reactor::block_notifier(int) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:1423
{tmp/libexec/redpanda} 0x42abf: ?? at ??:0
{tmp/libexec/redpanda} 0x11753b: ?? at ??:0
{tmp/libexec/redpanda} 0x6419d65: seastar::file_desc::recv(void*, unsigned long, int) at /v/build/v_deps_build/seastar-prefix/src/seastar/include/seastar/core/posix.hh:238
 (inlined by) seastar::reactor_backend_uring::recv_some(seastar::pollable_fd_state&, seastar::internal::buffer_allocator*) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor_backend.cc:1883
{tmp/libexec/redpanda} 0x635442c: seastar::pollable_fd_state::recv_some(seastar::internal::buffer_allocator*) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:463
{tmp/libexec/redpanda} 0x655d90b: seastar::pollable_fd::recv_some(seastar::internal::buffer_allocator*) at /v/build/v_deps_build/seastar-prefix/src/seastar/include/seastar/core/internal/pollable_fd.hh:177
 (inlined by) seastar::net::posix_data_source_impl::get() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/net/posix-stack.cc:597
{tmp/libexec/redpanda} 0x61ed4ad: seastar::data_source::get() at /vectorized/include/seastar/core/iostream.hh:77
 (inlined by) seastar::input_stream<char>::read_up_to(unsigned long) at /vectorized/include/seastar/core/iostream-impl.hh:261
 (inlined by) read_iobuf_exactly(seastar::input_stream<char>&, unsigned long)::$_3::operator()(iobuf&, unsigned long&) const::'lambda0'()::operator()() const at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-05222bd6360c4da24-1/redpanda/redpanda/src/v/bytes/iobuf.cc:64
{tmp/libexec/redpanda} 0x61ee838: seastar::internal::do_until_state<read_iobuf_exactly(seastar::input_stream<char>&, unsigned long)::$_3::operator()(iobuf&, unsigned long&) const::'lambda'(), read_iobuf_exactly(seastar::input_stream<char>&, unsigned long)::$_3::operator()(iobuf&, unsigned long&) const::'lambda0'()>::run_and_dispose() at /vectorized/include/seastar/core/loop.hh:299
{tmp/libexec/redpanda} 0x638204f: seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:2557
 (inlined by) seastar::reactor::run_some_tasks() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3020
{tmp/libexec/redpanda} 0x6385cc7: seastar::reactor::do_run() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3189
{tmp/libexec/redpanda} 0x6383099: seastar::reactor::run() at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/reactor.cc:3072
{tmp/libexec/redpanda} 0x6298a81: seastar::app_template::run_deprecated(int, char**, std::__1::function<void ()>&&) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/app-template.cc:265
{tmp/libexec/redpanda} 0x6296b9f: seastar::app_template::run(int, char**, std::__1::function<seastar::future<int> ()>&&) at /v/build/v_deps_build/seastar-prefix/src/seastar/src/core/app-template.cc:156
{tmp/libexec/redpanda} 0x1ff05de: application::run(int, char**) at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-05222bd6360c4da24-1/redpanda/redpanda/src/v/redpanda/application.cc:330
{tmp/libexec/redpanda} 0x66b4889: main at /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-05222bd6360c4da24-1/redpanda/redpanda/src/v/redpanda/main.cc:22
{tmp/libexec/redpanda} 0x2d58f: ?? at ??:0
{tmp/libexec/redpanda} 0x2d648: ?? at ??:0
{tmp/libexec/redpanda} 0x1feaaa4: _start at ??:0

JIRA Link: CORE-1292

@ballard26 ballard26 added kind/bug Something isn't working reactor-stall labels Apr 28, 2023
@ballard26 ballard26 self-assigned this May 16, 2023
@ballard26
Copy link
Contributor Author

Haven't been able to replicate this issue. However, at 25ms for a reactor stall I am seeing a lot of false positives for kernel interrupts. One common interrupt I'm seeing is;

[Backtrace #1]
{<kernel>} 0xffffffffffffff80: 0xffffffffffffff80 (1069588464 bytes after last symbol)
{<kernel>} 0xffffffff93921401: _raw_spin_unlock_irqrestore+0x21
{<kernel>} 0xffffffff92b3887a: __wake_up_common_lock+0x8a
{<kernel>} 0xffffffff92b38fc0: __wake_up_sync_key+0x20
{<kernel>} 0xffffffff935e908b: sock_def_readable+0x3b
{<kernel>} 0xffffffff936f5601: tcp_data_ready+0x31
{<kernel>} 0xffffffff936f637c: tcp_rcv_established+0x61c
{<kernel>} 0xffffffff9370537e: tcp_v4_do_rcv+0x16e
{<kernel>} 0xffffffff93708671: tcp_v4_rcv+0xea1
{<kernel>} 0xffffffff936cf8bc: ip_protocol_deliver_rcu+0x3c
{<kernel>} 0xffffffff936cfbb1: ip_local_deliver_finish+0x81
{<kernel>} 0xffffffff936cfc4e: ip_local_deliver+0x6e
{<kernel>} 0xffffffff936cfd4f: ip_sublist_rcv_finish+0x6f
{<kernel>} 0xffffffff936cfedd: ip_sublist_rcv+0x17d
{<kernel>} 0xffffffff936d02c9: ip_list_rcv+0xf9
{<kernel>} 0xffffffff93616719: __netif_receive_skb_list_core+0x229
{<kernel>} 0xffffffff93616d2e: netif_receive_skb_list_internal+0x18e
{<kernel>} 0xffffffff9361743a: napi_complete_done+0x7a
{<kernel>} 0xffffffffc024a5c2: ena_io_poll+0x112
{<kernel>} 0xffffffff936175d3: __napi_poll+0x33
{<kernel>} 0xffffffff93617a35: net_rx_action+0x185
{<kernel>} 0xffffffff93c000d8: __softirqentry_text_start+0xd8
{<kernel>} 0xffffffff92ad1edd: __irq_exit_rcu+0x8d
{<kernel>} 0xffffffff92ad223e: irq_exit_rcu+0xe
{<kernel>} 0xffffffff9390cc2a: common_interrupt+0x4a
{<kernel>} 0xffffffff93a00de7: asm_common_interrupt+0x27

Copy link

github-actions bot commented Jul 1, 2024

This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.

@github-actions github-actions bot added the stale label Jul 1, 2024
@dotnwat
Copy link
Member

dotnwat commented Jul 3, 2024

i can't imagine that read_iobuf_exactly would stall for 43ms in user space. does reactor stall reports give any indication of stall time in user vs kernel space?

@github-actions github-actions bot removed the stale label Jul 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working reactor-stall
Projects
None yet
Development

No branches or pull requests

2 participants