The nightly CI build fails due to the long-standing "possible circular locking dependency detected" warning. IIRC, there has not been much progress on this for quite a while. Maybe we should have something that annotates the affected tests instead of failing the run completely. Obviously, the best thing would be to fix it...
CC: @kawasaki
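As a stopgap, a CI wrapper could scan the test's dmesg output for known, long-standing splats and downgrade the failure to an annotation. A minimal sketch (hypothetical helper, not an existing blktests feature; the `classify_dmesg` name and the pattern list are assumptions for illustration):

```python
import re

# Known, long-standing splats we annotate instead of hard-failing on.
# (Hypothetical allow-list; the nvme/tcp lockdep splat from this issue.)
KNOWN_SPLATS = [
    re.compile(r"possible circular locking dependency detected"),
]

def classify_dmesg(dmesg: str) -> str:
    """Classify a test's dmesg capture.

    Returns 'pass' if no WARNING appeared, 'known-warning' if the
    WARNING matches a known splat, and 'fail' otherwise.
    """
    if "WARNING" not in dmesg:
        return "pass"
    if any(pattern.search(dmesg) for pattern in KNOWN_SPLATS):
        return "known-warning"
    return "fail"
```

A CI driver could then mark `known-warning` results as annotated/skipped rather than red, while still failing on any new WARNING.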
[ 704.361362] run blktests nvme/005 at 2026-02-25 03:47:54
[ 704.532546] loop0: detected capacity change from 0 to 2097152
[ 704.595709] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 704.671982] nvmet_tcp: enabling port 0 (10.0.2.2:4420)
[ 704.856746] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 704.872976] nvme nvme2: creating 4 I/O queues.
[ 704.889641] nvme nvme2: mapped 4/0/0 default/read/poll queues.
[ 704.902269] nvme nvme2: new ctrl: NQN "blktests-subsystem-1", addr 10.0.2.2:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 705.442498] ======================================================
[ 705.443855] WARNING: possible circular locking dependency detected
[ 705.445329] 7.0.0-rc1 #1 Tainted: G E
[ 705.446756] ------------------------------------------------------
[ 705.448647] kworker/0:1/5932 is trying to acquire lock:
[ 705.450324] ffffffff86625be0 (fs_reclaim){+.+.}-{0:0}, at: kmem_cache_alloc_node_noprof+0x54/0x6e0
[ 705.451991]
but task is already holding lock:
[ 705.454264] ffff888119164218 (sk_lock-AF_INET-NVME){+.+.}-{0:0}, at: tcp_close+0x18/0xc0
[ 705.456269]
which lock already depends on the new lock.
[ 705.459875]
the existing dependency chain (in reverse order) is:
[ 705.461860]
-> #4 (sk_lock-AF_INET-NVME){+.+.}-{0:0}:
[ 705.463431] __lock_acquire+0x55d/0xc00
[ 705.464332] lock_acquire.part.0+0xb7/0x220
[ 705.465291] lock_sock_nested+0x3b/0xe0
[ 705.466166] tcp_sendmsg+0x1d/0x40
[ 705.466997] sock_sendmsg+0x2fb/0x410
[ 705.467853] nvme_tcp_try_send_cmd_pdu+0x5a0/0xc10 [nvme_tcp]
[ 705.468922] nvme_tcp_try_send+0x1b6/0x9b0 [nvme_tcp]
[ 705.469884] nvme_tcp_queue_rq+0x449/0x6d0 [nvme_tcp]
[ 705.470828] blk_mq_dispatch_rq_list+0x38f/0x14a0
[ 705.471768] __blk_mq_sched_dispatch_requests+0x1a9/0x4b0
[ 705.472740] blk_mq_sched_dispatch_requests+0xac/0x150
[ 705.473686] blk_mq_run_hw_queue+0x1c7/0x500
[ 705.474541] blk_execute_rq+0x16a/0x310
[ 705.475360] __nvme_submit_sync_cmd+0x108/0x320 [nvme_core]
[ 705.476328] nvmf_connect_io_queue+0x1c7/0x2f0 [nvme_fabrics]
[ 705.477287] nvme_tcp_start_queue+0x4eb/0x6a0 [nvme_tcp]
[ 705.478195] nvme_tcp_configure_io_queues.cold+0x4b6/0x9ea [nvme_tcp]
[ 705.479177] nvme_tcp_setup_ctrl+0x281/0x5f0 [nvme_tcp]
[ 705.480066] nvme_tcp_create_ctrl+0xf6/0x120 [nvme_tcp]
[ 705.480952] nvmf_create_ctrl+0x34c/0x920 [nvme_fabrics]
[ 705.481878] nvmf_dev_write+0xd5/0x180 [nvme_fabrics]
[ 705.482771] vfs_write+0x1c7/0x1090
[ 705.483534] ksys_write+0xf9/0x1d0
[ 705.484265] do_syscall_64+0x133/0x15c0
[ 705.485070] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 705.485953]
-> #3 (set->srcu){.+.+}-{0:0}:
[ 705.487198] __lock_acquire+0x55d/0xc00
[ 705.487947] lock_sync.part.0+0x41/0x80
[ 705.488687] __synchronize_srcu+0xa7/0x240
[ 705.489462] elevator_switch+0x2a8/0x660
[ 705.490199] elevator_change+0x2d5/0x4e0
[ 705.490912] elevator_set_none+0x8b/0xe0
[ 705.491623] blk_unregister_queue+0x144/0x2c0
[ 705.492369] __del_gendisk+0x265/0xa20
[ 705.493069] del_gendisk+0x105/0x190
[ 705.493752] nvme_ns_remove+0x32c/0x8f0 [nvme_core]
[ 705.494542] nvme_remove_namespaces+0x267/0x3b0 [nvme_core]
[ 705.495384] nvme_do_delete_ctrl+0xf5/0x160 [nvme_core]
[ 705.496200] nvme_delete_ctrl_sync.cold+0x8/0xd [nvme_core]
[ 705.497032] nvme_sysfs_delete+0x96/0xc0 [nvme_core]
[ 705.497821] kernfs_fop_write_iter+0x3df/0x5f0
[ 705.498572] vfs_write+0x518/0x1090
[ 705.499219] ksys_write+0xf9/0x1d0
[ 705.499855] do_syscall_64+0x133/0x15c0
[ 705.500526] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 705.501301]
-> #2 (&q->elevator_lock){+.+.}-{4:4}:
[ 705.502424] __lock_acquire+0x55d/0xc00
[ 705.503092] lock_acquire.part.0+0xb7/0x220
[ 705.503785] __mutex_lock+0x1a6/0x23c0
[ 705.504449] elevator_change+0x189/0x4e0
[ 705.505122] elv_iosched_store+0x32e/0x3b0
[ 705.505817] queue_attr_store+0x232/0x2f0
[ 705.506501] kernfs_fop_write_iter+0x3df/0x5f0
[ 705.507213] vfs_write+0x518/0x1090
[ 705.507855] ksys_write+0xf9/0x1d0
[ 705.508484] do_syscall_64+0x133/0x15c0
[ 705.509160] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 705.509918]
-> #1 (&q->q_usage_counter(io)#4){++++}-{0:0}:
[ 705.511080] __lock_acquire+0x55d/0xc00
[ 705.511745] lock_acquire.part.0+0xb7/0x220
[ 705.512435] blk_alloc_queue+0x5b7/0x730
[ 705.513101] blk_mq_alloc_queue+0x14b/0x230
[ 705.513792] scsi_alloc_sdev+0x851/0xcc0
[ 705.514471] scsi_probe_and_add_lun+0x71e/0xc30
[ 705.515177] __scsi_add_device+0x1c4/0x1f0
[ 705.515852] ata_scsi_scan_host+0x13d/0x390
[ 705.516544] async_run_entry_fn+0x9a/0x510
[ 705.517233] process_one_work+0x854/0x1540
[ 705.517908] worker_thread+0x5f3/0xfe0
[ 705.518555] kthread+0x35d/0x460
[ 705.519159] ret_from_fork+0x556/0x850
[ 705.519818] ret_from_fork_asm+0x1a/0x30
[ 705.520492]
-> #0 (fs_reclaim){+.+.}-{0:0}:
[ 705.521529] check_prev_add+0xeb/0xc50
[ 705.522172] validate_chain+0x468/0x550
[ 705.522824] __lock_acquire+0x55d/0xc00
[ 705.523474] lock_acquire.part.0+0xb7/0x220
[ 705.524152] fs_reclaim_acquire+0xc9/0x110
[ 705.524826] kmem_cache_alloc_node_noprof+0x54/0x6e0
[ 705.525572] __alloc_skb+0xe3/0x5b0
[ 705.526201] tcp_send_active_reset+0x8a/0x750
[ 705.526898] tcp_disconnect+0x16c6/0x1f60
[ 705.527565] __tcp_close+0x7d7/0xee0
[ 705.528195] tcp_close+0x23/0xc0
[ 705.528796] inet_release+0x109/0x230
[ 705.529447] __sock_release+0xa6/0x250
[ 705.530091] sock_close+0x18/0x20
[ 705.530700] __fput+0x372/0xaa0
[ 705.531292] delayed_fput+0x6d/0x90
[ 705.531908] process_one_work+0x854/0x1540
[ 705.532575] worker_thread+0x5f3/0xfe0
[ 705.533220] kthread+0x35d/0x460
[ 705.533817] ret_from_fork+0x556/0x850
[ 705.534454] ret_from_fork_asm+0x1a/0x30
[ 705.535101]
other info that might help us debug this:
[ 705.536558] Chain exists of:
fs_reclaim --> set->srcu --> sk_lock-AF_INET-NVME
[ 705.538229] Possible unsafe locking scenario:
[ 705.539273] CPU0 CPU1
[ 705.539934] ---- ----
[ 705.540591] lock(sk_lock-AF_INET-NVME);
[ 705.541217] lock(set->srcu);
[ 705.541961] lock(sk_lock-AF_INET-NVME);
[ 705.542777] lock(fs_reclaim);
[ 705.543328]
*** DEADLOCK ***
[ 705.544617] 4 locks held by kworker/0:1/5932:
[ 705.545271] #0: ffff8881001dad48 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0xfa9/0x1540
[ 705.546337] #1: ffff88816248fd10 ((delayed_fput_work).work){+.+.}-{0:0}, at: process_one_work+0x7dc/0x1540
[ 705.547428] #2: ffff88812aa28d88 (&sb->s_type->i_mutex_key#11){+.+.}-{4:4}, at: __sock_release+0x7b/0x250
[ 705.548521] #3: ffff888119164218 (sk_lock-AF_INET-NVME){+.+.}-{0:0}, at: tcp_close+0x18/0xc0
[ 705.549536]
stack backtrace:
[ 705.550507] CPU: 0 UID: 0 PID: 5932 Comm: kworker/0:1 Tainted: G E 7.0.0-rc1 #1 PREEMPT(lazy)
[ 705.550514] Tainted: [E]=UNSIGNED_MODULE
[ 705.550516] Hardware name: KubeVirt None/RHEL, BIOS 1.16.3-2.el9 04/01/2014
[ 705.550519] Workqueue: events delayed_fput
[ 705.550530] Call Trace:
[ 705.550543] <TASK>
[ 705.550547] dump_stack_lvl+0x6e/0xa0
[ 705.550554] print_circular_bug.cold+0x38/0x45
[ 705.550570] check_noncircular+0x144/0x160
[ 705.550582] check_prev_add+0xeb/0xc50
[ 705.550590] validate_chain+0x468/0x550
[ 705.550595] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 705.550602] __lock_acquire+0x55d/0xc00
[ 705.550611] lock_acquire.part.0+0xb7/0x220
[ 705.550618] ? kmem_cache_alloc_node_noprof+0x54/0x6e0
[ 705.550625] ? srso_alias_return_thunk+0x5/0xfbef5
[ 705.550629] ? lock_acquire+0x12c/0x150
[ 705.550637] fs_reclaim_acquire+0xc9/0x110
[ 705.550640] ? kmem_cache_alloc_node_noprof+0x54/0x6e0
[ 705.550646] kmem_cache_alloc_node_noprof+0x54/0x6e0
[ 705.550650] ? __alloc_skb+0xe3/0x5b0
[ 705.550654] ? lockdep_hardirqs_on_prepare.part.0+0x9b/0x140
[ 705.550662] __alloc_skb+0xe3/0x5b0
[ 705.550666] ? __alloc_skb+0x49e/0x5b0
[ 705.550671] ? __pfx___alloc_skb+0x10/0x10
[ 705.550677] ? srso_alias_return_thunk+0x5/0xfbef5
[ 705.550681] ? tcp_set_state+0xf3/0x510
[ 705.550688] tcp_send_active_reset+0x8a/0x750
[ 705.550696] tcp_disconnect+0x16c6/0x1f60
[ 705.550700] ? srso_alias_return_thunk+0x5/0xfbef5
[ 705.550704] ? do_raw_spin_lock+0x12d/0x270
[ 705.550711] __tcp_close+0x7d7/0xee0
[ 705.550716] ? __local_bh_enable_ip+0xac/0x120
[ 705.550730] ? trace_hardirqs_on+0x18/0x170
[ 705.550743] tcp_close+0x23/0xc0
[ 705.550748] inet_release+0x109/0x230
[ 705.550754] __sock_release+0xa6/0x250
[ 705.550762] sock_close+0x18/0x20
[ 705.550766] __fput+0x372/0xaa0
[ 705.550771] ? __lock_release.isra.0+0x5d/0x180
[ 705.550776] ? srso_alias_return_thunk+0x5/0xfbef5
[ 705.550782] delayed_fput+0x6d/0x90
[ 705.550787] process_one_work+0x854/0x1540
[ 705.550800] ? __pfx_process_one_work+0x10/0x10
[ 705.550804] ? srso_alias_return_thunk+0x5/0xfbef5
[ 705.550811] ? lock_is_held_type+0x9a/0x110
[ 705.550817] ? srso_alias_return_thunk+0x5/0xfbef5
[ 705.550824] worker_thread+0x5f3/0xfe0
[ 705.550832] ? srso_alias_return_thunk+0x5/0xfbef5
[ 705.550835] ? __kthread_parkme+0xb1/0x1f0
[ 705.550843] ? __pfx_worker_thread+0x10/0x10
[ 705.550853] kthread+0x35d/0x460
[ 705.550857] ? srso_alias_return_thunk+0x5/0xfbef5
[ 705.550860] ? __pfx_kthread+0x10/0x10
[ 705.550866] ret_from_fork+0x556/0x850
[ 705.550872] ? __pfx_ret_from_fork+0x10/0x10
[ 705.550879] ? srso_alias_return_thunk+0x5/0xfbef5
[ 705.550882] ? __switch_to+0x435/0xd80
[ 705.550894] ? __pfx_kthread+0x10/0x10
[ 705.550900] ret_from_fork_asm+0x1a/0x30
[ 705.550914] </TASK>
[ 705.595600] nvmet: Created nvm controller 2 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 705.602818] nvme nvme2: creating 4 I/O queues.
[ 705.628770] nvme nvme2: mapped 4/0/0 default/read/poll queues.
[ 705.699976] nvme nvme2: Removing ctrl: NQN "blktests-subsystem-1"
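For context, the splat above is lockdep's graph-based cycle detection at work: every "lock B acquired while holding lock A" adds an edge A -> B to a dependency graph, and a new edge that would close a cycle triggers the report. A toy sketch of that idea (illustrative only, not kernel code; class and method names are made up):

```python
from collections import defaultdict

class ToyLockdep:
    """Miniature lockdep: track held->acquired edges, flag cycles."""

    def __init__(self):
        # lock name -> set of locks ever acquired while holding it
        self.deps = defaultdict(set)

    def _reaches(self, src, dst, seen=None):
        # Depth-first search: is there a dependency path src ~> dst?
        seen = set() if seen is None else seen
        if src == dst:
            return True
        seen.add(src)
        return any(self._reaches(nxt, dst, seen)
                   for nxt in self.deps[src] if nxt not in seen)

    def acquire(self, held, new):
        # Adding edge held -> new closes a cycle iff new already
        # reaches held; that is the "circular dependency" lockdep sees.
        if self._reaches(new, held):
            return f"possible circular locking dependency: {held} -> {new}"
        self.deps[held].add(new)
        return None

# The (abbreviated) chain from the report:
ld = ToyLockdep()
ld.acquire("fs_reclaim", "set->srcu")
ld.acquire("set->srcu", "sk_lock-AF_INET-NVME")
# tcp_close path: fs_reclaim taken while holding sk_lock -> cycle
report = ld.acquire("sk_lock-AF_INET-NVME", "fs_reclaim")
```

The real dependency chain has five nodes (#0 through #4 above), but the cycle-closing step is the same: `tcp_send_active_reset` allocates an skb (entering fs_reclaim) while `tcp_close` holds the socket lock.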