Skip to content

ATF test for brk syscall #8

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: wipbsd-12
Choose a base branch
from
Open

Conversation

ScottieY
Copy link

No description provided.

emaste pushed a commit that referenced this pull request May 24, 2018
There's no need to perform the interrupt unbind while holding the
blkback lock, and doing so leads to the following LOR:

lock order reversal: (sleepable after non-sleepable)
 1st 0xfffff8000802fe90 xbbd1 (xbbd1) @ /usr/src/sys/dev/xen/blkback/blkback.c:3423
 2nd 0xffffffff81fdf890 intrsrc (intrsrc) @ /usr/src/sys/x86/x86/intr_machdep.c:224
stack backtrace:
#0 0xffffffff80bdd993 at witness_debugger+0x73
#1 0xffffffff80bdd814 at witness_checkorder+0xe34
#2 0xffffffff80b7d798 at _sx_xlock+0x68
#3 0xffffffff811b3913 at intr_remove_handler+0x43
#4 0xffffffff811c63ef at xen_intr_unbind+0x10f
#5 0xffffffff80a12ecf at xbb_disconnect+0x2f
#6 0xffffffff80a12e54 at xbb_shutdown+0x1e4
#7 0xffffffff80a10be4 at xbb_frontend_changed+0x54
#8 0xffffffff80ed66a4 at xenbusb_back_otherend_changed+0x14
#9 0xffffffff80a2a382 at xenwatch_thread+0x182
#10 0xffffffff80b34164 at fork_exit+0x84
#11 0xffffffff8101ec9e at fork_trampoline+0xe

Reported by:    Nathan Friess <[email protected]>
Sponsored by:   Citrix Systems R&D
emaste pushed a commit that referenced this pull request May 29, 2019
It appeared that using NET_EPOCH_WAIT() while holding global BPF lock
can lead to another panic:

spin lock 0xfffff800183c9840 (turnstile lock) held by 0xfffff80018e2c5a0 (tid 100325) too long
panic: spin lock held too long
...
#0  sched_switch (td=0xfffff80018e2c5a0, newtd=0xfffff8000389e000, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2133
#1  0xffffffff80bf9912 in mi_switch (flags=256, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:439
#2  0xffffffff80c21db7 in sched_bind (td=<optimized out>, cpu=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2704
#3  0xffffffff80c34c33 in epoch_block_handler_preempt (global=<optimized out>, cr=0xfffffe00005a1a00, arg=<optimized out>)
    at /usr/src/sys/kern/subr_epoch.c:394
#4  0xffffffff803c741b in epoch_block (global=<optimized out>, cr=<optimized out>, cb=<optimized out>, ct=<optimized out>)
    at /usr/src/sys/contrib/ck/src/ck_epoch.c:416
#5  ck_epoch_synchronize_wait (global=0xfffff8000380cd80, cb=<optimized out>, ct=<optimized out>) at /usr/src/sys/contrib/ck/src/ck_epoch.c:465
#6  0xffffffff80c3475e in epoch_wait_preempt (epoch=0xfffff8000380cd80) at /usr/src/sys/kern/subr_epoch.c:513
#7  0xffffffff80ce970b in bpf_detachd_locked (d=0xfffff801d309cc00, detached_ifp=<optimized out>) at /usr/src/sys/net/bpf.c:856
#8  0xffffffff80ced166 in bpf_detachd (d=<optimized out>) at /usr/src/sys/net/bpf.c:836
#9  bpf_dtor (data=0xfffff801d309cc00) at /usr/src/sys/net/bpf.c:914

To fix this add the check to the catchpacket() that BPF descriptor was
not detached just before we acquired BPFD_LOCK().

Reported by:	slavash
Tested by:	slavash
MFC after:	1 week
emaste pushed a commit that referenced this pull request Dec 30, 2019
With WITNESS enabled we see the following warning:

    lock order reversal: (sleepable after non-sleepable)
     1st 0xffffffd0847c7210 fu540spi0 (fu540spi0) @
     /usr/home/kp/axiado/hornet-freebsd/src/sys/riscv/sifive/fu540_spi.c:297
      2nd 0xffffffc00372bb30 Clock topology lock (Clock topology lock) @
      /usr/home/kp/axiado/hornet-freebsd/src/sys/dev/extres/clk/clk.c:1137
      stack backtrace:
      #0 0xffffffc0002a579e at witness_checkorder+0xb72
      #1 0xffffffc0002a5556 at witness_checkorder+0x92a
      #2 0xffffffc000254c7a at _sx_slock_int+0x66
      #3 0xffffffc00025537a at _sx_slock+0x8
      #4 0xffffffc000123022 at clk_get_freq+0x38
      #5 0xffffffc0005463e4 at __clzdi2+0x2bb8
      #6 0xffffffc00014af58 at randomdev_getkey+0x76e
      #7 0xffffffc0001278b0 at simplebus_add_device+0x7ee
      #8 0xffffffc00027c9a8 at device_attach+0x2e6
      #9 0xffffffc00027c634 at device_probe_and_attach+0x7a
      #10 0xffffffc00027d76a at bus_generic_attach+0x10
      #11 0xffffffc00014aab0 at randomdev_getkey+0x2c6
      #12 0xffffffc00027c9a8 at device_attach+0x2e6
      #13 0xffffffc00027c634 at device_probe_and_attach+0x7a
      freebsd#14 0xffffffc00027d76a at bus_generic_attach+0x10
      freebsd#15 0xffffffc000278bd2 at config_intrhook_oneshot+0x52
      freebsd#16 0xffffffc000278b3e at config_intrhook_establish+0x146
      freebsd#17 0xffffffc000278cf2 at config_intrhook_disestablish+0xfe

The clock topology lock can sleep, which means we cannot attempt to
acquire it while holding the non-sleepable mutex.

Fix that by retrieving the clock speed once, during attach and not every
time during SPI transaction setup.

Submitted by:   kp
Sponsored by:   Axiado
emaste pushed a commit that referenced this pull request Aug 22, 2020
  Instantiate Error in Target::GetEntryPointAddress() only when
  necessary

  When Target::GetEntryPointAddress() calls
  exe_module->GetObjectFile()->GetEntryPointAddress(), and the returned
  entry_addr is valid, it can immediately be returned.

  However, just before that, an llvm::Error value has been setup, but
  in this case it is not consumed before returning, like is done
  further below in the function.

  In https://bugs.freebsd.org/248745 we got a bug report for this,
  where a very simple test case aborts and dumps core:

  * thread #1, name = 'testcase', stop reason = breakpoint 1.1
      frame #0: 0x00000000002018d4 testcase`main(argc=1, argv=0x00007fffffffea18) at testcase.c:3:5
     1    int main(int argc, char *argv[])
     2    {
  -> 3        return 0;
     4    }
  (lldb) p argc
  Program aborted due to an unhandled Error:
  Error value was Success. (Note: Success values must still be checked prior to being destroyed).

  Thread 1 received signal SIGABRT, Aborted.
  thr_kill () at thr_kill.S:3
  3       thr_kill.S: No such file or directory.
  (gdb) bt
  #0  thr_kill () at thr_kill.S:3
  #1  0x00000008049a0004 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
  #2  0x0000000804916229 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
  #3  0x000000000451b5f5 in fatalUncheckedError () at /usr/src/contrib/llvm-project/llvm/lib/Support/Error.cpp:112
  #4  0x00000000019cf008 in GetEntryPointAddress () at /usr/src/contrib/llvm-project/llvm/include/llvm/Support/Error.h:267
  #5  0x0000000001bccbd8 in ConstructorSetup () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:67
  #6  0x0000000001bcd2c0 in ThreadPlanCallFunction () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:114
  #7  0x00000000020076d4 in InferiorCallMmap () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/Utility/InferiorCallPOSIX.cpp:97
  #8  0x0000000001f4be33 in DoAllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/FreeBSD/ProcessFreeBSD.cpp:604
  #9  0x0000000001fe51b9 in AllocatePage () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:347
  #10 0x0000000001fe5385 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:383
  #11 0x0000000001974da2 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2301
  #12 CanJIT () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2331
  #13 0x0000000001a1bf3d in Evaluate () at /usr/src/contrib/llvm-project/lldb/source/Expression/UserExpression.cpp:190
  freebsd#14 0x00000000019ce7a2 in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Target/Target.cpp:2372
  freebsd#15 0x0000000001ad784c in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:414
  freebsd#16 0x0000000001ad86ae in DoExecute () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:646
  freebsd#17 0x0000000001a5e3ed in Execute () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandObject.cpp:1003
  freebsd#18 0x0000000001a6c4a3 in HandleCommand () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:1762
  freebsd#19 0x0000000001a6f98c in IOHandlerInputComplete () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2760
  freebsd#20 0x0000000001a90b08 in Run () at /usr/src/contrib/llvm-project/lldb/source/Core/IOHandler.cpp:548
  freebsd#21 0x00000000019a6c6a in ExecuteIOHandlers () at /usr/src/contrib/llvm-project/lldb/source/Core/Debugger.cpp:903
  freebsd#22 0x0000000001a70337 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2946
  freebsd#23 0x0000000001d9d812 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/API/SBDebugger.cpp:1169
  freebsd#24 0x0000000001918be8 in MainLoop () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:675
  freebsd#25 0x000000000191a114 in main () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:890

  Fix the incorrect error catch by only instantiating an Error object
  if it is necessary.

  Reviewed By: JDevlieghere

  Differential Revision: https://reviews.llvm.org/D86355

This should fix lldb aborting as described in the scenario above.

Reported by:	dmgk
PR:		248745
emaste pushed a commit that referenced this pull request Jun 2, 2021
- Allow firmware downloading for hw_variant #8;
- Enter manufacturer mode for setting of event mask;
- Handle multi-event response on HCI commands for 7260;
  This allows to remove kludge with skipping of 0xfc2f opcode.
- Disable patch and exit manufacturer mode on downloading failure;
- Use default firmware if correct firmware file is not found;

Reviewed by:	Philippe Michaud-Boudreault <pitwuu_AT_gmail_DOT_com>
MFC after:	1 week
Tested by:	arrowd
Differential revision:	https://reviews.freebsd.org/D30543
emaste pushed a commit that referenced this pull request Jul 11, 2021
- Allow firmware downloading for hw_variant #8;
- Enter manufacturer mode for setting of event mask;
- Handle multi-event response on HCI commands for 7260;
  This allows to remove kludge with skipping of 0xfc2f opcode.
- Disable patch and exit manufacturer mode on downloading failure;
- Use default firmware if correct firmware file is not found;

Reviewed by:	Philippe Michaud-Boudreault <pitwuu_AT_gmail_DOT_com>
Tested by:	arrowd
Differential revision:	https://reviews.freebsd.org/D30543

(cherry picked from commit da93a73)
emaste pushed a commit that referenced this pull request Nov 10, 2021
Merge commit 1ce07cd614be from llvm git (by me):

  Instantiate Error in Target::GetEntryPointAddress() only when
  necessary

  When Target::GetEntryPointAddress() calls
  exe_module->GetObjectFile()->GetEntryPointAddress(), and the returned
  entry_addr is valid, it can immediately be returned.

  However, just before that, an llvm::Error value has been setup, but
  in this case it is not consumed before returning, like is done
  further below in the function.

  In https://bugs.freebsd.org/248745 we got a bug report for this,
  where a very simple test case aborts and dumps core:

  * thread #1, name = 'testcase', stop reason = breakpoint 1.1
      frame #0: 0x00000000002018d4 testcase`main(argc=1, argv=0x00007fffffffea18) at testcase.c:3:5
     1    int main(int argc, char *argv[])
     2    {
  -> 3        return 0;
     4    }
  (lldb) p argc
  Program aborted due to an unhandled Error:
  Error value was Success. (Note: Success values must still be checked prior to being destroyed).

  Thread 1 received signal SIGABRT, Aborted.
  thr_kill () at thr_kill.S:3
  3       thr_kill.S: No such file or directory.
  (gdb) bt
  #0  thr_kill () at thr_kill.S:3
  #1  0x00000008049a0004 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
  #2  0x0000000804916229 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
  #3  0x000000000451b5f5 in fatalUncheckedError () at /usr/src/contrib/llvm-project/llvm/lib/Support/Error.cpp:112
  #4  0x00000000019cf008 in GetEntryPointAddress () at /usr/src/contrib/llvm-project/llvm/include/llvm/Support/Error.h:267
  #5  0x0000000001bccbd8 in ConstructorSetup () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:67
  #6  0x0000000001bcd2c0 in ThreadPlanCallFunction () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:114
  #7  0x00000000020076d4 in InferiorCallMmap () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/Utility/InferiorCallPOSIX.cpp:97
  #8  0x0000000001f4be33 in DoAllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/FreeBSD/ProcessFreeBSD.cpp:604
  #9  0x0000000001fe51b9 in AllocatePage () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:347
  #10 0x0000000001fe5385 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:383
  #11 0x0000000001974da2 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2301
  #12 CanJIT () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2331
  #13 0x0000000001a1bf3d in Evaluate () at /usr/src/contrib/llvm-project/lldb/source/Expression/UserExpression.cpp:190
  freebsd#14 0x00000000019ce7a2 in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Target/Target.cpp:2372
  freebsd#15 0x0000000001ad784c in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:414
  freebsd#16 0x0000000001ad86ae in DoExecute () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:646
  freebsd#17 0x0000000001a5e3ed in Execute () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandObject.cpp:1003
  freebsd#18 0x0000000001a6c4a3 in HandleCommand () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:1762
  freebsd#19 0x0000000001a6f98c in IOHandlerInputComplete () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2760
  freebsd#20 0x0000000001a90b08 in Run () at /usr/src/contrib/llvm-project/lldb/source/Core/IOHandler.cpp:548
  freebsd#21 0x00000000019a6c6a in ExecuteIOHandlers () at /usr/src/contrib/llvm-project/lldb/source/Core/Debugger.cpp:903
  freebsd#22 0x0000000001a70337 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2946
  freebsd#23 0x0000000001d9d812 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/API/SBDebugger.cpp:1169
  freebsd#24 0x0000000001918be8 in MainLoop () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:675
  freebsd#25 0x000000000191a114 in main () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:890

  Fix the incorrect error catch by only instantiating an Error object
  if it is necessary.

  Reviewed By: JDevlieghere

  Differential Revision: https://reviews.llvm.org/D86355

This should fix lldb aborting as described in the scenario above.

Reported by:	dmgk
PR:		248745
Approved by:	so
Security:	FreeBSD-EN-21:07.lldb

(cherry picked from commit eb41eed)
emaste pushed a commit that referenced this pull request Mar 10, 2022
Interesting fixes:
47661d0 Match libc++abi/libsupc++ when demangling array types
e44a05c Fix unitialized variable in __cxa_demangle_gnu3 after #6 (#8)
5088b05 Remove some code duplication.
fd484be Atomics cleanup (#11)
emaste pushed a commit that referenced this pull request Mar 10, 2022
Interesting fixes:
47661d0 Match libc++abi/libsupc++ when demangling array types
e44a05c Fix unitialized variable in __cxa_demangle_gnu3 after #6 (#8)
5088b05 Remove some code duplication.
fd484be Atomics cleanup (#11)

MFC after:	2 weeks
emaste pushed a commit that referenced this pull request Mar 24, 2022
Interesting fixes:
47661d0 Match libc++abi/libsupc++ when demangling array types
e44a05c Fix unitialized variable in __cxa_demangle_gnu3 after #6 (#8)
5088b05 Remove some code duplication.
fd484be Atomics cleanup (#11)

MFC after:	2 weeks

(cherry picked from commit 56aaed3)
emaste pushed a commit that referenced this pull request Apr 3, 2023
Under certain loads, the following panic is hit:

    panic: page fault
    KDB: stack backtrace:
    #0 0xffffffff805db025 at kdb_backtrace+0x65
    #1 0xffffffff8058e86f at vpanic+0x17f
    #2 0xffffffff8058e6e3 at panic+0x43
    #3 0xffffffff808adc15 at trap_fatal+0x385
    #4 0xffffffff808adc6f at trap_pfault+0x4f
    #5 0xffffffff80886da8 at calltrap+0x8
    #6 0xffffffff80669186 at vgonel+0x186
    #7 0xffffffff80669841 at vgone+0x31
    #8 0xffffffff8065806d at vfs_hash_insert+0x26d
    #9 0xffffffff81a39069 at sfs_vgetx+0x149
    #10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
    #11 0xffffffff8065a28c at lookup+0x45c
    #12 0xffffffff806594b9 at namei+0x259
    #13 0xffffffff80676a33 at kern_statat+0xf3
    freebsd#14 0xffffffff8067712f at sys_fstatat+0x2f
    freebsd#15 0xffffffff808ae50c at amd64_syscall+0x10c
    freebsd#16 0xffffffff808876bb at fast_syscall_common+0xf8

The page fault occurs because vgonel() will call VOP_CLOSE() for active
vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While
here, define vop_open for consistency.

After adding the necessary vop, the bug progresses to the following
panic:

    panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1)
    cpuid = 17
    KDB: stack backtrace:
    #0 0xffffffff805e29c5 at kdb_backtrace+0x65
    #1 0xffffffff8059620f at vpanic+0x17f
    #2 0xffffffff81a27f4a at spl_panic+0x3a
    #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40
    #4 0xffffffff8066fdee at vinactivef+0xde
    #5 0xffffffff80670b8a at vgonel+0x1ea
    #6 0xffffffff806711e1 at vgone+0x31
    #7 0xffffffff8065fa0d at vfs_hash_insert+0x26d
    #8 0xffffffff81a39069 at sfs_vgetx+0x149
    #9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
    #10 0xffffffff80661c2c at lookup+0x45c
    #11 0xffffffff80660e59 at namei+0x259
    #12 0xffffffff8067e3d3 at kern_statat+0xf3
    #13 0xffffffff8067eacf at sys_fstatat+0x2f
    freebsd#14 0xffffffff808b5ecc at amd64_syscall+0x10c
    freebsd#15 0xffffffff8088f07b at fast_syscall_common+0xf8

This is caused by a race condition that can occur when allocating a new
vnode and adding that vnode to the vfs hash. If the newly created vnode
loses the race when being inserted into the vfs hash, it will not be
recycled as its usecount is greater than zero, hitting the above
assertion.

Fix this by dropping the assertion.

FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700
Reviewed-by: Andriy Gapon <[email protected]>
Reviewed-by: Mateusz Guzik <[email protected]>
Reviewed-by: Alek Pinchuk <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Rob Wing <[email protected]>
Co-authored-by: Rob Wing <[email protected]>
Submitted-by: Klara, Inc.
Sponsored-by: rsync.net
Closes #14501
emaste pushed a commit that referenced this pull request Jun 5, 2023
- Allow firmware downloading for hw_variant #8;
- Enter manufacturer mode for setting of event mask;
- Handle multi-event response on HCI commands for 7260;
  This allows to remove kludge with skipping of 0xfc2f opcode.
- Disable patch and exit manufacturer mode on downloading failure;
- Use default firmware if correct firmware file is not found;

Reviewed by:	Philippe Michaud-Boudreault <pitwuu_AT_gmail_DOT_com>
MFC after:	1 week
Tested by:	arrowd
Differential revision:	https://reviews.freebsd.org/D30543

(cherry picked from commit da93a73)
emaste pushed a commit that referenced this pull request Jul 17, 2023
Under certain loads, the following panic is hit:

    panic: page fault
    KDB: stack backtrace:
    #0 0xffffffff805db025 at kdb_backtrace+0x65
    #1 0xffffffff8058e86f at vpanic+0x17f
    #2 0xffffffff8058e6e3 at panic+0x43
    #3 0xffffffff808adc15 at trap_fatal+0x385
    #4 0xffffffff808adc6f at trap_pfault+0x4f
    #5 0xffffffff80886da8 at calltrap+0x8
    #6 0xffffffff80669186 at vgonel+0x186
    #7 0xffffffff80669841 at vgone+0x31
    #8 0xffffffff8065806d at vfs_hash_insert+0x26d
    #9 0xffffffff81a39069 at sfs_vgetx+0x149
    #10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
    #11 0xffffffff8065a28c at lookup+0x45c
    #12 0xffffffff806594b9 at namei+0x259
    #13 0xffffffff80676a33 at kern_statat+0xf3
    freebsd#14 0xffffffff8067712f at sys_fstatat+0x2f
    freebsd#15 0xffffffff808ae50c at amd64_syscall+0x10c
    freebsd#16 0xffffffff808876bb at fast_syscall_common+0xf8

The page fault occurs because vgonel() will call VOP_CLOSE() for active
vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While
here, define vop_open for consistency.

After adding the necessary vop, the bug progresses to the following
panic:

    panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1)
    cpuid = 17
    KDB: stack backtrace:
    #0 0xffffffff805e29c5 at kdb_backtrace+0x65
    #1 0xffffffff8059620f at vpanic+0x17f
    #2 0xffffffff81a27f4a at spl_panic+0x3a
    #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40
    #4 0xffffffff8066fdee at vinactivef+0xde
    #5 0xffffffff80670b8a at vgonel+0x1ea
    #6 0xffffffff806711e1 at vgone+0x31
    #7 0xffffffff8065fa0d at vfs_hash_insert+0x26d
    #8 0xffffffff81a39069 at sfs_vgetx+0x149
    #9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
    #10 0xffffffff80661c2c at lookup+0x45c
    #11 0xffffffff80660e59 at namei+0x259
    #12 0xffffffff8067e3d3 at kern_statat+0xf3
    #13 0xffffffff8067eacf at sys_fstatat+0x2f
    freebsd#14 0xffffffff808b5ecc at amd64_syscall+0x10c
    freebsd#15 0xffffffff8088f07b at fast_syscall_common+0xf8

This is caused by a race condition that can occur when allocating a new
vnode and adding that vnode to the vfs hash. If the newly created vnode
loses the race when being inserted into the vfs hash, it will not be
recycled as its usecount is greater than zero, hitting the above
assertion.

Fix this by dropping the assertion.

FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700
Reviewed-by: Andriy Gapon <[email protected]>
Reviewed-by: Mateusz Guzik <[email protected]>
Reviewed-by: Alek Pinchuk <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Rob Wing <[email protected]>
Co-authored-by: Rob Wing <[email protected]>
Submitted-by: Klara, Inc.
Sponsored-by: rsync.net
Closes #14501
emaste pushed a commit that referenced this pull request Jul 31, 2023
Avoid locking issues when if_allmulti() calls the driver's if_ioctl,
because that may acquire sleepable locks (while we hold a non-sleepable
rwlock).

Fortunately there's no pressing need to hold the mroute lock while we
do this, so we can postpone the call slightly, until after we've
released the lock.

This avoids the following WITNESS warning (with iflib drivers):

	lock order reversal: (sleepable after non-sleepable)
	 1st 0xffffffff82f64960 IPv4 multicast forwarding (IPv4 multicast forwarding, rw) @ /usr/src/sys/netinet/ip_mroute.c:1050
	 2nd 0xfffff8000480f180 iflib ctx lock (iflib ctx lock, sx) @ /usr/src/sys/net/iflib.c:4525
	lock order IPv4 multicast forwarding -> iflib ctx lock attempted at:
	#0 0xffffffff80bbd6ce at witness_checkorder+0xbbe
	#1 0xffffffff80b56d10 at _sx_xlock+0x60
	#2 0xffffffff80c9ce5c at iflib_if_ioctl+0x2dc
	#3 0xffffffff80c7c395 at if_setflag+0xe5
	#4 0xffffffff82f60a0e at del_vif_locked+0x9e
	#5 0xffffffff82f5f0d5 at X_ip_mrouter_set+0x265
	#6 0xffffffff80bfd402 at sosetopt+0xc2
	#7 0xffffffff80c02105 at kern_setsockopt+0xa5
	#8 0xffffffff80c02054 at sys_setsockopt+0x24
	#9 0xffffffff81046be8 at amd64_syscall+0x138
	#10 0xffffffff8101930b at fast_syscall_common+0xf8

See also:	https://redmine.pfsense.org/issues/12079
Reviewed by:	mjg
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D41209
emaste pushed a commit that referenced this pull request Sep 16, 2023
netlink(4) calls back into the driver during detach and it attempts to
start an internal synchronized op recursively, causing an interruptible
hang.  Fix it by failing the ioctl if the VI has been marked as DOOMED
by cxgbe_detach.

Here's the stack for the hang for reference.
 #6  begin_synchronized_op
 #7  cxgbe_media_status
 #8  ifmedia_ioctl
 #9  cxgbe_ioctl
 #10 if_ioctl
 #11 get_operstate_ether
 #12 get_operstate
 #13 dump_iface
 freebsd#14 rtnl_handle_ifevent
 freebsd#15 rtnl_handle_ifnet_event
 freebsd#16 rt_ifmsg
 freebsd#17 if_unroute
 freebsd#18 if_down
 freebsd#19 if_detach_internal
 freebsd#20 if_detach
 freebsd#21 ether_ifdetach
 freebsd#22 cxgbe_vi_detach
 freebsd#23 cxgbe_detach
 freebsd#24 DEVICE_DETACH

MFC after:	3 days
Sponsored by:	Chelsio Communications
emaste pushed a commit that referenced this pull request Oct 13, 2023
netlink(4) calls back into the driver during detach and it attempts to
start an internal synchronized op recursively, causing an interruptible
hang.  Fix it by failing the ioctl if the VI has been marked as DOOMED
by cxgbe_detach.

Here's the stack for the hang for reference.
 #6  begin_synchronized_op
 #7  cxgbe_media_status
 #8  ifmedia_ioctl
 #9  cxgbe_ioctl
 #10 if_ioctl
 #11 get_operstate_ether
 #12 get_operstate
 #13 dump_iface
 freebsd#14 rtnl_handle_ifevent
 freebsd#15 rtnl_handle_ifnet_event
 freebsd#16 rt_ifmsg
 freebsd#17 if_unroute
 freebsd#18 if_down
 freebsd#19 if_detach_internal
 freebsd#20 if_detach
 freebsd#21 ether_ifdetach
 freebsd#22 cxgbe_vi_detach
 freebsd#23 cxgbe_detach
 freebsd#24 DEVICE_DETACH

MFC after:	3 days
Sponsored by:	Chelsio Communications

(cherry picked from commit 3814249)
emaste pushed a commit that referenced this pull request Oct 25, 2023
netlink(4) calls back into the driver during detach and it attempts to
start an internal synchronized op recursively, causing an interruptible
hang.  Fix it by failing the ioctl if the VI has been marked as DOOMED
by cxgbe_detach.

Here's the stack for the hang for reference.
 #6  begin_synchronized_op
 #7  cxgbe_media_status
 #8  ifmedia_ioctl
 #9  cxgbe_ioctl
 #10 if_ioctl
 #11 get_operstate_ether
 #12 get_operstate
 #13 dump_iface
 freebsd#14 rtnl_handle_ifevent
 freebsd#15 rtnl_handle_ifnet_event
 freebsd#16 rt_ifmsg
 freebsd#17 if_unroute
 freebsd#18 if_down
 freebsd#19 if_detach_internal
 freebsd#20 if_detach
 freebsd#21 ether_ifdetach
 freebsd#22 cxgbe_vi_detach
 freebsd#23 cxgbe_detach
 freebsd#24 DEVICE_DETACH

Sponsored by:	Chelsio Communications
Approved by:	re (kib)

(cherry picked from commit 3814249)
(cherry picked from commit 3287f64)
emaste pushed a commit that referenced this pull request Nov 19, 2024
Avoid calling _callout_stop_safe with a non-sleepable lock held when
detaching by initializing callout_init_rw() with CALLOUT_SHAREDLOCK.

It avoids the following WITNESS warning when stopping the service:

    # service ipfilter stop
    calling _callout_stop_safe with the following non-sleepable locks held:
    shared rw ipf filter load/unload mutex (ipf filter load/unload mutex) r = 0 (0xffff0000417c7530) locked @ /usr/src/sys/netpfil/ipfilter/netinet/fil.c:7926
    stack backtrace:
    #0 0xffff00000052d394 at witness_debugger+0x60
    #1 0xffff00000052e620 at witness_warn+0x404
    #2 0xffff0000004d4ffc at _callout_stop_safe+0x8c
    #3 0xffff0000f7236674 at ipfdetach+0x3c
    #4 0xffff0000f723fa4c at ipf_ipf_ioctl+0x788
    #5 0xffff0000f72367e0 at ipfioctl+0x144
    #6 0xffff00000034abd8 at devfs_ioctl+0x100
    #7 0xffff0000005c66a0 at vn_ioctl+0xbc
    #8 0xffff00000034b2cc at devfs_ioctl_f+0x24
    #9 0xffff0000005331ec at kern_ioctl+0x2e0
    #10 0xffff000000532eb4 at sys_ioctl+0x140
    #11 0xffff000000880480 at do_el0_sync+0x604
    #12 0xffff0000008579ac at handle_el0_sync+0x4c

PR:		282478
Suggested by:	markj
Reviewed by:	cy
Approved by:	emaste (mentor)
MFC after:	1 week
emaste pushed a commit that referenced this pull request May 1, 2025
Because we do not drain the hdac callout during detach, hot-unloading
can result in a panic if the callback fires after we have freed the
resources it uses.

\# kldunload snd_hda
pcm0: detached
pcm1: detached
hdaa0: detached
hdacc0: detached
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex hdac0 (HDA driver mutex) r = 0 (0xfffffe000212c820) locked @ /mnt/src/sys/dev/sound/pci/hda/hdac.c:400
stack backtrace:
\#0 0xffffffff811c97df at witness_debugger+0x13f
\#1 0xffffffff811cb364 at witness_warn+0x674
\#2 0xffffffff81a212c6 at trap_pfault+0x116
\#3 0xffffffff81a2023c at trap+0x54c
\#4 0xffffffff819dc7f8 at calltrap+0x8
\#5 0xffffffff8412d05c at hdac_intr_handler+0x15c
\#6 0xffffffff81066e17 at ithread_loop+0x387
\#7 0xffffffff81060e93 at fork_exit+0xa3
\#8 0xffffffff819dd85e at fork_trampoline+0xe

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x180
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8412e7ec
stack pointer           = 0x0:0xfffffe0046b04d20
frame pointer           = 0x0:0xfffffe0046b04d40
code segment            = base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq59: hdac0)
rdi: fffffe005d38e288 rsi: ffffffff8413b1c8 rdx: 0000000000000030
rcx: 0000000000000000  r8: 0000000000000001  r9: 0000000000000002
rax: fffffe00024cc800 rbx: 0000000000000002 rbp: fffffe0046b04d40
r10: 3433313533372e30 r11: 0000000000000006 r12: fffffe005d38e200
r13: 0000000000000005 r14: 0000000000000001 r15: fffffe005d38e500
trap number             = 12
panic: page fault
cpuid = 1
time = 1745701580
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0xa5/frame 0xfffffe0046b04490
kdb_backtrace() at kdb_backtrace+0xc6/frame 0xfffffe0046b045f0
vpanic() at vpanic+0x226/frame 0xfffffe0046b04790
panic() at panic+0xb5/frame 0xfffffe0046b04850
trap_fatal() at trap_fatal+0x65b/frame 0xfffffe0046b04950
trap_pfault() at trap_pfault+0x12b/frame 0xfffffe0046b04a70
trap() at trap+0x54c/frame 0xfffffe0046b04c50
calltrap() at calltrap+0x8/frame 0xfffffe0046b04c50
--- trap 0xc, rip = 0xffffffff8412e7ec, rsp = 0xfffffe0046b04d20, rbp = 0xfffffe0046b04d40 ---
hdacc_stream_intr() at hdacc_stream_intr+0x3c/frame 0xfffffe0046b04d40
hdac_intr_handler() at hdac_intr_handler+0x15c/frame 0xfffffe0046b04d90
ithread_loop() at ithread_loop+0x387/frame 0xfffffe0046b04ef0
fork_exit() at fork_exit+0xa3/frame 0xfffffe0046b04f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0046b04f30
--- trap 0xc, rip = 0x829cc531a, rsp = 0x82b8b7a88, rbp = 0x82b8b7aa0 ---
KDB: enter: panic
[ thread pid 12 tid 100211 ]
Stopped at      kdb_enter+0x34: movq    $0,0x1f09af1(%rip)
db>

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week

(cherry picked from commit 8d69f23)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants