Re: [PATCH v4 2/6] perf: Enqueue SIGTRAP always via task_work.
From: Lai, Yi
Date: Tue Dec 03 2024 - 22:03:16 EST
On Mon, Nov 11, 2024 at 01:08:57PM +0100, Sebastian Andrzej Siewior wrote:
> On 2024-11-08 23:26:36 [+0100], Frederic Weisbecker wrote:
> > > Please see
> > > https://lore.kernel.org/all/1440816150.8932.123.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> > > and the whole thread.
>
> Thank you for this Oleg.
>
> > > I don't think raw_spin_lock_irq + cmpxchg for each work is a good
> > > idea, but quite possibly I misunderstood this change.
> >
> > I did not realize there could be gazillion files released in a row. So there
> > could be noticeable performance issues I guess...
>
> I made a testcase to open 2M (2 *10^6) files and then exit. This led
> task_work_run() run 2M + 3 callbacks (+ stdin/out/err) for the task.
>
> Running 70 samples on the "orig" kernel:
> - avg callback time 1.156.470,3 us
> - 63 samples are starting with 11 (1,1 sec) avg: 1.128.046,7 us
> - 6 samples are starting with 14 (1,4 sec) avg: 1.435.294,8us
>
> Running 70 samples on the "patched" kernel:
> - avg callback time 1.278.938,8 us
> - 59 samples are starting with 12 (1,2 sec) avg: 1.230.189,1 us
> - 10 samples are starting with 15 (1,5sec) avg: 1.555.934,5 us
>
> With the extra lock the task_work_run() runtime extends by approximately
> ~122ms for the 2M invoked callbacks.
> The spike 1,1sec -> 1,4sec or 1,2sec -> 1,5 sec is due to context
> switching (there are few cond_resched()/ might_sleep()).
>
> It is not that bad, is it?
>
Hi,
Do we reach consensus about the fix? The issue can still be reproduced
using v6.13-rc1 kernel. Call trace:
[ 300.429498] ? lock_acquire+0x80/0xb0
[ 300.429695] ? schedule+0x216/0x3f0
[ 300.429888] schedule+0xf6/0x3f0
[ 300.430068] _free_event+0x531/0x14c0
[ 300.430277] perf_event_release_kernel+0x648/0x870
[ 300.430680] ? __pfx_perf_event_release_kernel+0x10/0x10
[ 300.430969] ? __this_cpu_preempt_check+0x21/0x30
[ 300.431228] ? __sanitizer_cov_trace_const_cmp2+0x1c/0x30
[ 300.431519] ? __pfx_perf_release+0x10/0x10
[ 300.431746] perf_release+0x3a/0x50
[ 300.431940] __fput+0x414/0xb60
[ 300.432127] ____fput+0x22/0x30
[ 300.432303] task_work_run+0x19c/0x2b0
[ 300.432518] ? __pfx_task_work_run+0x10/0x10
[ 300.432752] ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[ 300.433038] ? switch_task_namespaces+0xc6/0x110
[ 300.433295] do_exit+0xb19/0x2a30
[ 300.433484] ? lockdep_hardirqs_on+0x89/0x110
[ 300.433728] ? __pfx_do_exit+0x10/0x10
[ 300.433942] do_group_exit+0xe4/0x2c0
[ 300.434147] get_signal+0x2279/0x24c0
[ 300.434360] ? __pfx_get_signal+0x10/0x10
[ 300.434601] ? __might_fault+0xf1/0x1b0
[ 300.434824] arch_do_signal_or_restart+0x8e/0x7d0
[ 300.435083] ? __pfx_arch_do_signal_or_restart+0x10/0x10
[ 300.435375] ? __this_cpu_preempt_check+0x21/0x30
[ 300.435629] ? syscall_exit_to_user_mode+0x10f/0x200
[ 300.435901] syscall_exit_to_user_mode+0x144/0x200
[ 300.436161] do_syscall_64+0x79/0x140
[ 300.436366] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 300.436643] RIP: 0033:0x7f5add83ee5d
[ 300.436843] RSP: 002b:00007fff21a70e68 EFLAGS: 00000206 ORIG_RAX: 000000000000012a
[ 300.437240] RAX: 0000000000000003 RBX: 0000000000000000 RCX: 00007f5add83ee5d
[ 300.437610] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000080
[ 300.437969] RBP: 00007fff21a70e70 R08: 000000000000000b R09: 00007fff21a70ea0
[ 300.438329] R10: 00000000ffffffff R11: 0000000000000206 R12: 00007fff21a70fc8
[ 300.438703] R13: 0000000000401b4f R14: 0000000000403e08 R15: 00007f5addc0e000
[ 300.439075] </TASK>
[ 300.439197]
[ 300.439197] Showing all locks held in the system:
[ 300.439513] 1 lock held by khungtaskd/33:
[ 300.439723] #0: ffffffff8705ca40 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x73/0x3c0
[ 300.440199]
[ 300.440288] =============================================
[ 300.440288]
Regards,
Yi Lai
> > Thanks.
> > >
> > > Oleg.
>
> Sebastian