Re: [PATCH printk v2 00/18] add threaded printing + the rest

From: Juri Lelli
Date: Tue Jun 04 2024 - 09:31:58 EST


Hi John,

On 04/06/24 01:30, John Ogness wrote:
> Hi,
>
> This is v2 of a series to implement threaded console printing as well
> as some other minor pieces (such as proc and sysfs support). This
> series is only a subset of the original v1 [0]. In particular, this
> series represents patches 11, 12, 15 of the v1 series. For information
> about the motivation of the nbcon consoles, please read the cover
> letter of v1.
>
> This series provides the remaining pieces of the printk rework. All
> other components are either already mainline or are currently in
> linux-next. In particular this series does:

Our QE reported something like the following while testing the latest
rt-devel branch (I then could reproduce with this set applied on top of
linux-next).

---
... kernel: INFO: task khugepaged:351 blocked for more than 1 seconds.
... kernel: Not tainted 6.9.0-thrdprintk+ #3
... kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
... kernel: task:khugepaged state:D stack:0 pid:351 tgid:351 ppid:2 flags:0x00004000
... kernel: Call Trace:
... kernel: <TASK>
... kernel: __schedule+0x2bd/0x7f0
... kernel: ? __lock_release.isra.0+0x5e/0x170
... kernel: schedule+0x3d/0x100
... kernel: schedule_timeout+0x1ca/0x1f0
... kernel: ? mark_held_locks+0x49/0x80
... kernel: ? _raw_spin_unlock_irq+0x24/0x50
... kernel: ? lockdep_hardirqs_on+0x77/0x100
... kernel: __wait_for_common+0xb7/0x220
... kernel: ? __pfx_schedule_timeout+0x10/0x10
... kernel: __flush_work+0x70/0x90
... kernel: ? __pfx_wq_barrier_func+0x10/0x10
... kernel: __lru_add_drain_all+0x179/0x210
... kernel: khugepaged+0x73/0x200
... kernel: ? lockdep_hardirqs_on+0x77/0x100
... kernel: ? _raw_spin_unlock_irqrestore+0x38/0x60
... kernel: ? __pfx_khugepaged+0x10/0x10
... kernel: kthread+0xec/0x120
... kernel: ? __pfx_kthread+0x10/0x10
... kernel: ret_from_fork+0x2d/0x50
... kernel: ? __pfx_kthread+0x10/0x10
... kernel: ret_from_fork_asm+0x1a/0x30
... kernel: </TASK>
... kernel:
... Showing all locks held in the system:
... kernel: 1 lock held by khungtaskd/345:
... kernel: #0: ffffffff8cbff1c0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x32/0x1d0
... kernel: BUG: using smp_processor_id() in preemptible [00000000] code: khungtaskd/345
... kernel: caller is nbcon_get_cpu_emergency_nesting+0x25/0x40
... kernel: CPU: 30 PID: 345 Comm: khungtaskd Kdump: loaded Not tainted 6.9.0-thrdprintk+ #3
... kernel: Hardware name: Dell Inc. PowerEdge R740/04FC42, BIOS 2.10.2 02/24/2021
... kernel: Call Trace:
... kernel: <TASK>
... kernel: dump_stack_lvl+0x7f/0xa0
... kernel: check_preemption_disabled+0xbf/0xe0
... kernel: nbcon_get_cpu_emergency_nesting+0x25/0x40
... kernel: nbcon_cpu_emergency_flush+0xa/0x60
... kernel: debug_show_all_locks+0x9d/0x1d0
... kernel: check_hung_uninterruptible_tasks+0x4f0/0x540
... kernel: ? check_hung_uninterruptible_tasks+0x185/0x540
... kernel: ? __pfx_watchdog+0x10/0x10
... kernel: watchdog+0x99/0xa0
... kernel: kthread+0xec/0x120
... kernel: ? __pfx_kthread+0x10/0x10
... kernel: ret_from_fork+0x2d/0x50
... kernel: ? __pfx_kthread+0x10/0x10
... kernel: ret_from_fork_asm+0x1a/0x30
... kernel: </TASK>
---

It requires DEBUG_PREEMPT and LOCKDEP enabled, sched_rt_runtime_us = -1
and a while(1) loop running at FIFO for some time (I also set sysctl
kernel.hung_task_timeout_secs=1 to speed up reproduction).

Looks like check_hung_uninterruptible_tasks() requires some care as you
did already in linux-next for panic, rcu and lockdep ("Make emergency
sections ...")?

Thanks,
Juri