Re: [PATCH stable] notifiers: Add oops check in blocking_notifier_call_chain()
From: yiyang (D)
Date: Tue Nov 11 2025 - 21:46:19 EST
On 2025/10/22 11:36, yiyang (D) wrote:
On 2025/10/18 6:25, Andrew Morton wrote:
Do you think it is necessary to merge this patch into the 6.6 stable branch (or earlier versions)?
On Fri, 17 Oct 2025 06:17:40 +0000 Yi Yang <yiyang13@xxxxxxxxxx> wrote:
Below is an excerpt from the original error message:
In hrtimer_interrupt(), interrupts are disabled when acquiring a spinlock,
which subsequently triggers an oops. During the oops call chain,
blocking_notifier_call_chain() invokes _cond_resched, ultimately leading
to a hard lockup.
Call Stack:
hrtimer_interrupt//raw_spin_lock_irqsave
__hrtimer_run_queues
page_fault
do_page_fault
bad_area_nosemaphore
no_context
oops_end
bust_spinlocks
unblank_screen
do_unblank_screen
fbcon_blank
fb_notifier_call_chain
blocking_notifier_call_chain
down_read
_cond_resched
Seems this trace is upside-down relative to what we usually see.
Is the unaltered dmesg output available?
#0[ffff8a317f6c3ac0] __cond_resched at ffffffffa10d29a6
#1[ffff8a317f6c3ad8] _cond_resched at ffffffffa17292cf
#2[ffff8a317f6c3ae8] down_read at ffffffffa1728022
#3[ffff8a317f6c3b00] __blocking_notifier_call_chain at ffffffffa10c5c37
#4[ffff8a317f6c3b40] blocking_notifier_call_chain at ffffffffa10c5c86
#5[ffff8a317f6c3b50] fb_notifier_call_chain at ffffffffa13c83eb
#6[ffff8a317f6c3b60] fb_blank at ffffffffa13c88eb
#7[ffff8a317f6c3ba0] fbcon_blank at ffffffffa13d4a4b
#8[ffff8a317f6c3ca0] do_unblank_screen at ffffffffa144cb30
#9[ffff8a317f6c3cc0] unblank_screen at ffffffffa144cbf0
#10[ffff8a317f6c3ce0] oops_end at ffffffffa172d6d5
#11[ffff8a317f6c3d08] no_context at ffffffffa171cebc
#12[ffff8a317f6c3d58] __bad_area_nosemaphore at ffffffffa171cf53
#13[ffff8a317f6c3da8] bad_area_nosemaphore at ffffffffa171d0c4
#14[ffff8a317f6c3db8] __do_page_fault at ffffffffa17306b0
#15[ffff8a317f6c3e20] do_page_fault at ffffffffa1730895
#16[ffff8a317f6c3e50] page_fault at ffffffffa172c768
fb_notifier_call_chain() is called from both fb_blank() and fb_set_var(); it is not limited to builds with defined(CONFIG_GUMSTIX_AM200EPD).
If the system is in an oops state, use down_read_trylock instead of a
blocking lock acquisition. If the trylock fails, skip executing the
notifier callbacks to avoid potential deadlocks or unsafe operations
during the oops handling process.
...
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -384,9 +384,18 @@ int blocking_notifier_call_chain(struct blocking_notifier_head *nh,
 	 * is, we re-check the list after having taken the lock anyway:
 	 */
 	if (rcu_access_pointer(nh->head)) {
-		down_read(&nh->rwsem);
-		ret = notifier_call_chain(&nh->head, val, v, -1, NULL);
-		up_read(&nh->rwsem);
+		if (!oops_in_progress) {
+			down_read(&nh->rwsem);
+			ret = notifier_call_chain(&nh->head, val, v, -1, NULL);
+			up_read(&nh->rwsem);
+		} else {
+			if (down_read_trylock(&nh->rwsem)) {
+				ret = notifier_call_chain(&nh->head, val, v, -1, NULL);
+				up_read(&nh->rwsem);
+			} else {
+				ret = NOTIFY_BAD;
+			}
+		}
 	}
 	return ret;
Am I correct in believing that fb_notifier_call_chain() is only ever
called if defined(CONFIG_GUMSTIX_AM200EPD)?
I wonder what that call is for, and if we can simply remove it.
The call that hits the problem is `fb_notifier_call_chain(FB_EVENT_BLANK, &event);`.
It invokes the notifier chain callbacks that have registered for the FB_EVENT_BLANK event.
The FB_EVENT_BLANK event appears to signal a change in the screen blank/unblank state.
Currently, when an oops occurs, the actual panic stack trace is never printed, because the oops path blocks inside the notifier chain before it gets that far.