Re: [ BUG: Invalid wait context ] rtc_lock at: mc146818_avoid_UIP

From: Mateusz Jończyk
Date: Thu Apr 03 2025 - 19:53:59 EST


On 3.04.2025 at 15:12, Thomas Gleixner wrote:
> On Sun, Mar 30 2025 at 13:32, Borislav Petkov wrote:
>> while playing with suspend to RAM, I got this lockdep splat below.
>>
>> Poking around I found:
>>
>> ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP")
>>
>> which is doing this funky taking and dropping the rtc_lock and I guess that's
>> inherited from ye olde times.
>>
>> I "fixed" it so lockdep doesn't warn by converting rtc_lock to a raw spinlock
>> but this is definitely not the right fix so let me bounce it off to the folks
>> on Cc who might have a better idea perhaps...
>
> Converting it to a raw lock "fixes" the problem, but RT people will hunt
> you down with a big latency bat.
>
> But this is not related to the commit above and not new.
>
> timekeeping_suspend() has always invoked mach_get_cmos_time() with the
> freeze lock held and mc146818_get_time() has always locked rtc_lock.
>
> I wonder, why this splat hasn't popped before. On RT lockdep should have
> complained forever. Sebastian???

Hello,

I was able to trigger the bug on Linux 6.1.0 with
CONFIG_PROVE_RAW_LOCK_NESTING enabled and I suspect it can be
triggered much earlier.

People are most likely seeing this only since

commit 560af5dc839eef ("lockdep: Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING.")

simply because no one had previously tested with
CONFIG_PROVE_RAW_LOCK_NESTING=y.
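
For readers who have not met this check before, here is a rough userspace
sketch of the nesting rule as I understand it. It is a toy model, not the
kernel's lockdep code: the names (wait_type, model_lock, model_acquire,
prove_raw) are made up for illustration, and the only assumption carried
over from the thread is that rtc_lock is a spinlock_t taken while a raw
spinlock (the "freeze lock") is held.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified wait types, ordered from most to least restrictive. */
enum wait_type {
    WAIT_SPIN = 1,   /* raw_spinlock_t: never sleeps, even on PREEMPT_RT */
    WAIT_CONFIG = 2, /* spinlock_t: turns into a sleeping lock on PREEMPT_RT */
    WAIT_SLEEP = 3,  /* mutex-like locks: may always sleep */
    WAIT_MAX = 4     /* nothing held: any lock may be taken */
};

/* Hypothetical stand-in for a lock class; not a kernel structure. */
struct model_lock {
    const char *name;
    enum wait_type type;
};

#define MAX_HELD 8

struct held_locks {
    const struct model_lock *stack[MAX_HELD];
    size_t depth;
};

/*
 * Without the raw-nesting check, spinlock_t is treated like a raw
 * spinlock, so the two kinds of lock cannot be told apart.
 */
static enum wait_type effective_type(const struct model_lock *l, bool prove_raw)
{
    if (l->type == WAIT_CONFIG && !prove_raw)
        return WAIT_SPIN;
    return l->type;
}

/* Most restrictive wait type among the locks currently held. */
static enum wait_type current_context(const struct held_locks *h, bool prove_raw)
{
    enum wait_type ctx = WAIT_MAX;

    for (size_t i = 0; i < h->depth; i++) {
        enum wait_type t = effective_type(h->stack[i], prove_raw);
        if (t < ctx)
            ctx = t;
    }
    return ctx;
}

/*
 * Returns true and records the lock if the acquisition is allowed.
 * Returns false for an "invalid wait context" (the new lock is less
 * restrictive than something already held) or if the model's stack is full.
 */
static bool model_acquire(struct held_locks *h, const struct model_lock *l,
                          bool prove_raw)
{
    if (h->depth == MAX_HELD)
        return false;
    if (effective_type(l, prove_raw) > current_context(h, prove_raw))
        return false;
    h->stack[h->depth++] = l;
    return true;
}
```

In this model, taking a WAIT_CONFIG lock while a WAIT_SPIN lock is held is
rejected only when prove_raw is true, and the reverse order (raw spinlock
inside spinlock_t) is always accepted. That would be consistent with the
splat showing up only once the option is enabled.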

See https://lore.kernel.org/all/CAP-bSRZ0CWyZZsMtx046YV8L28LhY0fson2g4EqcwRAVN1Jk+Q@xxxxxxxxxxxxxx/ :

> This splat happens on suspend/resume on a HP laptop. It doesn't appear
> to be a recent regression, as a bisect only leads to 560af5dc839e
> ("lockdep: Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING.") -

Ccing Chris Bainbridge <chris.bainbridge@xxxxxxxxx>, author of the previous bug report.

Greetings,

Mateusz