On Wed, Aug 14, 2024 at 01:20:21AM +0300, Cristian Ciocaltea wrote:
Commit 3d59c22bbb8d ("drm/rockchip: vop2: Convert to use maple tree
register cache") enabled the use of maple tree register cache in
Rockchip VOP2 driver. However, building the kernel with lockdep support
indicates locking rules violation when trying to unload the rockchipdrm
module:
[ 48.360258] ========================================================
[ 48.360829] WARNING: possible irq lock inversion dependency detected
[ 48.361400] 6.11.0-rc1 #40 Not tainted
[ 48.361743] --------------------------------------------------------
[ 48.362311] modprobe/685 just changed the state of lock:
[ 48.362790] ffff0000087fa798 (&mt->ma_lock){+...}-{2:2}, at: regcache_maple_exit+0x6c/0xe0
Please think hard before including complete backtraces in upstream
reports, they are very large and contain almost no useful information
relative to their size so often obscure the relevant content in your
message. If part of the backtrace is usefully illustrative (it often is
for search engines if nothing else) then it's usually better to pull out
the relevant sections.
The problem is that the regmap lock could be taken by an IRQ context,
interrupting the irq-unsafe maple tree lock, which may result in a lock
inversion deadlock scenario.
Switch to use irq-safe locking in the maple tree register cache.
I'd have a bigger question here which is why the driver is using a
dynamically allocated register cache in a hardirq context, especially
with no defaults provided? Anything except the flat cache might do
allocations at runtime which might include in interrupt context unless
the caller is very careful and since the lockdep warning triggered it's
clear that this driver isn't. The core will be doing atomic allocations
for MMIO but that's not something we want to be doing as a matter of
course... I would generally expect drivers to try to ensure that any
registers are cached outside of the interrupt handler, usually by
specifying defaults or touching all registers during setup.
Without having done a full analysis it also looks like the marking of
volatile registers isn't right, it's not immediately clear that the
interrupt status and clear registers are volatile and they ought to be.
None of the registers accessed in interrupt context look like they
should be cached at all unless there's something triggered via the DRM
vblank calls.
It might be safer to fall back to the rbtree cache for this device since
rbtree doesn't force an extra level of locking on us, though like I say
I'm not convinced that what the driver is doing with caching is a superThe registers of VOP are quite special: Each write operation to the register
good idea. Though probably what the driver is doing should work.