Re: lockdep: holding locks across syscall boundaries

From: Jens Axboe
Date: Fri Oct 27 2023 - 12:06:41 EST

Next message: Mark Brown: "Re: [PATCH 4/4] spi: spi-fsl-qspi: Add support for rx data sample point adjustment"
Previous message: Guenter Roeck: "Re: [PATCH v6 1/2] regulator (max5970): Add hwmon support"
In reply to: Peter Zijlstra: "Re: lockdep: holding locks across syscall boundaries"
Next in thread: Peter Zijlstra: "Re: lockdep: holding locks across syscall boundaries"

On 10/27/23 9:59 AM, Peter Zijlstra wrote:
> On Fri, Oct 27, 2023 at 09:14:53AM -0600, Jens Axboe wrote:
>> Hi,
>>
>> Normally we'd expect locking state to be clean and consistent across
>> syscall entry and exit, as that is always the case for sync syscalls.
>
>> We currently have a work-around for holding a lock from aio, see
>> kiocb_start_write(), which pretends to drop the lock from lockdep's
>> perspective, as it's held from submission until kiocb_end_write() is
>> called at completion time.
>
> I was not aware of this, the only such hack I knew about was the
> filesystem freezer thing.
>
> The problem with holding locks past the end of a syscall is that you'll
> nest whatever random lock hierarchies are possible in every other syscall
> under that lock.

Can you expand on that bit? I'm not quite sure I follow. Do we reset the
locking dependencies between syscalls?

>> This is a bit of an ugly work-around, and defeats the purpose of
>> lockdep.
>>
>> Since I've now got another case where I want to hold a resource across
>> syscalls, is there a better way to do this?
>>
>> This is for inode_dio_start(), which increments an inode int count, and
>> inode_dio_end() which decrements it. If a task is doing
>> inode_dio_start() and then inode_dio_wait(), I want to trigger this. I
>> have a hack that does this, but it disables lockdep_sys_exit() as
>> otherwise I just get that warning rather than the more useful one.
>
> Suppose syscall-a returns with your kiocb thing held, call it lock A
> Suppose syscall-b returns with your inode thing held, call it lock B
>
> Then userspace does:
>
> syscall-a
> syscall-b
>
> while it also does:
>
> syscall-b
> syscall-a
>
> and we're up a creek, no?

Should this not get caught by the usual lock ordering rules? Because
that is obviously a bug, ordering would have to be consistent, just like
if we have:

syscall-a
	lock(a);
	lock(b);

syscall-b
	lock(b);
	lock(a);
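
As a rough illustration of that ordering rule, here is a minimal userspace
sketch of a dependency check. It is not lockdep's implementation, and it
assumes a single simulated task. The names track_acquire(),
track_release_all() and the order[][] matrix are made up for this example.
It records "B was taken while A was held" and flags any later acquisition
that would close a cycle, whether or not the two locks were taken inside
the same syscall.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKS 8

/* order[x][y] is true once y has been acquired while x was held. */
static bool order[MAX_LOCKS][MAX_LOCKS];

/* Locks currently held by the single simulated task (hypothetical model). */
static int held[MAX_LOCKS];
static int nheld;

/* Is 'to' reachable from 'from' by following recorded dependencies? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++)
		if (order[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record the acquisition of 'lock'. Returns false if taking it now would
 * invert an ordering seen earlier, i.e. close a cycle in the graph.
 */
static bool track_acquire(int lock)
{
	bool ok = true;

	assert(lock >= 0 && lock < MAX_LOCKS);
	assert(nheld < MAX_LOCKS);

	for (int i = 0; i < nheld; i++) {
		bool seen[MAX_LOCKS] = { false };

		/* New edge held[i] -> lock; is lock -> ... -> held[i] known? */
		if (reachable(lock, held[i], seen))
			ok = false;
		else
			order[held[i]][lock] = true;
	}
	held[nheld++] = lock;
	return ok;
}

/* Drop everything the task holds; recorded dependencies are kept. */
static void track_release_all(void)
{
	nheld = 0;
}
```

Taking lock 0 then lock 1, releasing both, and then taking lock 1 followed
by lock 0 makes the last track_acquire() call return false. The held list
is not cleared at any syscall boundary in this model, so a lock carried
from one call into the next still contributes ordering edges.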

--
Jens Axboe
