Re: [syzbot] [kernfs?] possible deadlock in kernfs_fop_llseek

From: Amir Goldstein
Date: Fri Apr 05 2024 - 06:34:38 EST


On Fri, Apr 5, 2024 at 1:01 AM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Apr 04, 2024 at 12:33:40PM +0300, Amir Goldstein wrote:
>
> > This specifically cannot happen because sysfs is not allowed as an
> > upper layer, only as a lower layer, so overlayfs itself will not be
> > writing to /sys/power/resume.
>
> Then how could you possibly get a deadlock there? What would your minimal
> deadlocked set look like?
>
> 1. Something is blocked in lookup_bdev() called from resume_store(), called
> from sysfs_kf_write(), called from kernfs_write_iter(), which has acquired
> ->mutex of struct kernfs_open_file that had been allocated by
> kernfs_fop_open() back when the file had been opened. Note that each
> struct file instance gets a separate struct kernfs_open_file. Since we are
> calling ->write_iter(), the file *MUST* have been opened for write.
>
> 2. Something is blocked in kernfs_fop_llseek() on the same of->mutex,
> i.e. using the same struct file as (1). That something is holding an
> overlayfs inode lock, which is what the next thread is blocked on.
>
> + at least one more thread, to complete the cycle.
>
> Right? How could that possibly happen without overlayfs opening /sys/power/resume
> for write? Again, each struct file instance gets a separate of->mutex;
> for a deadlock you need a cycle of threads and a cycle of locks, such
> that each thread is holding the corresponding lock and is blocked on
> attempt to get the lock that comes next in the cyclic order.

Absolutely right.
I had it in my mind that this was a node lock. Did not look closely.
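
To spell out the cycle argument for anyone following along, here is a
rough userspace model, not kernel code. Everything in it (the struct,
the lock ids, the helper names) is made up for illustration, and it
assumes each thread holds exactly one lock and waits on at most one.
A thread is in a deadlock only if following "blocked on lock L, which
is held by thread T" leads back to where it started.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_LOCK (-1)

/* Hypothetical model: one held lock and one awaited lock per thread. */
struct thread_sketch {
	int holds;     /* id of the lock this thread holds, or NO_LOCK */
	int waits_for; /* id of the lock it is blocked on, or NO_LOCK */
};

/* Index of the thread holding 'lock', or -1 if nobody holds it. */
static int holder_of(const struct thread_sketch *t, size_t n, int lock)
{
	for (size_t i = 0; i < n; i++)
		if (t[i].holds == lock)
			return (int)i;
	return -1;
}

/* True if following waits-for edges from 'start' comes back to 'start'. */
static bool in_deadlock_cycle(const struct thread_sketch *t, size_t n,
			      size_t start)
{
	size_t cur = start;

	assert(start < n);
	for (size_t steps = 0; steps < n; steps++) {
		int next;

		if (t[cur].waits_for == NO_LOCK)
			return false; /* not blocked: chain ends */
		next = holder_of(t, n, t[cur].waits_for);
		if (next < 0)
			return false; /* lock is free: chain ends */
		if ((size_t)next == start)
			return true;  /* closed the cycle */
		cur = (size_t)next;
	}
	return false; /* looped elsewhere without reaching 'start' */
}
```

With lock 0 as the writer's of->mutex, lock 1 as the overlayfs inode
lock and lock 2 as whatever the writer blocks on, the threads
{0,2}, {1,0}, {2,1} form a cycle. If the llseek thread instead waits
on lock 3 (the of->mutex of a different struct file, held by nobody),
i.e. {0,2}, {1,3}, {2,1}, the chain ends and there is no cycle.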

>
> If overlayfs never writes to that sucker, it can't participate in that
> cycle. Sure, you can get overlayfs llseek grabbing of->mutex of *ANOTHER*
> struct file opened for the same sysfs file. Since it's not the same
> struct file and since each struct file there gets a separate kernfs_open_file
> instance, the mutex won't be the same.
>
> Unless I'm missing something else, that can't deadlock. For a quick and
> dirty experiment, try to give of->mutex on r/o opens a class separate from
> that on r/w and w/o opens (mutex_init() in kernfs_fop_open()) and see
> if lockdep warnings persist.
>
> Something like
>
> 	if (has_mmap)
> 		mutex_init(&of->mutex);
> 	else if (file->f_mode & FMODE_WRITE)
> 		mutex_init(&of->mutex);
> 	else
> 		mutex_init(&of->mutex);

Why a quick experiment?
Why not a permanent kludge?

It is no better or worse than the already existing has_mmap
subclass annotation, is it?
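
For readers wondering why three identical-looking mutex_init() calls
make any difference: the kernel's mutex_init() is a macro that declares
a static lock_class_key at each call site, so each branch ends up in
its own lockdep class. Below is a small userspace sketch of that
mechanism only. The *_sketch names are invented for illustration and
are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins, assumed for illustration only. */
struct lock_class_key_sketch {
	char unused;
};

struct mutex_sketch {
	const struct lock_class_key_sketch *key; /* identifies the class */
};

struct open_file_sketch {
	struct mutex_sketch mutex;
};

/*
 * Every expansion of this macro defines its own static key object,
 * so each call site gets a distinct key address (a distinct "class").
 */
#define mutex_init_sketch(m)					\
	do {							\
		static struct lock_class_key_sketch key_;	\
		(m)->key = &key_;				\
	} while (0)

/* Same shape as the snippet quoted above: one call site per open mode. */
static void open_sketch(struct open_file_sketch *of, bool has_mmap,
			bool writable)
{
	assert(of != NULL);
	if (has_mmap)
		mutex_init_sketch(&of->mutex);
	else if (writable)
		mutex_init_sketch(&of->mutex);
	else
		mutex_init_sketch(&of->mutex);
}
```

Two read-only opens end up with the same key, while a read-only open
and a writable open get different keys, which is the separation the
experiment relies on.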

Thanks,
Amir.