Re: [PATCH] mm: drop mmap_sem before calling balance_dirty_pages() in write fault
From: Johannes Weiner
Date: Tue Sep 24 2019 - 17:43:42 EST
On Tue, Sep 24, 2019 at 01:46:08PM -0700, Matthew Wilcox wrote:
> On Tue, Sep 24, 2019 at 03:42:38PM -0400, Johannes Weiner wrote:
> > > I'm not a fan of moving file_update_time() to _before_ the
> > > balance_dirty_pages call.
> >
> > Can you elaborate why? If the filesystem has a page_mkwrite op, it
> > will have already called file_update_time() before this function is
> > entered. If anything, this change makes the sequence more consistent.
>
> Oh, that makes sense. I thought it should be updated after all the data
> was written, but it probably doesn't make much difference.
>
> > > Also, this is now the third place that needs
> > > maybe_unlock_mmap_for_io, see
> > > https://lore.kernel.org/linux-mm/20190917120852.x6x3aypwvh573kfa@box/
> >
> > Good idea, I moved the helper to internal.h and converted to it.
> >
> > I left the shmem site alone, though. It doesn't require the file
> > pinning, so it shouldn't pointlessly bump the file refcount and
> > suggest such a dependency - that could cost somebody later quite a bit
> > of time trying to understand the code.
>
> The problem for shmem is this:
>
> spin_unlock(&inode->i_lock);
> schedule();
>
> spin_lock(&inode->i_lock);
> finish_wait(shmem_falloc_waitq, &shmem_fault_wait);
> spin_unlock(&inode->i_lock);
>
> While scheduled, the VMA can go away and the inode be reclaimed, making
> this a use-after-free. The initial suggestion was an increment on
> the inode refcount, but since we already have a pattern which involves
> pinning the file, I thought that was a better way to go.
I completely read over the context of that email you linked - that
there is a bug in the existing code - and looked at it as mere
refactoring patch. My apologies.
Switching that shmem site to maybe_unlock_mmap_for_io() to indirectly
pin the inode (in a separate bug fix patch) indeed makes sense to me.
> > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
>
> Reviewed-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
Thanks, Matthew.