Re: KASAN: use-after-free Read in path_lookupat

From: Jan Kara
Date: Thu Mar 28 2019 - 05:00:51 EST


On Wed 27-03-19 18:59:48, Al Viro wrote:
> On Wed, Mar 27, 2019 at 05:58:31PM +0100, Jan Kara wrote:
> > On Tue 26-03-19 04:15:10, Al Viro wrote:
> > > On Mon, Mar 25, 2019 at 08:18:25PM -0700, Mark Fasheh wrote:
> > >
> > > > Hey Al,
> > > >
> > > > It's been a while since I've looked at that bit of code but it looks like
> > > > Ocfs2 is syncing the inode to disk and disposing of its memory
> > > > representation (which would include the cluster locks held) so that other
> > > > nodes get a chance to delete the potentially orphaned inode. In Ocfs2 we
> > > > won't delete an inode if it exists in another node's cache.
> > >
> > > Wait a sec - what's the reason for forcing that write_inode_now(); why
> > > doesn't the normal mechanism work? I'm afraid I still don't get it -
> > > we do wait for writeback in evict_inode(), or the local filesystems
> > > wouldn't work.
> >
> > I'm just guessing here but they don't want an inode cached once its last
> > dentry goes away (it makes cluster wide synchronization easier for them and
> > they do play tricks with cluster lock on dentries).
>
> Sure, but that's as simple as "return 1 from ->drop_inode()".

Right.

> > There is some info in
> > 513e2dae9422 "ocfs2: flush inode data to disk and free inode when i_count
> > becomes zero" which adds this ocfs2_drop_inode() implementation. So when
> > the last inode reference is dropped, they want to flush any dirty data to
> > disk and evict the inode. But AFAICT they should be fine with flushing the
> > inode from their ->evict_inode method. I_FREEING just stops the flusher
> > thread from touching the inode but explicit writeback through
> > write_inode_now(inode, 1) should go through just fine.
>
> Umm... Why is that write_inode_now() needed in either place? I agree that
> moving it to ->evict_inode() ought to be safe, but what makes it necessary
> in the first place? Put it another way, what dirties the data and/or
> metadata without marking it dirty?

Well, the inode & pages are marked dirty and they are dirty when we get to
iput_final(). But if ->drop_inode() returns 1 (which normally happens only
for unlinked files), we will not write out the inode in iput_final() and
the dirty data just gets discarded in ->evict_inode(). OCFS2 doesn't want
this, so they have to write the data out by hand.
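
To make the control flow concrete, here is a rough userspace model of the
decision described above. It is only a sketch, not the kernel code: every
toy_* name is hypothetical, locking and inode state flags are left out, and
"eviction discards dirty data" is modelled as a plain flag.
toy_drop_after_flush() mimics the "write out by hand, then return 1"
pattern as described in this thread, not the actual ocfs2 implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct inode; all names are hypothetical. */
struct toy_inode {
	bool dirty;     /* has data that was never written back */
	bool written;   /* dirty data reached "disk" */
	bool discarded; /* dirty data was thrown away at eviction */
	bool cached;    /* inode was kept in the cache instead of evicted */
};

typedef int (*toy_drop_fn)(struct toy_inode *inode);

/* Models synchronous writeback of a dirty inode. */
static void toy_write_inode_now(struct toy_inode *inode)
{
	if (inode->dirty) {
		inode->written = true;
		inode->dirty = false;
	}
}

/* Models eviction: whatever is still dirty at this point is lost. */
static void toy_evict(struct toy_inode *inode)
{
	if (inode->dirty) {
		inode->discarded = true;
		inode->dirty = false;
	}
}

/* Models the final-iput decision: cache, write then evict, or just evict. */
static void toy_iput_final(struct toy_inode *inode, toy_drop_fn drop_inode,
			   bool sb_active)
{
	int drop;

	assert(inode && drop_inode);
	drop = drop_inode(inode);

	if (!drop && sb_active) {
		inode->cached = true;   /* keep it around, nothing is lost */
		return;
	}
	if (!drop)
		toy_write_inode_now(inode); /* only the !drop path writes */

	toy_evict(inode);
}

/* Keep the inode cached while the filesystem is active. */
static int toy_drop_never(struct toy_inode *inode)
{
	(void)inode;
	return 0;
}

/* Plain "return 1": dirty data is discarded at eviction. */
static int toy_drop_always(struct toy_inode *inode)
{
	(void)inode;
	return 1;
}

/* Write out by hand before asking for the inode to be dropped. */
static int toy_drop_after_flush(struct toy_inode *inode)
{
	toy_write_inode_now(inode);
	return 1;
}
```

In this model a dirty inode passed through toy_drop_always() ends up
discarded, while the same inode passed through toy_drop_after_flush() ends
up written, which is the difference the explicit write-out is there for.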

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR