Re: VM deadlock

From: Chris Mason (mason@suse.com)
Date: Thu Jun 28 2001 - 07:53:42 EST


On Thursday, June 28, 2001 01:21:28 PM +1000 Andrew Morton
<andrewm@uow.edu.au> wrote:

> Chris Mason wrote:
>>
>> ...
>> The work around I've been using is the dirty_inode method. Whenever
>> mark_inode_dirty is called, reiserfs logs the dirty inode. This means
>> inode changes are _always_ reflected in the buffer cache right away, and
>> the inode itself is never actually dirty.
>
> reiserfs_mark_inode_dirty() has taken a copy of the in-core inode, so
> it can do this:
>
> spin_lock(&inode_lock);
> if ((inode->i_state & I_LOCK) == 0)
> inode->i_state &= ~(I_DIRTY_SYNC|I_DIRTY_DATASYNC);
> spin_unlock(&inode_lock);
>
> Unfortunately there is no API function to do this, so inode_lock
> needs to be exported :(

Well, this is kind of my own fault. I didn't want the dirty_inode call
back to be able to screw with the internals of how inode.c dealt with
things, I wanted it purely to allow actions in addition to what inode.c
wanted to do.

So, mark_inode_dirty calls dirty_inode, and then it sets whatever dirty
bits it wants to. Clearing them in your own dirty_inode call won't matter,
they should just get set again later.

If we really want to leave the inode clean, fsync isn't as much of a
concern as O_SYNC writes, since you want generic_osync_inode to properly
flush the updated inode. But, that can be dealt with by having your
commit_write func test for O_SYNC.

What we can't get around is our friend knfsd, who uses write_inode_now.
The I_DIRTY bit needs to be accurate there (although it doesn't seem
perfect right now anyway).

The real problem I see is that we've overload the sync flag to write_inode.
It means flush now to get the data safe, and flush now to free ram.
Normally this kind of overloading is ok, but once logging comes into play I
believe a distinction is needed.

So, my current plan to fix reiserfs_write_inode is to do nothing when
current->flags & PF_MEMALLOC == 1. I'm not wild about it, but don't see
many other fixes that don't involve api changes.

I'd rather not do a private inode list until there is a clean way to apply
memory pressure to it, since reiserfs pins enough memory as it is.

-chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Jun 30 2001 - 21:00:18 EST