Re: kupdate weirdness
From: Andrew Morton
Date: Thu Aug 02 2007 - 15:18:56 EST
On Thu, 02 Aug 2007 17:52:39 +0200
Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> > > The following strange behavior can be observed:
> > >
> > > 1. large file is written
> > > 2. after 30 seconds, nr_dirty goes down by 1024
> > > 3. then for some time (< 30 sec) nothing happens (disk idle)
> > > 4. then nr_dirty again goes down by 1024
> > > 5. repeat from 3. until whole file is written
> > >
> > > So basically a 4Mbyte chunk of the file is written every 30 seconds.
> > > I'm quite sure this is not the intended behavior.
> > >
> > > The reason seems to be that __sync_single_inode() will move the
> > > partially written inode from s_io onto s_dirty, and sync_sb_inode()
> > > will not splice it back onto s_io until the rest of the inodes on s_io
> > > has been processed.
> >
> > It does all sorts of weird crap.
> >
> > > Since there will probably be a recently dirtied inode on s_io, this
> > > will take some of time, but always less than 30 sec.
> > >
> > > I don't know what's the easiest solution.
> > >
> > > Any ideas?
> >
> > Try 2.6.23-rc1-mm2.
>
> Much better, but still not perfect.
I've kinda lost track of the status of all these patches. I _think_ Ken
has identified a remaining problem even after his
writeback-fix-periodic-superblock-dirty-inode-flushing.patch, but maybe I
misremember.
Ken, can you remind us of the status there, please?
> Now it writes out 1024 pages after 30 seconds and then the rest after
> another 30s.
Bah.
> If my analysis is correct, this is because when it first gets onto
> s_io other inodes will get there too (with up-to 30s later dirying
> time), and the contents of s_more_io won't be recycled until the
> current contents of s_io are processed.
>
> Maybe this is OK, the previous weird stuff didn't seem to bother a lot
> of people either.
There were heaps of problems in there and it is surprising how few people
were hitting them. Ordered-mode journalling filesystems will fix it all up
behind the scenes, of course.
I just have a bad feeling about that code - list_heads are the wrong data
structure and it all needs to be ripped and redone using some indexable
data structure. There has been desultory discussion, but nothing's
happening and nothing will happen in the medium term, so we need to keep
on whapping bandainds on it.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/