Re: writeout stalls in current -git

From: David Chinner
Date: Mon Nov 05 2007 - 23:26:09 EST


On Mon, Nov 05, 2007 at 07:27:16PM +0100, Torsten Kaiser wrote:
> On 11/5/07, David Chinner <dgc@xxxxxxx> wrote:
> > Ok, so it's probably a side effect of the writeback changes.
> >
> > Attached are two patches (two because one was in a separate patchset as
> > a standalone change) that should prevent async writeback from blocking
> > on locked inode cluster buffers. Apply the xfs-factor-inotobp patch first.
> > Can you see if this fixes the problem?
>
> Now testing v2.6.24-rc1-650-gb55d1b1+ the fix for the missapplied raid5-patch
> Applying your two patches ontop of that does not fix the stalls.

So you are having RAID5 problems as well?

I'm struggling to understand what possible changed in XFS or writeback that
would lead to stalls like this, esp. as you appear to be removing files when
the stalls occur. Rather than vmstat, can you use something like iostat to
show how busy your disks are? i.e. are we seeing RMW cycles in the raid5 or
some such issue.

OOC, what is the 'xfs_info <mtpt>' output for your filesystem?

> vmstat 10 output from unmerging (uninstalling) a kernel:
> 1 0 0 3512188 332 192644 0 0 185 12 368 735 10 3 85 1
> -> emerge starts to remove the kernel source files
> 3 0 0 3506624 332 192836 0 0 15 9825 2458 8307 7 12 81 0
> 0 0 0 3507212 332 192836 0 0 0 554 630 1233 0 1 99 0
> 0 0 0 3507292 332 192836 0 0 0 537 580 1328 0 1 99 0
> 0 0 0 3507168 332 192836 0 0 0 633 626 1380 0 1 99 0
> 0 0 0 3507116 332 192836 0 0 0 1510 768 2030 1 2 97 0
> 0 0 0 3507596 332 192836 0 0 0 524 540 1544 0 0 99 0
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 0 0 3507540 332 192836 0 0 0 489 551 1293 0 0 99 0
> 0 0 0 3507528 332 192836 0 0 0 527 510 1432 1 1 99 0
> 0 0 0 3508052 332 192840 0 0 0 2088 910 2964 2 3 95 0
> 0 0 0 3507888 332 192840 0 0 0 442 565 1383 1 1 99 0
> 0 0 0 3508704 332 192840 0 0 0 497 529 1479 0 0 99 0
> 0 0 0 3508704 332 192840 0 0 0 594 595 1458 0 0 99 0
> 0 0 0 3511492 332 192840 0 0 0 2381 1028 2941 2 3 95 0
> 0 0 0 3510684 332 192840 0 0 0 699 600 1390 0 0 99 0
> 0 0 0 3511636 332 192840 0 0 0 741 661 1641 0 0 100 0
> 0 0 0 3524020 332 192840 0 0 0 2452 1080 3910 2 3 95 0
> 0 0 0 3524040 332 192844 0 0 0 530 617 1297 0 0 99 0
> 0 0 0 3524128 332 192844 0 0 0 812 674 1667 0 1 99 0
> 0 0 0 3527000 332 193672 0 0 339 721 754 1681 3 2 93 1
> -> emerge is finished, no dirty or writeback data in /proc/meminfo

At this point, can you run a "sync" and see how long that takes to
complete? The only thing I can think that woul dbe written out after
this point is inodes, but even then it seems to go on for a long,
long time and it really doesn't seem like XFS is holding up the
inode writes.

Another option is to use blktrace/blkparse to determine which process is
issuing this I/O.

> 0 0 0 3583780 332 195060 0 0 0 494 555 1080 0 1 99 0
> 0 0 0 3584352 332 195060 0 0 0 99 347 559 0 0 99 0
> 0 0 0 3585232 332 195060 0 0 0 11 301 621 0 0 99 0
> -> disks go idle.
>
> So these patches do not seem to be the source of these excessive disk writes...

Well, the patches I posted should prevent blocking in the places that it
was seen, so if that does not stop the slowdowns then either the writeback
code is not feeding us inodes fast enough or the block device below is
having some kind of problem....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/