Re: [PATCH] O6int for interactivity

From: Wiktor Wodecki (wodecki@gmx.de)
Date: Sat Jul 19 2003 - 05:59:33 EST


On Fri, Jul 18, 2003 at 09:38:10PM +1000, Nick Piggin wrote:
>
>
> Wiktor Wodecki wrote:
>
> >On Fri, Jul 18, 2003 at 08:43:05PM +1000, Con Kolivas wrote:
> >
> >>On Fri, 18 Jul 2003 20:31, Wiktor Wodecki wrote:
> >>
> >>>On Fri, Jul 18, 2003 at 12:18:33PM +0200, Mike Galbraith wrote:
> >>>
> >>>>That _might_ (add salt) be priorities of kernel threads dropping too
> >>>>low.
> >>>>
> >>>>I'm also seeing occasional total stalls under heavy I/O in the order of
> >>>>10-12 seconds (even the disk stops). I have no idea if that's something
> >>>>in mm or the scheduler changes though, as I've yet to do any isolation
> >>>>and/or tinkering. All I know at this point is that I haven't seen it in
> >>>>stock yet.
> >>>>
> >>>I've seen this too while doing a huge nfs transfer from a 2.6 machine to
> >>>a 2.4 machine (sparc32). Thought it'd be something with the nfs changes
> >>>which were recently, might be the scheduler, tho. Ah, and it is fully
> >>>reproducable.
> >>>
> >>Well I didn't want to post this yet because I'm not sure if it's a good
> >>workaround yet but it looks like a reasonable compromise, and since you
> >>have a testcase it will be interesting to see if it addresses it. It's
> >>possible that a task is being requeued every millisecond, and this is a
> >>little smarter. It allows cpu hogs to run for 100ms before being round
> >>robinned, but shorter for interactive tasks. Can you try this O7 which
> >>applies on top of O6.1 please:
> >>
> >>available here:
> >>http://kernel.kolivas.org/2.5
> >>
> >
> >sorry, the problem still persists. Aborting the cp takes less time, tho
> >(about 10 seconds now, before it was about 30 secs). I'm aborting during
> >a big file, FYI.
> >
>
> OK if the IO actually stops then it shouldn't be an IO scheduler or
> request allocation problem, but could you try to capture a sysrq T
> trace for me during the freeze.

okay, here it is. I only paste the output for cp, if you need the whole
thing, tell me please.

Jul 19 12:54:16 kakerlak kernel: cp D C0140F7B 6164 6160
(NOTLB)
Jul 19 12:54:16 kakerlak kernel: c2c6fec8 00200082 d3de680c c0140f7b
d3de6800 c72ef000 c0477cc0 d3de681c
Jul 19 12:54:16 kakerlak kernel: d139ad60 c2c6e000 ce8dc3c0
ce8dc3dc 00000000 c01aba07 00000000 00000001
Jul 19 12:54:16 kakerlak kernel: 00000000 00000001 ce8153e4
00000000 c26706d0 c011cdb0 ce8dc3dc ce8dc3dc
Jul 19 12:54:16 kakerlak kernel: Call Trace:
Jul 19 12:54:16 kakerlak kernel: [free_block+203/256]
free_block+0xcb/0x100
Jul 19 12:54:16 kakerlak kernel: [nfs_wait_on_request+151/336]
nfs_wait_on_request+0x97/0x150
Jul 19 12:54:16 kakerlak kernel: [default_wake_function+0/48]
default_wake_function+0x0/0x30
Jul 19 12:54:16 kakerlak kernel: [nfs_wait_on_requests+169/256]
nfs_wait_on_requests+0xa9/0x100
Jul 19 12:54:16 kakerlak kernel: [nfs_sync_file+150/192]
nfs_sync_file+0x96/0xc0
Jul 19 12:54:16 kakerlak kernel: [nfs_file_flush+88/208]
nfs_file_flush+0x58/0xd0
Jul 19 12:54:16 kakerlak kernel: [filp_close+101/128]
filp_close+0x65/0x80
Jul 19 12:54:16 kakerlak kernel: [sys_close+97/160] sys_close+0x61/0xa0
Jul 19 12:54:16 kakerlak kernel: [syscall_call+7/11]
syscall_call+0x7/0xb
Jul 19 12:54:16 kakerlak kernel:

-- 
Regards,

Wiktor Wodecki


- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Jul 23 2003 - 22:00:37 EST