Re: slow performance on disk/network i/o full speed after drop_caches

From: Wu Fengguang
Date: Thu Aug 25 2011 - 23:26:56 EST


On Fri, Aug 26, 2011 at 11:13:07AM +0800, Stefan Priebe wrote:
>
> >> There is at least a numastat proc file.
> >
> > Thanks. This shows that node0 is accessed 10x more than node1.
>
> What can I do to prevent this, or isn't this normal when a machine mostly idles, so processes are mostly run on cpu0?

Yes, that's normal. However it should explain why it's slow even when
there are lots of free pages _globally_.
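
To see the imbalance directly, the per-node free memory can be read next
to the numastat counters. A minimal sketch, assuming the usual
"Node N MemFree: X kB" line format of the per-node meminfo files; the
function name node_free_kb is made up for illustration:

```shell
# Print free memory per NUMA node from per-node meminfo lines on stdin.
# Expected input lines look like: "Node 0 MemFree:        1234567 kB"
node_free_kb() {
    awk '$3 == "MemFree:" { printf "node%s %s kB\n", $2, $4 }'
}

# Usage on a live system (the path exists only on NUMA-enabled kernels):
#   cat /sys/devices/system/node/node*/meminfo | node_free_kb
```

If node0 shows very little free memory while node1 has plenty, that would
match the numastat numbers above.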

> >
> >> complete ps output:
> >> http://pastebin.com/raw.php?i=b948svzN
> >
> > In that log, scp happens to be in R state and there are no other
> > tasks in D state. Would you retry in the hope of catching some stuck state?
> Sadly not, as the sysrq trigger has rebooted the machine and it will now run fine for 1 or 2 days.

Oops, sorry! It might be possible to reproduce the issue by manually
eating all of the memory with sparse file data:

truncate -s 1T 1T
cp 1T /dev/null
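
The same two commands can be wrapped so that the sparse file is removed
afterwards. This is only a sketch: the function name eat_pagecache and
its arguments are made up, and it assumes a truncate that accepts size
suffixes and a cp that really reads the file through the page cache. If
cp returns almost instantly it is probably skipping the holes, and the
cache will not fill.

```shell
# Fill the page cache by reading a sparse file, then remove the file.
# $1: file size (default 1T, as in the commands above)
# $2: directory on a filesystem that supports sparse files (default .)
eat_pagecache() {
    size=${1:-1T}
    dir=${2:-.}
    file="$dir/sparse.$$"

    # Create the sparse file; it takes almost no disk space.
    truncate -s "$size" "$file" || return 1

    # Read it back; the zero-filled pages are what land in the page cache.
    cp "$file" /dev/null
    status=$?

    rm -f "$file"
    return "$status"
}
```

Run it as "eat_pagecache 1T /some/dir" and repeat the scp test once it
finishes.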

> >
> >>> echo t > /proc/sysrq-trigger
> >> Sadly I was only able to grab the output in this crazy format:
> >> http://pastebin.com/raw.php?i=MBXvvyH1
> >
> > It's pretty readable dmesg, except that the data is incomplete and
> > there is nothing valuable in the uploaded portion.
> That was everything i could grab through netconsole. Is there a better way?

netconsole is enough. The partial output should be due to the reboot...
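
When the slowdown shows up again, a lighter first check than sysrq-t is
to filter ps output for tasks in D state. A rough sketch; the function
name d_state_tasks is made up, and it assumes the state is the first
column of each line:

```shell
# Keep only tasks in uninterruptible sleep (state starts with "D").
# Expects "STATE PID COMMAND" lines on stdin.
d_state_tasks() {
    awk '$1 ~ /^D/ { print }'
}

# Usage on a live system:
#   ps -eo state,pid,comm | d_state_tasks
```

If scp or a flusher thread shows up there repeatedly, the sysrq-t stack
traces for those pids are the interesting part.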

Thanks,
Fengguang
