Re: slow performance on disk/network i/o full speed after drop_caches

From: Wu Fengguang
Date: Wed Aug 24 2011 - 05:32:15 EST


On Wed, Aug 24, 2011 at 02:20:07PM +0800, Pekka Enberg wrote:
> On Wed, Aug 24, 2011 at 9:06 AM, Stefan Priebe - Profihost AG
> <s.priebe@xxxxxxxxxxxx> wrote:
> > I hope this is the correct list to write to; if not, it would be nice to
> > give me a hint where I can ask.
> >
> > Kernel: 2.6.38
> >
> > I'm seeing some strange problems on some of our servers after upgrading to
> > 2.6.38.
> >
> > I'm copying a 1GB file via scp from Machine A to Machine B. When B is
> > freshly booted the file transfer is done with about 80 to 85 Mb/s. I can
> > repeat that various times without performance decrease.
> >
> > Then after some days copying is only done with about 900kb/s up to 3Mb/s,
> > going up and down while transferring the file.
> >
> > When I then do drop_caches it works again at 80Mb/s.
> >
> > sync && echo 3 >/proc/sys/vm/drop_caches && sleep 2 && echo 0 >/proc/sys/vm/drop_caches
> >
> > Attached is also an output of meminfo before and after drop_caches.
> >
> > What's going on here? MemFree is pretty high.
> >
> > Please CC me, I'm not on the list.
>
> Interesting. I can imagine one or more of the following to be
> involved: networking, vmscan, block, and writeback. Let's CC all of
> them!
>
> > # before drop_caches
> >
> > # cat /proc/meminfo
> > MemTotal:        8185544 kB
> > MemFree:         6670292 kB
> > Buffers:          105164 kB
> > Cached:           166672 kB
> > SwapCached:            0 kB
> > Active:           728308 kB
> > Inactive:         567428 kB
> > Active(anon):     639204 kB
> > Inactive(anon):   394932 kB
> > Active(file):      89104 kB
> > Inactive(file):   172496 kB
> > Unevictable:        2976 kB
> > Mlocked:            2992 kB
> > SwapTotal:       1464316 kB
> > SwapFree:        1464316 kB
> > Dirty:                52 kB
> > Writeback:             0 kB

Since the Dirty and Writeback counts are low (52 kB and 0 kB), scp does
not seem to be throttled by balance_dirty_pages().
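
To re-check those two counters while a slow transfer is running, something
like the following might do. It is only a sketch: dirty_writeback_kb is a
made-up helper name, and it just filters meminfo-style text on stdin.

```shell
# Print the Dirty and Writeback counters (in kB) from meminfo-style input.
# dirty_writeback_kb is a hypothetical helper name, not an existing tool.
dirty_writeback_kb() {
    # Match the field name exactly so WritebackTmp etc. are not included.
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2 }'
}

# Usage on a live system:
#   dirty_writeback_kb < /proc/meminfo
```

If both values stay small during the slow phase, dirty throttling remains
an unlikely cause.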

Stefan, would you please run this several times on the server?

ps -eo user,pid,tid,class,rtprio,ni,pri,psr,pcpu,vsz,rss,pmem,stat,wchan:28,cmd | grep scp

It will show where the scp task is blocked (the wchan field). Hope it helps.
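
If running it by hand is tedious, the sampling could be scripted roughly
as below. This is a sketch under assumptions: tally_wchan is a made-up
helper name, the sample count and interval are arbitrary, and it uses a
reduced column set (wchan and cmd only) selected with the procps options
-C and --no-headers.

```shell
# Count how often each wait channel appears in "wchan command" lines on stdin
# and print the most frequent first.
# tally_wchan is a hypothetical helper name, not an existing tool.
tally_wchan() {
    awk '{ count[$1]++ } END { for (w in count) print count[w], w }' | sort -rn
}

# Take 10 samples, one per second, of every process named scp
# (sample count and interval are assumptions), then summarise:
#   for i in $(seq 10); do
#       ps -C scp -o wchan:28,cmd --no-headers
#       sleep 1
#   done | tally_wchan
```

A wait channel that dominates the tally is a good first place to look.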

Thanks,
Fengguang