Re: [RFC][PATCH 0/3] VM throttling: avoid blocking occasional writers

From: Brice Figureau
Date: Fri Mar 02 2007 - 04:36:41 EST

Next message: Sean Young: "Re: [BUG 2.6.21-rc2] divide error: 0000"
Previous message: Pavel Machek: "Re: [PATCH 4/4] coredump: documentation for proc entry"
In reply to: Leroy van Logchem: "Re: [RFC][PATCH 0/3] VM throttling: avoid blocking occasional writers"
Next in thread: Leroy van Logchem: "Re: [RFC][PATCH 0/3] VM throttling: avoid blocking occasional writers"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi,

On Thu, 2007-03-01 at 12:47 +0000, Leroy van Logchem wrote:
> Tomoki Sekiyama <tomoki.sekiyama.qu <at> hitachi.com> writes:
> > thanks for your comments.
>
> The default dirty_ratio on most 2.6 kernels tends to be too large imo.
> If you are going to do sustained writes multiple times the size of
> the memory you have at least two problems.
>
> 1) The precious dentry and inode caches will be dropped, leaving you with
> a *very* unresponsive system.
> 2) The amount of dirty pages which need to be flushed to disk is huge,
> taking all VM while the i/o channel spends uninterruptible time flushing them.
>
> What we really need is a 'cfq' for all processes, especially misbehaving
> ones like dd if=/dev/zero of=/location/large bs=1M count=10000.
> If you want to DoS the 2.6 kernel, start a never-ending dd write and
> you will know what I mean. You get huge latencies because all name_to_inode
> caches are lost and have to be fetched from disk again, only to be quickly
> flushed again and again. I already explained this disaster scenario to
> Linus, Andrew and Jens, hoping for an auto-tuning solution which takes
> disk speed per partition into account.
>
> At the moment we cope with this feature by preserving important caches with
> sysctl vm.vfs_cache_pressure = 1, vm.dirty_ratio = 2 combined with
> vm.dirty_background_ratio = 1. Some benchmarks may get worse but you have
> a more resilient server.
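
For anyone who wants to compare their own box against the three values
quoted above, here is a minimal sketch that reads the tunables from
/proc/sys/vm and prints the ones that differ. The function names and the
"base" parameter are my own and purely illustrative; the values in the
table are simply the ones quoted above, not a recommendation.

```python
import os

# Values taken from the quoted message; that poster's choice, not a recommendation.
QUOTED_SETTINGS = {
    "vfs_cache_pressure": 1,
    "dirty_ratio": 2,
    "dirty_background_ratio": 1,
}


def read_vm_settings(names, base="/proc/sys/vm"):
    """Return {name: int or None} for each tunable found under base."""
    values = {}
    for name in names:
        try:
            with open(os.path.join(base, name)) as f:
                values[name] = int(f.read().strip())
        except (OSError, ValueError):
            values[name] = None  # missing or unreadable on this system
    return values


def diff_settings(current, wanted):
    """Return {name: (current, wanted)} for every tunable that differs."""
    return {
        name: (current.get(name), value)
        for name, value in wanted.items()
        if current.get(name) != value
    }


if __name__ == "__main__":
    current = read_vm_settings(QUOTED_SETTINGS)
    for name, (have, want) in sorted(diff_settings(current, QUOTED_SETTINGS).items()):
        print(f"vm.{name}: current={have} quoted={want}")
```

This only reads the values; changing them still goes through sysctl as
root.
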
>
> I hope the VM subsystem will cope with applications which do not advise
> what to do with the cached pages. For now we use posix_fadvise() with
> POSIX_FADV_DONTNEED as a patch to Samba 3 in order to at least be able to
> write larger-than-memory files without discarding the important slab caches.
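
To make the fadvise approach mentioned above concrete, here is a rough
sketch of a writer that periodically tells the kernel it no longer needs
the pages it has just written. This is not the Samba patch, which I have
not seen; the function names and the flush interval are hypothetical. It
assumes a Linux system where Python exposes os.posix_fadvise, and it
syncs first because dirty pages cannot be discarded until they have
reached the disk.

```python
import os

FLUSH_EVERY = 16  # hypothetical: drop cached pages after every 16 chunks


def _drop_cached_pages(fd):
    """Flush dirty pages, then advise the kernel that they are not needed."""
    os.fsync(fd)  # dirty pages must be written before they can be dropped
    if hasattr(os, "posix_fadvise"):
        # offset 0, length 0 means "from the start to the end of the file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)


def write_without_caching(path, chunks, flush_every=FLUSH_EVERY):
    """Write an iterable of bytes objects to path; return bytes written."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    written = 0
    pending = 0
    try:
        for chunk in chunks:
            view = memoryview(chunk)
            while len(view) > 0:
                n = os.write(fd, view)  # may write fewer bytes than asked
                view = view[n:]
                written += n
            pending += 1
            if pending >= flush_every:
                _drop_cached_pages(fd)
                pending = 0
        _drop_cached_pages(fd)  # cover the final partial batch
    finally:
        os.close(fd)
    return written
```

Something like write_without_caching("/location/large", (b"\0" * (1 << 20)
for _ in range(10000))) would mimic the dd example from the quoted text
while keeping the page cache footprint small, at the cost of the extra
fsync calls.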

Sorry to piggy-back on this thread.

Could this be what I'm experiencing, as described in the following bugzilla report:
http://bugzilla.kernel.org/show_bug.cgi?id=7372

As I explained in the report, I have only seen this issue since 2.6.18.
So if your concern is related to mine, what could have changed between
2.6.17 and 2.6.18 that is related to this?

Unfortunately it is not possible for me to git bisect this issue, as my
problem appears on a live database server. One of the other reporters was
able to git-bisect it to a SATA+SCSI patch, but looking at it I can't really
see what's wrong. As soon as I am able to build a 2.6.18 without this patch,
we can verify whether we are both dealing with the same issue.

I'm ready to help any kernel developer who wants to look at my issue.
Just ask what debug material you need and I'll try to provide it.

Please CC: me on replies, as I'm not subscribed to the list.

Regards,
--
Brice Figureau <brice+lklm@xxxxxxxxxxxxxxxx>
