Re: Queue upcall locking (was: [dm-devel] [RFC][PATCH] fix dm_any_congested() to properly sync up with suspend code path)

From: Mikulas Patocka
Date: Tue Nov 11 2008 - 17:09:01 EST


On Tue, 11 Nov 2008, Christoph Hellwig wrote:

> On Mon, Nov 10, 2008 at 09:19:21AM -0500, Mikulas Patocka wrote:
> > For device mapper, congested_fn asks every device in the tree and ORs
> > their bits together, so if the user has 50 devices, it asks them all.
> >
> > For md-linear, md-raid0, md-raid1, md-raid10 and md-multipath it does the
> > same --- asking every device.
> >
> > If you have a better idea how to implement congested_fn, say it.
>
> Well, even asking 50 devices shouldn't take that long normally.

There are some atomic operations inside congested_fn.

I remember a database benchmark on a setup with 48 devices. Just dropping
one spinlock from the unplug function improved the overall result (the
whole run, not a microbenchmark) by 18%. The slowdown with unplug is
similar to the one with congested_fn: unplug walks all the physical
devices for a given logical device and unplugs them.

So simplifying congested_fn might help too. Do you have an idea how often
it is called? I thought about caching the return value for one jiffy.
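
To make the caching idea concrete, here is a rough userspace sketch, not
dm code. All names (fake_jiffies, fake_dev, fake_table, cached_congested)
are made up for illustration, and there is no locking, which a real
version would need. It walks all devices at most once per tick and answers
from the cached OR otherwise, so the answer can be stale by up to one tick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's jiffies counter. */
static unsigned long fake_jiffies;

/* Hypothetical per-device state: a bitmask of congestion bits. */
struct fake_dev {
	int congested_bits;
};

/* Hypothetical logical device built on top of several physical ones. */
struct fake_table {
	const struct fake_dev *devs;
	size_t ndevs;
	unsigned long cached_at;	/* tick when cached_bits was computed */
	int cached_bits;		/* OR of all devices' bits at that tick */
	bool cache_valid;
	unsigned long walks;		/* counts full walks, for illustration */
};

/* The slow path: ask every device and OR the results together. */
static int walk_all_devices(struct fake_table *t)
{
	int r = 0;
	size_t i;

	t->walks++;
	for (i = 0; i < t->ndevs; i++)
		r |= t->devs[i].congested_bits;
	return r;
}

/* Walk the devices only if the cache is from an earlier tick. */
static int cached_congested(struct fake_table *t, int bdi_bits)
{
	assert(t != NULL);

	if (!t->cache_valid || t->cached_at != fake_jiffies) {
		t->cached_bits = walk_all_devices(t);
		t->cached_at = fake_jiffies;
		t->cache_valid = true;
	}
	return t->cached_bits & bdi_bits;
}
```

The cache keeps the unmasked OR and applies bdi_bits only on return, so
callers asking for different bits within the same tick share one walk.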

> Do you take some sleeping lock that might block forever? In that case I
> would just return congested for that device as it would delay pdflush
> otherwise.

Sometimes we do, and we shouldn't, because congested_fn is called with a
spinlock held. That's what I'm working on now.

BTW, how "realtime" do you expect the Linux kernel to be? Is there some
effort to make all spinlocks be held only for a finite time? (so that
someone could one day calculate the hold times of all spinlocks, take the
maximum and give Linux a "realtime" label). If so, then the bdi_congested
call should be moved out of spinlocks. If there are already many cases
where a spinlock is taken and a list of indefinite length is walked
inside, it isn't worth optimizing for it.

Mikulas