Re: [RFC v1][PATCH]page_fault retry with NOPAGE_RETRY

From: Peter Zijlstra
Date: Thu Nov 27 2008 - 05:00:25 EST


On Thu, 2008-11-27 at 01:28 -0800, Mike Waychison wrote:

> Correct. I don't recall the numbers from the pathological cases we were
> seeing, but iirc, it was on the order of tens of seconds, likely
> exacerbated by slower than usual disks. I've been digging through my
> inbox to find numbers without much success -- we've been using a variant
> of this patch since 2.6.11.

> We generally try to avoid such things, but sometimes it a) can't be
> easily avoided (third party libraries for instance) and b) when it hits
> us, it affects the overall health of the machine/cluster (the monitoring
> daemons get blocked, which isn't very healthy).

If it's only monitoring, there might be another solution: keep the
required data in a separate (approximate) copy, so that you don't need
mmap_sem at all to show it.
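
Something along these lines, as a rough userspace sketch rather than
kernel code. Everything here is hypothetical and for illustration only:
struct fake_mm, its fields and the helper names are made up, and a
pthread mutex stands in for mmap_sem. The writer already holds the lock
for the real update and publishes an approximate copy; the monitoring
reader only ever loads that copy.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for an address space with a contended lock. */
struct fake_mm {
	pthread_mutex_t sem;            /* plays the role of mmap_sem */
	size_t total_vm;                /* authoritative value, protected by sem */
	atomic_size_t total_vm_approx;  /* approximate copy, readable without sem */
};

static void fake_mm_init(struct fake_mm *mm)
{
	pthread_mutex_init(&mm->sem, NULL);
	mm->total_vm = 0;
	atomic_init(&mm->total_vm_approx, 0);
}

/* Writer path: takes the lock anyway, so publishing the copy is cheap. */
static void fake_mm_grow(struct fake_mm *mm, size_t pages)
{
	pthread_mutex_lock(&mm->sem);
	mm->total_vm += pages;
	atomic_store_explicit(&mm->total_vm_approx, mm->total_vm,
			      memory_order_relaxed);
	pthread_mutex_unlock(&mm->sem);
}

/* Monitoring path: never touches sem, may see a slightly stale value. */
static size_t fake_mm_report(struct fake_mm *mm)
{
	return atomic_load_explicit(&mm->total_vm_approx,
				    memory_order_relaxed);
}
```

The reader can lag behind the authoritative value, which is the
"approximate" part, but it cannot get stuck behind whoever is holding
the lock.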

If your mmap_sem is so contended that your latencies are unacceptable,
adding more users to it, even statistics gathering, just isn't going to
cure the situation.

Furthermore, /proc code usually isn't written with performance in mind,
so it's usually simple and robust code. Adding it to a 'hot'-path like
you're doing doesn't seem advisable.

Also, releasing and re-acquiring mmap_sem can significantly add to the
cacheline bouncing that thing already has.
