Re: 2.4 fix for write throttling on x86 >1G

From: Marcelo Tosatti
Date: Fri Mar 11 2005 - 15:40:16 EST


Hi Andrea!

On Fri, Mar 11, 2005 at 07:10:35AM +0100, Andrea Arcangeli wrote:
> Hello Marcelo,
>
> I've got a fix for you on 2.4. I got reports of stalls with heavy writes
> on 2.4.

Out of curiosity, that was SuSE not mainline ?

> There was a mistake in nr_free_buffer_pages. That function is
> definitely meant _not_ to take highmem into account (dirty cache cannot
> spread over highmem in 2.4 [even when on top of fs]). For unknown
> reasons it was actually taking highmem into account. The code was
> obviously meant to not take inot account see the GFP_USER and zonelist,
> except it wasn't using the zonelist.

True, initialization of "zone" variable in nr_free_buffer_pages() is
un-nice.

> That is a severe problem because
> there will be no write throttling at all, and no bdflush wakeup either.
>
> This should fix it, though my compiler fails to compile 2.4, so it's not
> immediate to verify it. If any problem showup I'll post a followup.
>
> This is a noop for all systems <800M (1G shouldn't be noticeable
> either). This is why most people can't notice.

Do we really want to limit dirty cache to low mem on HIGHIO capable
machines? I'm afraid doing so might hurt performance on such systems.

I think it might be wise to have nr_free_buffer_pages() take highmem
into account if CONFIG_HIGHIO is set ?

> --- 2.4.23aa3/mm/page_alloc.c.~1~ 2004-07-04 02:09:42.000000000 +0200
> +++ 2.4.23aa3/mm/page_alloc.c 2005-03-11 07:00:23.000000000 +0100
> @@ -656,7 +656,7 @@ unsigned int nr_free_buffer_pages (void)
> class_idx = zone_idx(zone);
>
> sum += zone->nr_cache_pages;
> - for (zone = pgdat->node_zones; zone < pgdat->node_zones + MAX_NR_ZONES; zone++) {
> + for (; zone; zone = *zonep++) {
> int free = zone->free_pages - zone->watermarks[class_idx].high;
> if (free <= 0)
> continue;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/