Re: [RFC][DATA] re "ongoing vm suckage"

From: Linus Torvalds (torvalds@transmeta.com)
Date: Tue Aug 07 2001 - 17:03:40 EST


In article <Pine.LNX.4.33.0108071426380.30280-100000@touchme.toronto.redhat.com>,
Ben LaHaise <bcrl@redhat.com> wrote:
>
>Yes, but I'm using raid 0. The ratio of highmem to normal memory is
>~3.25:1, and it would seem that this is breaking write throttling somehow.

Ahh - I see it.

Check "nr_free_buffer_pages()" - and notice how the function is meant to
return the number of pages that can be used for buffers.

But the function doesn't take into account the limitation inherent in
GFP_NOFS buffer allocations, namely that they will never allocate a
high-mem buffer. So it just stupidly adds up the number of free pages,
coming to the conclusion that we have a _lot_ of memory that buffers
could use.

This obviously makes the whole balance_dirty() algorithm not work at
all.
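
To make the failure concrete, here is a rough userspace sketch, not
kernel code. It assumes the dirty-buffer limit is a fixed percentage of
whatever nr_free_buffer_pages() returns; dirty_limit(),
limit_reachable(), the 40% figure and the page counts below are all
made up for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper: the dirty limit as a percentage of the pages
 * the counting function claims are usable for buffers.
 */
static unsigned int dirty_limit(unsigned int counted_pages, unsigned int percent)
{
        assert(percent <= 100);
        return counted_pages * percent / 100;
}

/*
 * Hypothetical helper: dirty buffers can only occupy 'buffer_pages'
 * pages, so the limit can only ever trigger if it fits in there.
 */
static int limit_reachable(unsigned int counted_pages,
                           unsigned int buffer_pages,
                           unsigned int percent)
{
        return dirty_limit(counted_pages, percent) <= buffer_pages;
}
```

With 1000 normal pages and 3250 highmem pages (the 3.25:1 ratio quoted
above), counting everything gives a limit of 40% of 4250 = 1700 pages,
which is more than the 1000 pages buffers can actually live in, so the
throttle never kicks in. Counting only the 1000 usable pages gives a
limit of 400, which can be reached.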

This should be fairly easy to fix. Instead of counting all zones,
nr_free_buffer_pages() should count only the zones that are listed in
the GFP_NOFS zonelist. So instead of using

        unsigned int sum;

        sum = nr_free_pages();
        sum += nr_inactive_clean_pages();
        sum += nr_inactive_dirty_pages;

it should do something like this instead (but please hide the "zonelist"
lookup behind some nice macro, I almost lost my lunch when I wrote that
;)

        unsigned int sum = 0;
        /* Only the zones on the GFP_NOFS zonelist can hold buffers. */
        zonelist_t *zonelist = contig_page_data.node_zonelists+(GFP_NOFS & GFP_ZONEMASK);
        zone_t **zonep = zonelist->zones;

        for (;;) {
                /* Step to the next zone; the list is NULL-terminated. */
                zone_t *zone = *zonep++;
                if (!zone)
                        return sum;
                sum += zone->free_pages + zone->inactive_clean_pages + zone->inactive_dirty_pages;
        }

which is more accurate, and actually faster to boot (look at what
"nr_free_pages()" and friends do - they already walk all the zones).

I can't easily test this - mind giving it a whirl?
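
For anybody who wants to poke at the difference between the two
counting strategies outside the kernel, a minimal userspace model
follows. The struct zone and struct zonelist types here are simplified
stand-ins for zone_t and zonelist_t, and count_all_zones() and
count_zonelist() are hypothetical names, so treat it as a sketch of the
idea rather than the real thing.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ZONES 3

/* Simplified stand-in for the kernel's zone_t. */
struct zone {
        const char *name;
        unsigned int free_pages;
        unsigned int inactive_clean_pages;
        unsigned int inactive_dirty_pages;
};

/* Simplified stand-in for zonelist_t: a NULL-terminated zone list. */
struct zonelist {
        struct zone *zones[MAX_ZONES + 1];
};

static unsigned int zone_pages(const struct zone *zone)
{
        return zone->free_pages + zone->inactive_clean_pages +
               zone->inactive_dirty_pages;
}

/* Old behaviour: add up every zone, usable for buffers or not. */
static unsigned int count_all_zones(const struct zone *zones, size_t nr)
{
        unsigned int sum = 0;
        size_t i;

        for (i = 0; i < nr; i++)
                sum += zone_pages(&zones[i]);
        return sum;
}

/* Proposed behaviour: count only the zones on the given zonelist. */
static unsigned int count_zonelist(const struct zonelist *zonelist)
{
        unsigned int sum = 0;
        struct zone *const *zonep = zonelist->zones;
        const struct zone *zone;

        while ((zone = *zonep++) != NULL)
                sum += zone_pages(zone);
        return sum;
}
```

Build an array of DMA, Normal and HighMem zones, put only Normal and
DMA on the zonelist, and count_zonelist() leaves the HighMem pages out
of the total while count_all_zones() still includes them.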

                Linus