Re: [PATCH v6 0/3] mm: Free contiguous order-0 pages efficiently

From: Johannes Weiner

Date: Wed Apr 29 2026 - 06:40:12 EST

On Wed, Apr 01, 2026 at 11:16:18AM +0100, Muhammad Usama Anjum wrote:
> Hi All,
>
> A recent change to vmalloc caused some performance benchmark regressions (see
> [1]). I'm attempting to fix that (and at the same time significantly improve
> beyond the baseline) by freeing a contiguous set of order-0 pages as a batch.

I think we should revert the original patch.

The premise is that we can save some allocator calls by requesting
higher orders and splitting them up into singles. This is a frivolous
and short-sighted use of a very coveted and expensive resource.

The buddy allocator tries hard to retain contiguity *if it isn't
needed by the caller*. This patch actively works around that.

The cost of recreating those higher orders elsewhere is shouldered by
whoever actually needs the contiguity down the line. And that process
is orders of magnitude more expensive than what we save here:

We're saving cycles per page in the vmalloc path, and later spending
tens of thousands of cycles per page to recreate the contiguity:
scanning PFN ranges, taking folio locks, rmap walks, TLB flushes, page
copies.

That's a terrible trade-off.