Re: [PATCH v5 09/16] kexec: enable KHO support for memory preservation

From: Pratyush Yadav
Date: Thu Apr 03 2025 - 11:50:33 EST


Hi all,

The patch below implements the table-based memory preservation mechanism
I suggested. It is a replacement for this patch. Instead of using an
xarray of bitmaps and converting them into a linked list of bitmaps at
serialization time, it tracks preserved pages in a page-table-like
format that needs no extra work when serializing. This results in
noticeably better performance when preserving a large number of pages.
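
To illustrate the idea, here is a minimal two-level sketch. This is not
the actual patch code: the real implementation has more levels and
different names, and the sizing of the top-level array is omitted. Each
L1 table is a page-sized bitmap where bit N marks PFN (base + N) as
preserved, and the L2 level is an array of pointers to L1 tables,
indexed like a page table walk. Because the tables are themselves plain
pages of bits, serialization only needs to hand over the table pages
as-is.

#include <linux/gfp.h>
#include <linux/bitops.h>

#define KHO_BITS_PER_TABLE	(PAGE_SIZE * BITS_PER_BYTE)

/*
 * l2 is an array of pointers to L1 bitmap pages, one per
 * KHO_BITS_PER_TABLE-sized PFN range. Allocation is lazy: an L1
 * table only exists once a PFN in its range is preserved.
 */
static int kho_preserve_pfn(unsigned long **l2, unsigned long pfn)
{
	unsigned long l2_idx = pfn / KHO_BITS_PER_TABLE;
	unsigned long l1_bit = pfn % KHO_BITS_PER_TABLE;
	unsigned long *l1 = l2[l2_idx];

	if (!l1) {
		l1 = (unsigned long *)get_zeroed_page(GFP_KERNEL);
		if (!l1)
			return -ENOMEM;
		l2[l2_idx] = l1;
	}
	set_bit(l1_bit, l1);

	return 0;
}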

To compare performance, I allocated 48 GiB of memory and preserved it
using KHO. Below is the time taken to make the reservations, and then
serialize that to FDT.

Linked list: 577ms +- 0.7% (6 samples)
Table: 469ms +- 0.6% (6 samples)

From this, we can see that the table-based approach takes almost 19%
less time.

This test was done with only one thread, but since reservations can be
made in parallel, the gap would widen further -- especially since the
linked-list serialization cannot easily be parallelized.

In terms of memory usage, I could not collect reliable data, but I
don't think there should be a significant difference between the two
approaches, since the bitmaps have the same density and the only
difference is the extra metadata (chunks vs upper-level tables).

Memory usage for tables can be further optimized if needed by
collapsing full tables. That is, if all bits in an L1 table are set, we
can simply not allocate a page for it, and instead set a flag in the L2
descriptor (see the sketch below).
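
A hypothetical sketch of that collapse, using the same simplified
two-level layout as above. KHO_TABLE_FULL and its encoding in the
descriptor are made up for illustration: once an L1 bitmap fills up,
free its page and mark the slot in the L2 table instead. Any walker
then has to check the flag before dereferencing the descriptor.

#include <linux/bitmap.h>
#include <linux/gfp.h>

/* Made-up marker: low bit set means "whole range preserved". */
#define KHO_TABLE_FULL	((unsigned long *)1UL)

static void kho_maybe_collapse(unsigned long **l2, unsigned long l2_idx)
{
	unsigned long *l1 = l2[l2_idx];

	if (bitmap_full(l1, KHO_BITS_PER_TABLE)) {
		free_page((unsigned long)l1);
		l2[l2_idx] = KHO_TABLE_FULL;
	}
}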

The patch currently has a limitation in that it does not free empty
tables after an unpreserve operation. But Changyuan's patch doesn't do
this either, so it is at least no worse off.

In terms of code size, I believe both are roughly the same. This patch
is 609 lines compared to Changyuan's 522, and many of the extra lines
come from the longer comment.

While working on this patch, I realized that kho_mem_deserialize() is
currently _very_ slow. It takes over 2 seconds to make memblock
reservations for 48 GiB of order-0 pages. I suppose this can later be
optimized by teaching memblock_free_all() to skip preserved pages
instead of making memblock reservations (a rough sketch follows).
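
A sketch of what that might look like, purely as an assumption about
the direction: kho_pfn_is_preserved() is a hypothetical helper, and the
real hook point inside memblock_free_all() would look different. The
idea is that during the initial release of memory to the buddy
allocator, preserved PFNs are simply skipped, instead of each of them
being reserved up front.

#include <linux/mm.h>
#include <linux/types.h>

/* Hypothetical helper that consults the preservation tables. */
bool kho_pfn_is_preserved(unsigned long pfn);

static void free_pages_skip_preserved(unsigned long start_pfn,
				      unsigned long end_pfn)
{
	unsigned long pfn;

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		/* Preserved across kexec; must not enter the buddy. */
		if (kho_pfn_is_preserved(pfn))
			continue;
		__free_page(pfn_to_page(pfn));
	}
}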

Regards,
Pratyush Yadav

---- 8< ----