Re: [PATCH v4 1/3] mm, swap: speed up hibernation allocation and writeout

From: Andrew Morton

Date: Mon Feb 16 2026 - 16:42:38 EST


On Mon, 16 Feb 2026 22:58:02 +0800 Kairui Song via B4 Relay <devnull+kasong.tencent.com@xxxxxxxxxx> wrote:

> From: Kairui Song <kasong@xxxxxxxxxxx>
>
> Since commit 0ff67f990bd4 ("mm, swap: remove swap slot cache"),
> hibernation has been using the swap slot slow allocation path for
> simplicity, which, it turns out, can cause a regression on some
> devices because the allocator now rotates clusters too often, leading to
> slower allocation and a more random distribution of data.
>
> Fast allocation is not complex, so implement hibernation support as
> well.
>
> Test result with Samsung SSD 830 Series (SATA II, 3.0 Gbps) shows the
> performance is several times better [1]:
> 6.19: 324 seconds
> After this series: 35 seconds

Thanks.

I'll merge only [1/3] at this time, into mm-unstable (I'll
move it to mm-unstable after resyncing mm.git with upstream).

We don't want the other two patches present during testing of this
backportable fix because doing so partially invalidates that testing -
[2/3] and [3/3] might accidentally fix issues which [1/3] added. It happens,
occasionally.

> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1926,8 +1926,9 @@ void swap_put_entries_direct(swp_entry_t entry, int nr)
>  /* Allocate a slot for hibernation */
>  swp_entry_t swap_alloc_hibernation_slot(int type)
>  {
> -	struct swap_info_struct *si = swap_type_to_info(type);
> -	unsigned long offset;
> +	struct swap_info_struct *pcp_si, *si = swap_type_to_info(type);
> +	unsigned long pcp_offset, offset = SWAP_ENTRY_INVALID;
> +	struct swap_cluster_info *ci;
>  	swp_entry_t entry = {0};
>
>  	if (!si)
> @@ -1937,11 +1938,21 @@ swp_entry_t swap_alloc_hibernation_slot(int type)
>  	if (get_swap_device_info(si)) {
>  		if (si->flags & SWP_WRITEOK) {
>  			/*
> -			 * Grab the local lock to be compliant
> -			 * with swap table allocation.
> +			 * Try the local cluster first if it matches the device. If
> +			 * not, try grab a new cluster and override local cluster.
>  			 */

nanonit, worrying about 80-cols is rather old fashioned but there's no
reason to overflow 80 in a block comment!
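
For illustration only, one possible rewrap that stays within 80 columns,
assuming the comment sits three tabs deep and with the wording lightly
adjusted ("try to grab", "the local cluster"); this is a suggestion, not
text from the patch:

```c
			/*
			 * Try the local cluster first if it matches the
			 * device. If not, try to grab a new cluster and
			 * override the local cluster.
			 */
```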

>  			local_lock(&percpu_swap_cluster.lock);
> -			offset = cluster_alloc_swap_entry(si, NULL);
> +			pcp_si = this_cpu_read(percpu_swap_cluster.si[0]);
> +			pcp_offset = this_cpu_read(percpu_swap_cluster.offset[0]);
> +			if (pcp_si == si && pcp_offset) {
> +				ci = swap_cluster_lock(si, pcp_offset);
> +				if (cluster_is_usable(ci, 0))
> +					offset = alloc_swap_scan_cluster(si, ci, NULL, pcp_offset);
> +				else
> +					swap_cluster_unlock(ci);
> +			}
> +			if (!offset)
> +				offset = cluster_alloc_swap_entry(si, NULL);
>  			local_unlock(&percpu_swap_cluster.lock);
>  			if (offset)
>  				entry = swp_entry(si->type, offset);
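
For readers following along, the control flow of this hunk (try the cached
cluster hint first, fall back to the general allocator only when that fails)
can be modelled in userspace. This is a minimal sketch under stated
assumptions, not kernel code: every toy_* name and TOY_* constant is
hypothetical, there is no locking, and a "cluster" is simply a fixed run of
slots in an array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes for the toy model; not taken from the kernel. */
#define TOY_CLUSTER_SLOTS 4
#define TOY_NR_CLUSTERS   3
#define TOY_INVALID       0UL	/* offset 0 is never handed out */

struct toy_device {
	bool used[TOY_NR_CLUSTERS * TOY_CLUSTER_SLOTS];
	int slow_path_calls;	/* how often the fallback ran */
};

/* Stand-in for the per-CPU hint: which device, and where to continue. */
struct toy_hint {
	struct toy_device *dev;
	unsigned long offset;
};

/* Scan from 'start' to the end of the cluster that contains it. */
static unsigned long toy_scan_cluster(struct toy_device *dev,
				      struct toy_hint *hint,
				      unsigned long start)
{
	unsigned long end = (start / TOY_CLUSTER_SLOTS + 1) * TOY_CLUSTER_SLOTS;

	for (unsigned long off = start; off < end; off++) {
		if (off == TOY_INVALID || dev->used[off])
			continue;
		dev->used[off] = true;
		/* Remember where to continue, or drop the hint at cluster end. */
		hint->dev = dev;
		hint->offset = (off + 1 < end) ? off + 1 : TOY_INVALID;
		return off;
	}
	hint->offset = TOY_INVALID;
	return TOY_INVALID;
}

/* Fallback: walk every cluster from the start until a free slot is found. */
static unsigned long toy_slow_alloc(struct toy_device *dev,
				    struct toy_hint *hint)
{
	dev->slow_path_calls++;
	for (unsigned long c = 0; c < TOY_NR_CLUSTERS; c++) {
		unsigned long off =
			toy_scan_cluster(dev, hint, c * TOY_CLUSTER_SLOTS);

		if (off != TOY_INVALID)
			return off;
	}
	return TOY_INVALID;
}

/* Same shape as the hunk: hint first if it matches the device, else fallback. */
static unsigned long toy_alloc(struct toy_device *dev, struct toy_hint *hint)
{
	unsigned long off = TOY_INVALID;

	if (hint->dev == dev && hint->offset != TOY_INVALID)
		off = toy_scan_cluster(dev, hint, hint->offset);
	if (off == TOY_INVALID)
		off = toy_slow_alloc(dev, hint);
	return off;
}
```

In this model, consecutive allocations from the same device keep landing in
the same cluster and only reach the fallback when the hint is missing or the
cluster is exhausted, which is the behaviour the patch description credits
for the faster, less scattered writeout.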