Re: [PATCH v2 1/1] Documentation: update pagemap with shmem exceptions

From: Peter Xu
Date: Mon Sep 20 2021 - 15:09:44 EST


Hi, Tiberiu,

Thanks for the patch! Yes, it would still be nice to document this behavior.
Some trivial nitpicks below.

On Mon, Sep 20, 2021 at 04:49:31PM +0000, Tiberiu A Georgescu wrote:
> +In user space, whether the page is swapped or none can be deduced with the
> +lseek system call. For a single page, the algorithm is:
> +
> +0. If the pagemap entry of the page has bit 63 (page present) set, the page
> + is present.
> +1. Otherwise, get an fd to the file where the page is backed. For anonymous
> + shared pages, the file can be found in ``/proc/pid/map_files/``.
> +2. Call lseek with LSEEK_DATA flag and seek to the virtual address of the page

s/LSEEK_DATA/SEEK_DATA/

> + you wish to inspect. If it overshoots the PAGE_SIZE, the page is NONE.
> +3. Otherwise, the page is in swap.

It could also not be in swap, right?

Example 1: this process mmap()ed an existing shmem file with data filled in,
but has not accessed it yet. Then the page is in the page cache, not in swap,
but the pgtables will be empty.

Example 2: this process has mapped this shmem with a 2M THP, all data filled
in, then for some reason the THP splits. The pgtable can then also be none,
but lseek will succeed, I think.

So to further identify whether the page is in swap, we need one more step
with the mincore() system call, perhaps?
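
Something like the below is roughly what I have in mind. It is an untested
sketch, not a definitive implementation. The names classify_shmem_page(),
decide_state(), pagemap_present() and the enum are made up for illustration.
It assumes vaddr is a page-aligned address inside the caller's own shared
mapping, fd refers to the backing shmem file, and file_off is the file offset
that vaddr maps (lseek takes a file offset rather than a virtual address).

```c
#define _GNU_SOURCE		/* for SEEK_DATA and mincore() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type, for illustration only. */
enum page_state {
	PAGE_ERROR = -1,
	PAGE_PRESENT,		/* mapped in this process's page tables */
	PAGE_NONE,		/* hole in the backing file */
	PAGE_IN_PAGE_CACHE,	/* resident, but not mapped here */
	PAGE_MAYBE_SWAPPED,	/* has data, but is not resident */
};

/* Bit 63 of a pagemap entry is "page present". */
int pagemap_present(uint64_t entry)
{
	return (entry >> 63) & 1;
}

/* The decision logic on its own, separated from the system calls. */
enum page_state decide_state(int present, int has_data, int in_core)
{
	if (present)
		return PAGE_PRESENT;
	if (!has_data)
		return PAGE_NONE;
	return in_core ? PAGE_IN_PAGE_CACHE : PAGE_MAYBE_SWAPPED;
}

enum page_state classify_shmem_page(void *vaddr, int fd, off_t file_off)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char vec = 0;
	uint64_t entry;
	off_t data;
	ssize_t n;
	int pm;

	assert(page_size > 0 && (uintptr_t)vaddr % page_size == 0);

	/* Step 0: read the 64-bit pagemap entry for this virtual page. */
	pm = open("/proc/self/pagemap", O_RDONLY);
	if (pm < 0)
		return PAGE_ERROR;
	n = pread(pm, &entry, sizeof(entry),
		  (off_t)((uintptr_t)vaddr / page_size * sizeof(entry)));
	close(pm);
	if (n != (ssize_t)sizeof(entry))
		return PAGE_ERROR;
	if (pagemap_present(entry))
		return PAGE_PRESENT;

	/* Step 2: ENXIO means there is no data at or after file_off. */
	data = lseek(fd, file_off, SEEK_DATA);
	if (data < 0 && errno != ENXIO)
		return PAGE_ERROR;
	if (data < 0 || data >= file_off + page_size)
		return PAGE_NONE;

	/* Extra step: tell page cache apart from (probably) swap. */
	if (mincore(vaddr, page_size, &vec) != 0)
		return PAGE_ERROR;
	return decide_state(0, 1, vec & 1);
}
```

Even then, "not resident" only suggests swap. I believe mincore() may also
report pages as resident when the caller lacks permission on the file, so the
last result should be treated as a hint.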

--
Peter Xu