Re: [PATCH 2/4] mm, procfs: account for shmem swap in /proc/pid/smaps

From: Konstantin Khlebnikov
Date: Wed Mar 11 2015 - 11:03:28 EST

On 11.03.2015 15:30, Konstantin Khlebnikov wrote:
On Thu, Feb 26, 2015 at 4:51 PM, Vlastimil Babka <vbabka@xxxxxxx> wrote:
Currently, /proc/pid/smaps will always show "Swap: 0 kB" for shmem-backed
mappings, even if the mapped portion does contain pages that were swapped out.
This is because, unlike private anonymous mappings, shmem does not change the
pte to a swap entry when swapping the page out, but to pte_none. In the smaps
page walk, such a page thus looks like it was never faulted in.
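
To see what the field under discussion reports for a given process, the "Swap:" lines of smaps can be summed from userspace. The following is a minimal sketch, not part of either patch; the helper name sum_smaps_swap_kb() is hypothetical, and it assumes the usual "Swap: <n> kB" line layout of smaps.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Sum every "Swap:" field (in kB) found in smaps-formatted text.
 * Hypothetical helper for illustration; the caller owns the stream.
 */
long sum_smaps_swap_kb(FILE *f)
{
    char line[512];
    long total = 0;
    long kb;

    assert(f != NULL);

    while (fgets(line, sizeof(line), f)) {
        /*
         * The literal "Swap:" must match in full, so other fields whose
         * names merely begin with "Swap" (or with "S") are skipped.
         */
        if (sscanf(line, "Swap: %ld kB", &kb) == 1)
            total += kb;
    }
    return total;
}
```

Calling it on a stream opened with fopen("/proc/self/smaps", "r") gives the total for the current process. On a kernel without shmem swap accounting, shmem-backed mappings contribute 0 to that total regardless of how much of the object is on swap.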

Maybe just add the count of swap entries allocated by the mapped shmem object
to the swap usage of this vma? That isn't exactly correct for a partially
mapped shmem object, but that is something weird anyway.

Something like this (see the attached patch).

This patch changes smaps_pte_entry() to determine the swap status for such
pte_none entries for shmem mappings, similarly to how mincore_page() does it.
Swapped out pages are thus accounted for.

The accounting is arguably still not as precise as for private anonymous
mappings, since now we will also count pages that the process in question never
accessed, but that another process populated and then let become swapped out.
I believe it is still less confusing and subtle than not showing any swap usage
by shmem mappings at all. Also, swapped-out pages only become a performance
issue for future accesses, and we cannot predict those for either kind of
mapping.
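
The difference between looking up each page of the mapped range and counting the object's swap entries wholesale can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the enum, the array standing in for the shmem object, and both function names are hypothetical and do not correspond to kernel data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-page states of a shmem object (userspace model only). */
enum page_state { PAGE_HOLE, PAGE_RESIDENT, PAGE_SWAPPED };

/*
 * Per-page lookup: count swapped pages only inside the mapped window
 * [pgoff, pgoff + npages) of the object.
 */
size_t swapped_in_window(const enum page_state *obj, size_t obj_pages,
                         size_t pgoff, size_t npages)
{
    size_t i, count = 0;

    assert(pgoff <= obj_pages && npages <= obj_pages - pgoff);
    for (i = pgoff; i < pgoff + npages; i++)
        if (obj[i] == PAGE_SWAPPED)
            count++;
    return count;
}

/* Whole-object counter: every swapped page counts, mapped or not. */
size_t swapped_in_object(const enum page_state *obj, size_t obj_pages)
{
    return swapped_in_window(obj, obj_pages, 0, obj_pages);
}
```

For a mapping that covers the whole object the two results agree. For a partial mapping, the whole-object count also includes swapped pages outside the window, which is the imprecision noted in the reply.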

Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
Documentation/filesystems/proc.txt | 3 ++-
fs/proc/task_mmu.c | 20 ++++++++++++++++++++
2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index d4f56ec..8b30543 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -437,7 +437,8 @@ indicates the amount of memory currently marked as referenced or accessed.
a mapping associated with a file may contain anonymous pages: when MAP_PRIVATE
and a page is modified, the file page is replaced by a private anonymous copy.
"Swap" shows how much would-be-anonymous memory is also used, but out on
-swap.
+swap. For shmem mappings, "Swap" shows how much of the mapped portion of the
+underlying shmem object is on swap.

"VmFlags" field deserves a separate description. This member represents the kernel
flags associated with the particular virtual memory area in two letter encoded
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 956b75d..0410309 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -13,6 +13,7 @@
#include <linux/swap.h>
#include <linux/swapops.h>
#include <linux/mmu_notifier.h>
+#include <linux/shmem_fs.h>

#include <asm/elf.h>
#include <asm/uaccess.h>
@@ -496,6 +497,25 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
mss->swap += PAGE_SIZE;
else if (is_migration_entry(swpent))
page = migration_entry_to_page(swpent);
+ } else if (pte_none(*pte) && vma->vm_file) {
+ struct address_space *mapping =
+ file_inode(vma->vm_file)->i_mapping;
+ pgoff_t pgoff = linear_page_index(vma, addr);
+
+ /*
+ * shmem does not use swap pte's so we have to consult
+ * the radix tree to account for swap
+ */
+ if (shmem_mapping(mapping)) {
+ page = find_get_entry(mapping, pgoff);
+ if (page) {
+ if (radix_tree_exceptional_entry(page))
+ mss->swap += PAGE_SIZE;
+ else
+ page_cache_release(page);
+ }
+ page = NULL;
+ }
}

if (!page)


shmem: show swap usage in smaps

From: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>

Signed-off-by: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>
fs/proc/task_mmu.c | 3 +++
include/linux/mm.h | 2 ++
mm/shmem.c | 8 ++++++++
3 files changed, 13 insertions(+)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 956b75d61809..09a94cec159e 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -624,6 +624,9 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
/* mmap_sem is held in m_start */
walk_page_vma(vma, &smaps_walk);

+ if (vma->vm_ops && vma->vm_ops->get_swap_usage)
+ mss.swap += vma->vm_ops->get_swap_usage(vma) << PAGE_SHIFT;
show_map_vma(m, vma, is_pid);

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6571dd78e984..477a46987859 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -292,6 +292,8 @@ struct vm_operations_struct {
struct page *(*find_special_page)(struct vm_area_struct *vma,
unsigned long addr);
+ unsigned long (*get_swap_usage)(struct vm_area_struct *vma);

struct mmu_gather;
diff --git a/mm/shmem.c b/mm/shmem.c
index cf2d0ca010bc..492f78f51fc2 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1363,6 +1363,13 @@ static struct mempolicy *shmem_get_policy(struct vm_area_struct *vma,

+static unsigned long shmem_get_swap_usage(struct vm_area_struct *vma)
+{
+ struct inode *inode = file_inode(vma->vm_file);
+
+ return SHMEM_I(inode)->swapped;
+}
+
int shmem_lock(struct file *file, int lock, struct user_struct *user)
{
struct inode *inode = file_inode(file);
@@ -3198,6 +3205,7 @@ static const struct vm_operations_struct shmem_vm_ops = {
.set_policy = shmem_set_policy,
.get_policy = shmem_get_policy,
+ .get_swap_usage = shmem_get_swap_usage,

static struct dentry *shmem_mount(struct file_system_type *fs_type,
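
The optional-callback pattern this patch relies on (call the hook only if the ops table and the hook are both present, then convert pages to bytes) can be modelled in userspace as follows. This is a rough sketch, not kernel code: every name prefixed with model_ is hypothetical, add_swap_bytes() is an illustrative helper, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

struct model_vma;

/* Hypothetical stand-in for an ops table with one optional hook. */
struct model_vm_ops {
    unsigned long (*get_swap_usage)(const struct model_vma *vma);
};

struct model_vma {
    const struct model_vm_ops *vm_ops; /* may be NULL */
    unsigned long swapped_pages;       /* stand-in for the per-inode counter */
};

/* Hook implementation: report the object's swapped page count. */
static unsigned long model_shmem_swap_usage(const struct model_vma *vma)
{
    return vma->swapped_pages;
}

static const struct model_vm_ops model_shmem_ops = {
    .get_swap_usage = model_shmem_swap_usage,
};

/* Add the hook's page count, converted to bytes, to a running total. */
unsigned long add_swap_bytes(const struct model_vma *vma,
                             unsigned long swap_bytes)
{
    assert(vma != NULL);
    if (vma->vm_ops && vma->vm_ops->get_swap_usage)
        swap_bytes += vma->vm_ops->get_swap_usage(vma) << MODEL_PAGE_SHIFT;
    return swap_bytes;
}
```

A mapping type that does not provide the hook leaves the total unchanged, so only shmem-backed mappings pick up the extra accounting.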