Re: [PATCH] mm: dump_page: add debugfs file for dumping page state by pfn

From: Konstantin Khlebnikov
Date: Mon May 25 2020 - 12:05:15 EST


On 25/05/2020 19.03, Konstantin Khlebnikov wrote:

On 25/05/2020 18.33, Matthew Wilcox wrote:
On Mon, May 25, 2020 at 05:19:11PM +0300, Konstantin Khlebnikov wrote:
The 'page-types' tool can list pages mapped by a process or file cache pages,
but it shows only the limited amount of state exported via procfs.

Let's employ the existing helper dump_page() to reach the remaining information:
writing a pfn into /sys/kernel/debug/dump_page dumps the page state into the kernel log.

# echo 0x37c43c > /sys/kernel/debug/dump_page
# dmesg | tail -6
 page:ffffcb0b0df10f00 refcount:1 mapcount:0 mapping:000000007755d3d9 index:0x30
 0xffffffffae4239e0 name:""
 flags: 0x200000000020014(uptodate|lru|mappedtodisk)
 raw: 0200000000020014 ffffcb0b187fd288 ffffcb0b189e6248 ffff9528a04afe10
 raw: 0000000000000030 0000000000000000 00000001ffffffff 0000000000000000
 page dumped because: debugfs request

This makes me deeply uncomfortable. We're using %px, and %lx
(for the 'raw' lines) so we actually get to see kernel addresses.
We've rationalised this in the past as being acceptable because you're
already in an "assert triggered" kind of situation. Now you're adding
a way for any process with CAP_SYS_ADMIN to get kernel addresses dumped
into the syslog.

I think we need a different function for this, or we need to re-audit
dump_page() for exposing kernel pointers, and not expose the raw data
in struct page.

It would be better to add a switch for disabling the paranoia when bad things
are happening. I.e. keep everything safe by default (or whatever the
sysctl/config sets) and flip the switch when needed.

Also, I'm OK with sealing this interface if the kernel is in a mode of serious paranoia.
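
To make the "safe by default, flip the switch when needed" idea concrete, here is
a rough userspace model in C. It is only a sketch and not part of the patch: the
names dump_raw_pointers and format_field() are hypothetical, and the "(hidden)"
placeholder merely stands in for whatever the kernel would print instead of a raw
value (for instance a hashed %p rather than %px).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical switch: false hides addresses, which is the safe default. */
static bool dump_raw_pointers = false;

/*
 * Format one pointer-sized field of a page dump into buf.
 * Userspace model only: with the switch off the value is replaced by a
 * placeholder, with the switch on it is printed as 16 raw hex digits.
 * Returns the number of characters snprintf() wanted to write.
 */
static int format_field(char *buf, size_t len, uint64_t value)
{
    assert(buf != NULL && len > 0);

    if (dump_raw_pointers)
        return snprintf(buf, len, "%016llx", (unsigned long long)value);

    return snprintf(buf, len, "(hidden)");
}
```

With the switch left at its default, a field such as ffffcb0b187fd288 from the
'raw' lines would never reach the log. Setting the switch, which in a real
implementation might be a sysctl or config option as mentioned above, restores
the full output for debugging.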