[PATCH v3] proc: use untagged_addr() for pagemap_read addresses

From: Miles Chen
Date: Thu Dec 03 2020 - 21:44:43 EST


When we try to visit the pagemap of a tagged userspace pointer, we find
that the start_vaddr is not correct because of the tag.
To fix it, we should untag the userspace pointers in pagemap_read().

I tested with 5.10-rc4 and the issue remains.

Explanation from Catalin in [1]:

:Arguably, that's a user-space bug since tagged file offsets were never
:supported. In this case it's not even a tag at bit 56 as per the arm64
:tagged address ABI but rather down to bit 47. You could say that the
:problem is caused by the C library (malloc()) or whoever created the
:tagged vaddr and passed it to this function. It's not a kernel
:regression as we've never supported it.
:
:Now, pagemap is a special case where the offset is usually not generated
:as a classic file offset but rather derived by shifting a user virtual
:address. I guess we can make a concession for pagemap (only) and allow
:such offset with the tag at bit (56 - PAGE_SHIFT + 3).

My test code is based on [2]:

A userspace pointer which has been tagged by 0xb4: 0xb400007662f541c8

=== userspace program ===

uint64 OsLayer::VirtualToPhysical(void *vaddr) {
uint64 frame, paddr, pfnmask, pagemask;
int pagesize = sysconf(_SC_PAGESIZE);
off64_t off = ((uintptr_t)vaddr) / pagesize * 8; // off = 0xb400007662f541c8 / pagesize * 8 = 0x5a00003b317aa0
int fd = open(kPagemapPath, O_RDONLY);
...

if (lseek64(fd, off, SEEK_SET) != off || read(fd, &frame, 8) != 8) {
int err = errno;
string errtxt = ErrorString(err);
if (fd >= 0)
close(fd);
return 0;
}
...
}

=== kernel fs/proc/task_mmu.c ===

static ssize_t pagemap_read(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
...
src = *ppos;
svpfn = src / PM_ENTRY_BYTES; // svpfn == 0xb400007662f54
start_vaddr = svpfn << PAGE_SHIFT; // start_vaddr == 0xb400007662f54000
end_vaddr = mm->task_size;

/* watch out for wraparound */
// svpfn == 0xb400007662f54
// (mm->task_size >> PAGE) == 0x8000000
if (svpfn > mm->task_size >> PAGE_SHIFT) // the condition is true because of the tag 0xb4
start_vaddr = end_vaddr;

ret = 0;
while (count && (start_vaddr < end_vaddr)) { // we cannot visit correct entry because start_vaddr is set to end_vaddr
int len;
unsigned long end;
...
}
...
}

[1] https://lore.kernel.org/patchwork/patch/1343258/
[2] https://github.com/stressapptest/stressapptest/blob/master/src/os.cc#L158

Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Alexey Dobriyan <adobriyan@xxxxxxxxx>
Cc: Andrey Konovalov <andreyknvl@xxxxxxxxxx>
Cc: Alexander Potapenko <glider@xxxxxxxxxx>
Cc: Vincenzo Frascino <vincenzo.frascino@xxxxxxx>
Cc: Andrey Ryabinin <aryabinin@xxxxxxxxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
Cc: Marco Elver <elver@xxxxxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Cc: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
Cc: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx # v5.4-
Signed-off-by: Miles Chen <miles.chen@xxxxxxxxxxxx>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@xxxxxxx>
Reviewed-by: Catalin Marinas <catalin.marinas@xxxxxxx>

---

Change since v1:

1. Follow Eirc's and Catalin's suggestion to avoid overflow
2. Cc to stable v5.4-
3. add explaination from Catalin to the commit message

Change since v2:
1. replace less-than with less-than or equal
2. Fix bad spelling in commit message
3. Fix will's email address
---
fs/proc/task_mmu.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 217aa2705d5d..ee5a235b3056 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1599,11 +1599,15 @@ static ssize_t pagemap_read(struct file *file, char __user *buf,

src = *ppos;
svpfn = src / PM_ENTRY_BYTES;
- start_vaddr = svpfn << PAGE_SHIFT;
end_vaddr = mm->task_size;

/* watch out for wraparound */
- if (svpfn > mm->task_size >> PAGE_SHIFT)
+ start_vaddr = end_vaddr;
+ if (svpfn <= (ULONG_MAX >> PAGE_SHIFT))
+ start_vaddr = untagged_addr(svpfn << PAGE_SHIFT);
+
+ /* Ensure the address is inside the task */
+ if (start_vaddr > mm->task_size)
start_vaddr = end_vaddr;

/*
--
2.18.0