Re: [PATCH v6 2/3] fs/proc/task_mmu: Implement IOCTL to get and/or the clear info about PTEs

From: Muhammad Usama Anjum
Date: Tue Dec 13 2022 - 08:04:27 EST


On 12/13/22 1:42 AM, Cyrill Gorcunov wrote:
> On Wed, Nov 09, 2022 at 03:23:02PM +0500, Muhammad Usama Anjum wrote:
> ...
>> +
>> +static long do_pagemap_sd_cmd(struct mm_struct *mm, struct pagemap_scan_arg *arg)
>> +{
>> + struct mmu_notifier_range range;
>> + unsigned long __user start, end;
>> + struct pagemap_scan_private p;
>> + int ret;
>> +
>> + start = (unsigned long)untagged_addr(arg->start);
>> + if ((!IS_ALIGNED(start, PAGE_SIZE)) || (!access_ok((void __user *)start, arg->len)))
>> + return -EINVAL;
>> +
>> + if (IS_GET_OP(arg) &&
>> + ((arg->vec_len == 0) || (!access_ok((struct page_region *)arg->vec, arg->vec_len))))
>> + return -ENOMEM;
>> +
>> + if (IS_SD_OP(arg) && ((arg->required_mask & PAGEMAP_NONSD_OP_MASK) ||
>> + (arg->anyof_mask & PAGEMAP_NONSD_OP_MASK)))
>> + return -EINVAL;
>> +
>> + end = start + arg->len;
>> + p.max_pages = arg->max_pages;
>> + p.found_pages = 0;
>> + p.flags = arg->flags;
>> + p.required_mask = arg->required_mask;
>> + p.anyof_mask = arg->anyof_mask;
>> + p.excluded_mask = arg->excluded_mask;
>> + p.return_mask = arg->return_mask;
>> + p.vec_index = 0;
>> + p.vec_len = arg->vec_len;
>> +
>> + if (IS_GET_OP(arg)) {
>> + p.vec = vzalloc(arg->vec_len * sizeof(struct page_region));
>> + if (!p.vec)
>> + return -ENOMEM;
>> + } else {
>> + p.vec = NULL;
>> + }
>
> Hi Muhammad! I'm really sorry for diving in such late (unfortunatelly too busy to
> step in yet). Anyway, while in general such interface looks reasonable here are
> few moments which really bothers me: as far as I undertstand you don't need
> vzalloc here, plain vmalloc should works as well since you copy only filled
> results back to userspace. Thank you for reviewing. Correct, I'll update to use vmalloc.

> Next -- there is no restriction on vec_len parameter,
> is not here a door for DoS from userspace? Say I could start a number of ioctl
> on same pagemap and try to allocate very big amount of vec_len in summay causing
> big pressure on kernel's memory. Or I miss something obvious here?
Yes, there is a chance that a large chunk of kernel memory can get
allocated here as vec_len can be very large. We need to think of limiting
this buffer in the current implementation. Any reasonable limit should
work. I'm not sure what would be the reasonable limit. Maybe couple of
hundred MBs? I'll think about it. Or I should update the implementation
such that less amount of intermediate buffer can be used like mincore does.
But this can complicate the implementation further as we are already using
page ranges instead of keeping just the flags. I'll see what can be done.

--
BR,
Muhammad Usama Anjum