Re: [PACTH v1] mm, proc: Implement /proc/<pid>/totmaps

From: Jann Horn
Date: Tue Aug 09 2016 - 15:24:24 EST


On Tue, Aug 09, 2016 at 12:05:43PM -0400, robert.foss@xxxxxxxxxxxxx wrote:
> From: Sonny Rao <sonnyrao@xxxxxxxxxxxx>
>
> This is based on earlier work by Thiago Goncales. It implements a new
> per process proc file which summarizes the contents of the smaps file
> but doesn't display any addresses. It gives more detailed information
> than statm like the PSS (proprotional set size). It differs from the
> original implementation in that it doesn't use the full blown set of
> seq operations, uses a different termination condition, and doesn't
> displayed "Locked" as that was broken on the original implemenation.
>
> This new proc file provides information faster than parsing the potentially
> huge smaps file.
>
> Signed-off-by: Sonny Rao <sonnyrao@xxxxxxxxxxxx>
>
> Tested-by: Robert Foss <robert.foss@xxxxxxxxxxxxx>
> Signed-off-by: Robert Foss <robert.foss@xxxxxxxxxxxxx>


> +static int totmaps_proc_show(struct seq_file *m, void *data)
> +{
> + struct proc_maps_private *priv = m->private;
> + struct mm_struct *mm;
> + struct vm_area_struct *vma;
> + struct mem_size_stats *mss_sum = priv->mss;
> +
> + /* reference to priv->task already taken */
> + /* but need to get the mm here because */
> + /* task could be in the process of exiting */

Can you please elaborate on this? My understanding here is that you
intend for the caller to be able to repeatedly read the same totmaps
file with pread() and still see updated information after the target
process has called execve() and be able to detect process death
(instead of simply seeing stale values). Is that accurate?

I would prefer it if you could grab a reference to the mm_struct
directly at open time.


> + mm = get_task_mm(priv->task);
> + if (!mm || IS_ERR(mm))
> + return -EINVAL;

get_task_mm() doesn't return error codes, and all other callers just
check whether the return value is NULL.


> + down_read(&mm->mmap_sem);
> + hold_task_mempolicy(priv);
> +
> + for (vma = mm->mmap; vma != priv->tail_vma; vma = vma->vm_next) {
> + struct mem_size_stats mss;
> + struct mm_walk smaps_walk = {
> + .pmd_entry = smaps_pte_range,
> + .mm = vma->vm_mm,
> + .private = &mss,
> + };
> +
> + if (vma->vm_mm && !is_vm_hugetlb_page(vma)) {
> + memset(&mss, 0, sizeof(mss));
> + walk_page_vma(vma, &smaps_walk);
> + add_smaps_sum(&mss, mss_sum);
> + }
> + }

Errrr... what? You accumulate values from mem_size_stats items into a
struct mss_sum that is associated with the struct file? So when you
read the file the second time, you get the old values plus the new ones?
And when you read the file in parallel, you get inconsistent values?

For most files in procfs, the behavior is that you can just call
pread(fd, buf, sizeof(buf), 0) on the same fd again and again, giving
you the current values every time, without mutating state. I strongly
recommend that you get rid of priv->mss and just accumulate the state
in a local variable (maybe one on the stack).


> @@ -836,6 +911,50 @@ static int tid_smaps_open(struct inode *inode, struct file *file)
> return do_maps_open(inode, file, &proc_tid_smaps_op);
> }
>
> +static int totmaps_open(struct inode *inode, struct file *file)
> +{
> + struct proc_maps_private *priv;
> + int ret = -ENOMEM;
> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> + if (priv) {
> + priv->mss = kzalloc(sizeof(*priv->mss), GFP_KERNEL);
> + if (!priv->mss)
> + return -ENOMEM;

Memory leak: If the first allocation works and the second one doesn't, this
doesn't free the first allocation.

Please change this to use the typical goto pattern for error handling.

> +
> + /* we need to grab references to the task_struct */
> + /* at open time, because there's a potential information */
> + /* leak where the totmaps file is opened and held open */
> + /* while the underlying pid to task mapping changes */
> + /* underneath it */

Nit: That's not how comments are done in the kernel. Maybe change this to
a normal block comment instead of one block comment per line?

> + priv->task = get_pid_task(proc_pid(inode), PIDTYPE_PID);

`get_pid_task(proc_pid(inode), PIDTYPE_PID)` is exactly the definition
of get_proc_task(inode), maybe use that instead?

> + if (!priv->task) {
> + kfree(priv->mss);
> + kfree(priv);
> + return -ESRCH;
> + }
> +
> + ret = single_open(file, totmaps_proc_show, priv);
> + if (ret) {
> + put_task_struct(priv->task);
> + kfree(priv->mss);
> + kfree(priv);
> + }
> + }
> + return ret;
> +}

Please change this method to use the typical goto pattern for error
handling. IMO repeating the undo steps in all error cases makes
mistakes (like the one above) more likely and increases the amount
of redundant code.

Also: The smaps file is only accessible to callers with
PTRACE_MODE_READ privileges on the target task. Your thing doesn't
do any access checks, neither in the open handler nor in the read
handler. Can you give an analysis of why it's okay to expose this
data? As far as I can tell, without spending a lot of time thinking
about it, this kind of data looks like it might potentially be
useful for side-channel information leaks or so.

Attachment: signature.asc
Description: Digital signature