Re: [PACTH v1] mm, proc: Implement /proc/<pid>/totmaps

From: Sonny Rao
Date: Tue Aug 09 2016 - 12:57:07 EST


On Tue, Aug 9, 2016 at 9:29 AM, Mateusz Guzik <mguzik@xxxxxxxxxx> wrote:
> On Tue, Aug 09, 2016 at 12:05:43PM -0400, robert.foss@xxxxxxxxxxxxx wrote:
>> From: Sonny Rao <sonnyrao@xxxxxxxxxxxx>
>>
>> This is based on earlier work by Thiago Goncales. It implements a new
>> per process proc file which summarizes the contents of the smaps file
>> but doesn't display any addresses. It gives more detailed information
>> than statm like the PSS (proprotional set size). It differs from the
>> original implementation in that it doesn't use the full blown set of
>> seq operations, uses a different termination condition, and doesn't
>> displayed "Locked" as that was broken on the original implemenation.
>>
>> This new proc file provides information faster than parsing the potentially
>> huge smaps file.
>
> I have no idea about usefulness of this.

I can comment about this. The use case is to speed up monitoring of
memory consumption in environments where RSS isn't precise.

For example Chrome tends to many processes which have hundreds of VMAs
with a substantial amount of shared memory, and the error of using
RSS rather than PSS tends to be very large when looking at overall
memory consumption. PSS isn't kept as a single number that's exported
like RSS, so to calculate PSS means having to parse a very large smaps
file.

This process is slow and has to be repeated for many processes, and we
found that the just act of doing the parsing was taking up a
significant amount of CPU time, so this patch is an attempt to make
that process cheaper.

>
> The patch is definitely buggy with respect to how it implements actual
> access to mm.
>
>> +static int totmaps_proc_show(struct seq_file *m, void *data)
>> +{
>> + struct proc_maps_private *priv = m->private;
>> + struct mm_struct *mm;
>> + struct vm_area_struct *vma;
>> + struct mem_size_stats *mss_sum = priv->mss;
>> +
>> + /* reference to priv->task already taken */
>> + /* but need to get the mm here because */
>> + /* task could be in the process of exiting */
>> + mm = get_task_mm(priv->task);
>> + if (!mm || IS_ERR(mm))
>> + return -EINVAL;
>> +
>
> That's not how it's done in smaps.
>
>> +static int totmaps_open(struct inode *inode, struct file *file)
>> +{
>> + struct proc_maps_private *priv;
>> + int ret = -ENOMEM;
>> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
>> + if (priv) {
>> + priv->mss = kzalloc(sizeof(*priv->mss), GFP_KERNEL);
>> + if (!priv->mss)
>> + return -ENOMEM;
>
> Cases below explicitly kfree(priv). I can't remember whether the close
> routine gets called if this one fails. Either way, something is wrong
> here.
>
>> +
>> + /* we need to grab references to the task_struct */
>> + /* at open time, because there's a potential information */
>> + /* leak where the totmaps file is opened and held open */
>> + /* while the underlying pid to task mapping changes */
>> + /* underneath it */
>> + priv->task = get_pid_task(proc_pid(inode), PIDTYPE_PID);
>
> This performs no permission checks that I would see. If you take a look
> at smaps you will see the user ends up in proc_maps_open which performs
> proc_mem_open(inode, PTRACE_MODE_READ) and gets a mm from there.
>
>
>> + if (!priv->task) {
>> + kfree(priv->mss);
>> + kfree(priv);
>> + return -ESRCH;
>> + }
>> +
>> + ret = single_open(file, totmaps_proc_show, priv);
>> + if (ret) {
>> + put_task_struct(priv->task);
>> + kfree(priv->mss);
>> + kfree(priv);
>> + }
>> + }
>> + return ret;
>> +}
>> +
>
> --
> Mateusz Guzik