Re: [PATCH v2 2/2] pidmap(2)
From: Alexey Dobriyan
Date: Tue Sep 26 2017 - 14:46:56 EST
On Sun, Sep 24, 2017 at 02:27:00PM -0700, Andy Lutomirski wrote:
> On Sun, Sep 24, 2017 at 1:08 PM, Alexey Dobriyan <adobriyan@xxxxxxxxx> wrote:
> > From: Tatsiana Brouka <Tatsiana_Brouka@xxxxxxxx>
> >
> > Implement a system call for bulk retrieval of pids in binary form.
> >
> > Using /proc is slower than necessary: 3 syscalls + another 3 for each thread +
> > converting with atoi() + instantiating dentries and inodes.
> >
> > /proc may not be mounted, especially in containers. A natural extension of
> > the hidepid=2 efforts is to not mount /proc at all.
> >
> > It could be used by programs like ps, top or CRIU. Speed increase will
> > become more drastic once combined with bulk retrieval of process statistics.
> >
> > Benchmark:
> >
> > N=1<<16 times
> > ~130 processes (~250 task_structs) on a regular desktop system
> > opendir + readdir + closedir /proc + the same for every /proc/$PID/task
> > (roughly what htop(1) does) vs pidmap
> >
> > /proc 16.80 ± 0.73%
> > pidmap 0.06 ± 0.31%
> >
> > PIDMAP_* flags are modelled after /proc/task_diag patchset.
> >
> >
> > PIDMAP(2) Linux Programmer's Manual PIDMAP(2)
> >
> > NAME
> > pidmap - get allocated PIDs
> >
> > SYNOPSIS
> > long pidmap(pid_t pid, int *pids, unsigned int count, unsigned int start, int flags);
>
> I think we will seriously regret a syscall that does this. Djalal is
> working on fixing the turd that is hidepid, and this syscall is
> basically incompatible with ever fixing hidepids. I think that, to
> make it less regrettable, it needs to take an fd to a proc mount as a
> parameter. This makes me wonder why it's a syscall at all -- why not
> just create a new file like /proc/pids?
See reply to fdmap(2).
pidmap(2) is indeed a more complex case, exactly because of
pid/tgid/tid/everything else + pid namespaces + ->hide_pid.
However, the problem remains: how to query the task tree without all the bullshit.
C/R people succumbed to /proc/*/children; it was a mistake IMO.
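
For reference, the /proc walk that the quoted benchmark compares against can be sketched roughly as below. This is only an illustration of the opendir + readdir + closedir pattern on /proc and on every /proc/$PID/task, not the actual benchmark code; the helper names (is_pid_name, walk_numeric, count_threads, count_tasks_via_proc) are made up for the example.

```c
/* Sketch: enumerate PIDs and TIDs by walking /proc.
 * All function names are illustrative, not from the patch. */
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

/* Return 1 if name is a non-empty string of decimal digits, else 0. */
static int is_pid_name(const char *name)
{
	if (*name == '\0')
		return 0;
	for (; *name; name++)
		if (!isdigit((unsigned char)*name))
			return 0;
	return 1;
}

/* Count the all-digit entries of a directory, calling cb (if non-NULL)
 * for each one. Returns -1 if the directory cannot be opened. */
static long walk_numeric(const char *path, void (*cb)(int id, void *arg),
			 void *arg)
{
	DIR *d = opendir(path);
	struct dirent *de;
	long n = 0;

	if (!d)
		return -1;	/* /proc not mounted, or the task is gone */
	while ((de = readdir(d)) != NULL) {
		if (!is_pid_name(de->d_name))
			continue;
		if (cb)
			cb(atoi(de->d_name), arg);	/* text -> binary */
		n++;
	}
	closedir(d);
	return n;
}

/* Callback: add the number of threads of one process to *arg. */
static void count_threads(int pid, void *arg)
{
	char path[64];
	long n;

	snprintf(path, sizeof(path), "/proc/%d/task", pid);
	n = walk_numeric(path, NULL, NULL);
	if (n > 0)	/* the process may have exited in between */
		*(long *)arg += n;
}

/* Returns the number of processes seen (or -1) and stores the total
 * thread count in *ntasks. */
static long count_tasks_via_proc(long *ntasks)
{
	*ntasks = 0;
	return walk_numeric("/proc", count_threads, ntasks);
}
```

Calling count_tasks_via_proc() once corresponds to one iteration of the /proc side of the benchmark: every directory costs an open, at least one getdents and a close, plus an atoi() per entry, and the result is racy with respect to tasks exiting mid-walk.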