Re: PSI poll() support for unprivileged users

From: Suren Baghdasaryan
Date: Fri Apr 24 2020 - 18:47:09 EST


On Fri, Apr 24, 2020 at 12:43 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Fri, Apr 24, 2020 at 8:39 AM Chris Down <chris@xxxxxxxxxxxxxx> wrote:
> >
> > Hi Suren,
>
> Hi Chris,
>
> >
> > I noticed that one restriction of the PSI poll() interface is that currently
> > only root can set up new triggers. Talking to Johannes, it seems the reason for
> > this was that you end up with a realtime kernel thread for every cgroup where a
> > trigger is set, and this could be used by unprivileged users to sap resources.
> >
>
> This reasoning is correct and IIRC the enforcement of this is just the
> way /proc/pressure files are created:
>
> proc_create("pressure/io", 0, NULL, &psi_io_fops);
> proc_create("pressure/memory", 0, NULL, &psi_memory_fops);
> proc_create("pressure/cpu", 0, NULL, &psi_cpu_fops);
>
> IOW there are no additional capability checks performed on the PSI
> trigger users.
>
> > I'm building a userspace daemon for desktop users which notifies based on
> > pressure events, and it's particularly janky to ask people to run such a
> > notifier as root: the notification mechanism is usually tied to the user's
> > display server auth, and the surrounding environment is generally pretty
> > important to maintain. In addition to this, just in general this doesn't feel
> > like the kind of feature that by its nature needs to be restricted to root --
> > it seems reasonable that there would be unprivileged users which want to use
> > this, and that not using RT threads would be acceptable in that scenario.
>
> For these cases you can provide a userspace privileged daemon that
> will relay pressure notifications to its unprivileged clients. This is
> what we do on Android - Android Management Server registers its PSI
> triggers and then relays low memory notifications to unprivileged
> apps.
> Another approach is taken by Android Low Memory Killer Daemon (lmkd)
> which is an unprivileged process but registers its PSI triggers. The
> trick is that the init process executes "chmod 0664
> /proc/pressure/memory" from its init script and further restrictions
> are enforced by selinux policy granting only LMKD write access to this
> file.
>
> Would any of these options work for you?
>
> > Have you considered making the per-cgroup RT threads optional? If the
> > processing isn't done in the FIFO kthread for unprivileged users, I think it
> > should be safe to allow them to write to pressure files (perhaps with some
> > additional limits or restrictions on things like the interval, as needed).
>
> I didn't consider that as I viewed memory condition tracking that
> consumes kernel resources as being potentially exploitable. RT threads
> did make that more of an issue but even without them I'm not sure we
> should allow unprivileged processes to create unlimited numbers of
> triggers each of which is not really free.

Thinking some more about this: LMKD in the use case mentioned above is
not a privileged process, but it is granted access to PSI triggers by a
privileged init process plus sepolicy, and it needs RT threads to react
to memory pressure promptly without being preempted. If we allow only
privileged users to have RT threads for PSI triggers, that restriction
would break this scenario, because LMKD would no longer be able to use
RT threads.
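
For illustration, here is a rough userspace sketch of how a process that
has been given write access to a pressure file (as LMKD is in the setup
above) could register a trigger and wait for events. It is not the LMKD
code. The helper names (psi_format_trigger, psi_register_trigger,
psi_wait_event) are made up for this example, and the threshold and
window values are left to the caller.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: build a trigger spec of the form
 * "<some|full> <stall_us> <window_us>".
 * Returns the string length, or -1 if it does not fit in buf.
 */
static int psi_format_trigger(char *buf, size_t size, const char *kind,
                              unsigned int stall_us, unsigned int window_us)
{
    int n = snprintf(buf, size, "%s %u %u", kind, stall_us, window_us);

    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/*
 * Hypothetical helper: open a pressure file and register a trigger on it.
 * Returns an fd to poll on, or -1 on failure (for example when the caller
 * has no write permission on the file).
 */
static int psi_register_trigger(const char *path, const char *kind,
                                unsigned int stall_us, unsigned int window_us)
{
    char spec[64];
    int len = psi_format_trigger(spec, sizeof(spec), kind, stall_us, window_us);
    int fd;

    if (len < 0)
        return -1;

    fd = open(path, O_RDWR | O_NONBLOCK);
    if (fd < 0)
        return -1;

    /* Write the spec including its terminating NUL byte. */
    if (write(fd, spec, (size_t)len + 1) != (ssize_t)len + 1) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: wait for one trigger event.
 * Returns 1 on an event, 0 on timeout, -1 on error.
 */
static int psi_wait_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI };
    int n = poll(&pfd, 1, timeout_ms);

    if (n <= 0)
        return n;
    if (pfd.revents & POLLERR)
        return -1;
    return (pfd.revents & POLLPRI) ? 1 : 0;
}
```

A caller would do something like
psi_register_trigger("/proc/pressure/memory", "some", 150000, 1000000)
and then loop on psi_wait_event(). The trigger stays registered for as
long as the returned fd is open.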

>
> >
> > Thanks!
> >
> > Chris
>
> Thanks,
> Suren.