Re: [PATCH 2/4 v0.5] sched/umcg: RFC: add userspace atomic helpers
From: Peter Oskolkov
Date: Thu Sep 09 2021 - 15:07:16 EST
On Wed, Sep 8, 2021 at 4:39 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
Thanks a lot for the reviews, Jann!
I understand how to address most of your comments. However, one issue
I'm not sure what to do about:
[...]
> If this function is not allowed to sleep, as the comment says...
[...]
> ... then I'm pretty sure you can't call fix_pagefault() here, which
> acquires the mmap semaphore (which may involve sleeping) and then goes
> through the pagefault handling path (which can also sleep for various
> reasons, like allocating memory for pagetables, loading pages from
> disk / NFS / FUSE, and so on).
<quote from peterz@ from
https://lore.kernel.org/lkml/20210609125435.GA68187@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/>:
So a PF_UMCG_WORKER would be added to sched_submit_work()'s PF_*_WORKER
path to capture these tasks blocking. The umcg_sleeping() hook added
there would:
put_user(BLOCKED, umcg_task->umcg_status);
...
</quote>
Which is basically what I am doing here: in sched_submit_work() I need
to read/write to userspace; and we cannot sleep in
sched_submit_work(), I believe.
If you are right that it is impossible to deal with pagefaults from
within non-sleepable contexts, I see two options:
Option 1: as you suggest, pin pages holding struct umcg_task in sys_umcg_ctl;
or
Option 2: add more umcg-related kernel state to task_struct so that
reading/writing to userspace is not necessary in sched_submit_work().
The first option sounds much better from the code simplicity point of
view, but I'm not sure if it is a viable approach, i.e. I'm afraid
we'll get a hard NACK here, as a non-privileged process will be able
to force the kernel to pin a page per task/thread. We may get around
it by first pinning a limited number of pages, then having the
userspace allocate structs umcg_task on those pages, so that a pinned
page would cover more than a single task/thread. And have a sysctl
that limits the number of pinned pages per MM.
Peter Z., could you, please, comment here? Do you think pinning pages
to hold structs umcg_task is acceptable?
[...]