Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED

From: Linus Torvalds
Date: Mon Sep 11 2023 - 20:26:52 EST


On Mon, 11 Sept 2023 at 09:48, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> I wonder if we should make it a rule to not allow page faults when
> RESCHED_ALLOW is set?

I really think that user copies might actually be one of the prime targets.

Right now we special-case big user copies - see for example
copy_chunked_from_user().

But that's an example of exactly the problem this code has - we
literally make the code more complex - and objectively *WORSE* - just
to deal with "I want this to be interruptible".

So yes, we could limit RESCHED_ALLOW to not allow page faults, but big
user copies literally are one of the worst problems.
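
For illustration only, here is a minimal userspace sketch of the kind of
chunking described above. It is not the kernel's code: COPY_CHUNK,
fake_cond_resched() and chunked_copy() are hypothetical names, memcpy()
stands in for the real user copy, and the reschedule hook only counts
how often it is called.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define COPY_CHUNK 4096 /* hypothetical chunk size, not a kernel constant */

static unsigned long resched_points; /* number of simulated reschedule points */

/* Stand-in for the kernel's cond_resched(); here it only counts calls. */
static void fake_cond_resched(void)
{
	resched_points++;
}

/*
 * Copy len bytes in COPY_CHUNK pieces and offer to reschedule between
 * pieces. The loop exists only to create those reschedule points; a
 * single memcpy() would do the same work.
 */
static size_t chunked_copy(void *dst, const void *src, size_t len)
{
	size_t done = 0;

	assert(dst && src);
	while (done < len) {
		size_t n = len - done;

		if (n > COPY_CHUNK)
			n = COPY_CHUNK;
		memcpy((char *)dst + done, (const char *)src + done, n);
		done += n;
		if (done < len)
			fake_cond_resched();
	}
	return done;
}
```

The extra loop, the chunk-size constant and the explicit reschedule call
are all overhead that exists purely so the copy can be interrupted.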

Another example of this is just plain read/write. It's not a
problem in practice right now, because large pages are effectively
never used.

But just imagine what happens once filemap_read() actually does big folios?

Do you really want this code:

copied = copy_folio_to_iter(folio, offset, bytes, iter);

to forever use the artificial chunking it does now?

And yes, right now it will still do things in one-page chunks in
copy_page_to_iter(). It doesn't even have cond_resched() - it's
currently in the caller, in filemap_read().

But just think about possible futures.

Now, one option really is to do what I think PeterZ kind of alluded to
- start deprecating PREEMPT_VOLUNTARY and PREEMPT_NONE entirely.

Except we've actually been *adding* to this whole mess, rather than
removing it. So we have actively *expanded* on that preemption choice
with PREEMPT_DYNAMIC.

That's actually reasonably recent, implying that distros really want
to still have the option.

And it seems like it's actually server people who want the "no
preemption" (and presumably want to avoid all the preempt count stuff
entirely - it's not necessarily the *preemption* that is the cost, it's
the incessant preempt count updates).
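
As a rough illustration of that last point, here is a userspace sketch
in which every critical section pays for two counter updates even when
no preemption ever happens. All names (sim_preempt_disable(),
sim_preempt_enable(), run_critical_sections() and the counters) are
hypothetical stand-ins, and this only models the bookkeeping; it says
nothing about the real cost on any given CPU.

```c
#include <assert.h>

static int preempt_count_sim;       /* simulated preempt count */
static int need_resched_sim;        /* simulated "reschedule wanted" flag */
static unsigned long count_updates; /* how many times the count was touched */
static unsigned long preemptions;   /* how many times we actually "preempted" */

/* Enter a non-preemptible region: one counter update. */
static void sim_preempt_disable(void)
{
	preempt_count_sim++;
	count_updates++;
}

/* Leave the region: another update, plus a check for a pending reschedule. */
static void sim_preempt_enable(void)
{
	assert(preempt_count_sim > 0);
	preempt_count_sim--;
	count_updates++;
	if (preempt_count_sim == 0 && need_resched_sim) {
		need_resched_sim = 0;
		preemptions++;
	}
}

/* Run n empty critical sections with no reschedule request pending. */
static void run_critical_sections(unsigned long n)
{
	unsigned long i;

	for (i = 0; i < n; i++) {
		sim_preempt_disable();
		/* protected work would go here */
		sim_preempt_enable();
	}
}
```

With no reschedule pending, n critical sections still cost 2*n counter
updates and zero preemptions, which is the bookkeeping a build without
preempt counting would skip.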

Linus