Re: [GIT PULL] ucounts: Count rlimits in each user namespace

From: Eric W. Biederman
Date: Thu Jul 01 2021 - 16:06:00 EST


Alexey Gladkov <legion@xxxxxxxxxx> writes:

> On Tue, Jun 29, 2021 at 12:09:01PM -0500, Eric W. Biederman wrote:
>> ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes:
>>
>> > Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
>> >
>> >> On Tue, Jun 29, 2021 at 8:52 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>> >>>
>> >>> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
>> >>>
>> >>> > Why the "sigpending < LONG_MAX" test in that
>> >>> >
>> >>> > if (override_rlimit || (sigpending < LONG_MAX && sigpending <=
>> >>> > task_rlimit(t, RLIMIT_SIGPENDING))) {
>> >>> > thing?
>> >>>
>> >>> On second look that sigpending < LONG_MAX check is necessary. When
>> >>> inc_rlimit_ucounts detects a problem it returns LONG_MAX.
>> >>
>> >> I saw that, but _without_ that test you'd be left with just that
>> >>
>> >> sigpending <= task_rlimit(t, RLIMIT_SIGPENDING)
>> >>
>> >> and if task_rlimit() is LONG_MAX, then that means "no limits", so it is all ok.
>> >
>> > It means no limits locally. The creator of your user namespace might
>> > have had a limit which you are also bound by.
>> >
>> > The other possibility is that inc_rlimits_ucounts caused a sigpending
>> > counter to overflow. In which case we need to fail and run
>> > dec_rlimit_ucounts to keep the counter from staying overflowed.
>> >
>> > So I don't see a clever way to avoid the sigpending < LONG_MAX test.
>>
>> Hmm. I take that back. There is a simple clever way to satisfy all of
>> the tests.
>>
>> - sigpending < LONG_MAX && sigpending <= task_rlimit(t, RLIMIT_SIGPENDING)
>> + sigpending < task_rlimit(t, RLIMIT_SIGPENDING)
>>
>> That would just need a small comment to explain the subtleties.
>
> Is it because user.sigpending was atomic_t before this patch ?

Apologies, I was wrong.

The replacement of "<=" with "<" is correct for the case where
"task_rlimit(t, RLIMIT_SIGPENDING) == LONG_MAX".

Unfortunately it is off by one for all other values of
"task_rlimit(t, RLIMIT_SIGPENDING)".

It completely breaks things for the case where RLIMIT_SIGPENDING == 1:
with "<", no signals would be allowed to be queued. Today, allowing 1
queued signal works reliably with a single task and a sender that does
not send a second signal until the first is consumed.
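
To make the off by one concrete, here is a minimal userspace sketch, not
the kernel code. It assumes "sigpending" is the count after the increment
(or LONG_MAX on failure, as inc_rlimit_ucounts is described above). The
helper names allow_current, allow_proposed and check_off_by_one are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Userspace sketch only. "sigpending" is assumed to be the value after
 * the increment, or LONG_MAX when the increment failed. "rlimit" stands
 * for task_rlimit(t, RLIMIT_SIGPENDING). All names are hypothetical.
 */

/* The existing test: explicit LONG_MAX check plus "<=". */
static bool allow_current(long sigpending, long rlimit)
{
	return sigpending < LONG_MAX && sigpending <= rlimit;
}

/* The proposed simplification: a single "<". */
static bool allow_proposed(long sigpending, long rlimit)
{
	return sigpending < rlimit;
}

static void check_off_by_one(void)
{
	/* rlimit == LONG_MAX: both forms agree, including on failure. */
	assert(allow_current(1, LONG_MAX) == allow_proposed(1, LONG_MAX));
	assert(!allow_current(LONG_MAX, LONG_MAX));
	assert(!allow_proposed(LONG_MAX, LONG_MAX));

	/* rlimit == 1: the first queued signal has sigpending == 1. */
	assert(allow_current(1, 1));   /* "<=" lets it through */
	assert(!allow_proposed(1, 1)); /* "<" rejects it */
}
```

Under these assumptions the two forms only diverge when sigpending equals
a finite rlimit, which is exactly the last signal the limit should permit.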

That was just a brain fart on my part.

Eric