Re: linux 5.14.3: free_user_ns causes NULL pointer dereference

From: Eric W. Biederman
Date: Fri Sep 17 2021 - 12:17:01 EST


Yu Zhao <yuzhao@xxxxxxxxxx> writes:

> On Wed, Sep 15, 2021 at 4:42 PM Jordan Glover
> <Golden_Miller83@xxxxxxxxxxxxx> wrote:
>>
>> On Wednesday, September 15th, 2021 at 9:02 PM, <ebiederm@xxxxxxxxxxxx> wrote:
>>
>> > Jordan Glover <Golden_Miller83@xxxxxxxxxxxxx> writes:
>> >
>> > > Hi, I recently hit a system freeze after closing a few containerized apps on my system. So far it has occurred only once, on linux 5.14.3. I think it may be related to the "Count rlimits in each user namespace" patchset merged during the 5.14 window:
>> > >
>> > > https://lore.kernel.org/all/257aa5fb1a7d81cf0f4c34f39ada2320c4284771.1619094428.git.legion@xxxxxxxxxx/T/#u
>> >
>> > So that warning comes from:
>> >
>> > void dec_ucount(struct ucounts *ucounts, enum ucount_type type)
>> > {
>> > 	struct ucounts *iter;
>> > 	for (iter = ucounts; iter; iter = iter->ns->ucounts) {
>> > 		long dec = atomic_long_dec_if_positive(&iter->ucount[type]);
>> > 		WARN_ON_ONCE(dec < 0);
>> > 	}
>> > 	put_ucounts(ucounts);
>> > }
>> >
>> > Which certainly looks like a reference count bug. It could also be a
>> > memory stomp somewhere close.
>> >
>> > Do you have any idea what else was going on? This location is the
>> > symptom but not the actual cause.
>> >
>> > Eric
>>
>> I had about 2 containerized (flatpak/bubblewrap) apps (browser + music player) running. I quickly closed them intending to shut down the system, but instead got the freeze and had to use magic sysrq to reboot. The system logs end with what I posted, and there is nothing suspicious before that.
>>
>> Maybe it's some random fluke. I'll reply if I hit it again.
>
> I have been able to steadily reproduce this for a while. But I haven't
> had time to look into it. I'd appreciate any help.

It would be very helpful if you could look farther back in your logs and
see if you can also see:
WARNING: CPU: 1 PID: 351 at kernel/ucount.c:253 dec_ucount+0x43/0x5

Or anything else preceding the use-after-free.

I am inclined to think they are the same issue but without seeing the
WARN_ON_ONCE I can't safely conclude that.
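
To illustrate why that warning would be a useful clue, here is a rough
userspace sketch of the dec_ucount() walk quoted above. It is not kernel
code: the model_* names and demo_unbalanced_dec() are made up for
illustration, the parent pointer stands in for iter->ns->ucounts, and the
helper only approximates atomic_long_dec_if_positive() with C11 atomics.
Assuming the counts were balanced up to that point, one extra decrement
makes every level of the chain report a negative result, which is the
condition WARN_ON_ONCE(dec < 0) checks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for struct ucounts; simplified and hypothetical,
 * not the real definition from the kernel. */
struct model_ucounts {
	struct model_ucounts *parent;	/* plays the role of iter->ns->ucounts */
	atomic_long count;		/* plays the role of iter->ucount[type] */
};

/* Approximates atomic_long_dec_if_positive(): decrement only when the
 * result stays >= 0, and return the old value minus one either way. */
static long model_dec_if_positive(atomic_long *v)
{
	long old = atomic_load(v);

	while (old > 0 && !atomic_compare_exchange_weak(v, &old, old - 1))
		;	/* a failed compare-exchange reloads old */
	return old - 1;
}

/* Walks the parent chain like dec_ucount() and returns how many levels
 * would have tripped WARN_ON_ONCE(dec < 0). */
static int model_dec_ucount(struct model_ucounts *ucounts)
{
	int warnings = 0;

	for (struct model_ucounts *iter = ucounts; iter; iter = iter->parent) {
		if (model_dec_if_positive(&iter->count) < 0)
			warnings++;
	}
	return warnings;
}

/* One increment per level, then two decrements: the second is unbalanced. */
static int demo_unbalanced_dec(void)
{
	struct model_ucounts root = { .parent = NULL };
	struct model_ucounts child = { .parent = &root };

	atomic_init(&root.count, 1);
	atomic_init(&child.count, 1);

	int first = model_dec_ucount(&child);	/* balanced: 1 -> 0 at both levels */
	int second = model_dec_ucount(&child);	/* unbalanced: both already 0 */

	assert(first == 0);
	return second;
}
```

Calling demo_unbalanced_dec() from a small main() and printing the result
shows the second, unbalanced call flagging both levels. In the kernel the
warning fires only once, so it would mark the first unbalanced decrement
rather than every later one.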

Eric