Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t

From: Eric W. Biederman
Date: Mon Jul 10 2017 - 07:34:13 EST


"Reshetova, Elena" <elena.reshetova@xxxxxxxxx> writes:

>> "Reshetova, Elena" <elena.reshetova@xxxxxxxxx> writes:
>>
>> 2>> Elena Reshetova <elena.reshetova@xxxxxxxxx> writes:
>> >>
>> >> > refcount_t type and corresponding API should be
>> >> > used instead of atomic_t when the variable is used as
>> >> > a reference counter. This allows to avoid accidental
>> >> > refcounter overflows that might lead to use-after-free
>> >> > situations.
>> >>
>> >> In this patch you can see all of the uses of the count.
>> >> What accidental refcount overflows are possible?
>> >
>> > Even if one can guarantee and prove that in the current implementation
>> > there are no overflows possible, we can't say that for
>> > sure for any future implementation. Bugs might always happen
>> > unfortunately, but if we convert the refcounter to a safer
>> > type we can be sure that overflows are not possible.
>> >
>> > Does it make sense to you?
>>
>> Not for code that is likely to remain unchanged for a decade no.
>
> Can we really be sure for any kernel code about this? And does it make
> sense to trust our security on a fact like this?

But refcount_t doesn't fix anything. At best it changes a bad bug to a
less bad bug. So now my machine OOMS instead of allows a memory
overwrite. It still doesn't work.

Plus refcount_t does not provide any safety on the architectures where
it is a noop.

>> This looks like a large set of unautomated changes without any real
>> thought put into it.
>
> We are soon into the end of the first year that we started to look into
> refcounter overflow/underflow problem and coming up this far was
> not easy enough (just check all the millions of emails on kernel-hardening
> mailing list). Each refcount_t conversion candidate was first found by Coccinelle
> analysis and then manually checked and converted. The story of
> refcount_t API and all discussions go even further.
> So you can't really claim that there is no " thought put into it " :)

But the conversion of the instance happens without thought and manually.
Which is a good recipe for typos. Which is what I am saying.

There have been lots of conversions like that in the kernel and
practically every one has introduced at least one typo.

So from an engineering standpoint it is a very valid question to ask
about. And I find the apparent insistence that you don't make typos
very disturbing.

> That almost always results in a typo somewhere
>> that breaks things.
>>
>> So there is no benefit to the code, and a non-zero chance that there
>> will be a typo breaking the code.
>
> The code is very active on issuing WARNs when anything goes wrong.
> Using this feature we have not only found errors in conversions, but
> sometimes errors in code itself. So, any bug would be actually much
> faster visible than using old atomic_t interface.
>
> In addition by default refcount_t equals to atomic, which also gives a
> possibility to make a softer transition and catch all related bugs in couple
> of cycles when enabling CONFIG_REFCOUNT_FULL.

But if you make a typo and change one operation for another I don't see
how any of that applies.

And that is what it looks like I we are looking at here.

Eric