Re: [kernel-hardening] Re: Add overflow protection to kref

From: Kees Cook
Date: Fri Feb 24 2012 - 15:04:12 EST


On Fri, Feb 24, 2012 at 11:41 AM, Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Feb 24, 2012 at 10:58:25PM +0400, Vasiliy Kulikov wrote:
>> On Fri, Feb 24, 2012 at 10:37 -0800, Greg KH wrote:
>> > On Fri, Feb 24, 2012 at 12:58:35PM -0500, David Windsor wrote:
>> > >  static inline void kref_get(struct kref *kref)
>> > >  {
>> > > +   int rc = 0;
>> > >     WARN_ON(!atomic_read(&kref->refcount));
>> > > -   atomic_inc(&kref->refcount);
>> > > +   smp_mb__before_atomic_inc();
>> > > +   rc = atomic_add_unless(&kref->refcount, 1, INT_MAX);
>> > > +   smp_mb__after_atomic_inc();
>> > > +   BUG_ON(!rc);
>> >
>> > So you are guaranteeing to crash a machine here if this fails?  And you
>> > were trying to say this is a "security" based fix?
>> >
>> > And people wonder why I no longer have any hair...
>>
>> If a refcounter overflows there is NO WAY to recover.  The choise is to BUG()
>> and not allow any security harm to the system (privilege escalation, etc.)
>> or to try to do some more CPU cycles until actual use after free, privilege
>> escalation, etc.  The former is a _guarantee_ that nothing bad (in security
>> sense) doesn't happen.  The latter is an opportunistic approach, which
>> doesn't work with security.
>
> The only way you could legimitaly get a real use-after-free problem if
> you were overflowing the reference counter and pegged it at the max
> value, was if you had code that could decrement the reference count as
> many times as you incremented it.  So far, all bugs we've seen are
> one-off where on an error path, we forgot to decrement the count.  So
> how could the decrement ever happen?

Based on what I've seen, the "normal" exploit follows this pattern:

user1: alloc(), inc
user2: inc
user2: fail to dec
*repeat user2's actions until wrap*
user3: inc
user3: dec, free()
user1: operate on freed memory zomg

We could avoid the BUG by locking the counter to INT_MAX if it ever
reaches it (i.e. the put path would need to be modified too), and then
a WARN could be added to the get. Is that what was being suggested as
the alternative to the BUG patch?

>> Do you claim that a refcounter overflow is a recoverable state?  I'd want to
>> know what you can do with it.
>
> I'm not saying it is a "recoverable" state, but to crash the machine is
> not acceptable.  At the very least, let the user know something went
> wrong, and stick around long enough to let them know and do something,
> before shutting the thing down.
>
> But before people start micro-engineering this whole thing, remember,
> I'm still not sold on this type of change at all.
>
> greg k-h
>
> p.s. Has anyone ever tried an endless open() loop on a sysfs file to see
>     what happens today?...

AIUI, you'd hit nrfile well before wrapping the counter. To test this,
we'd need to simulate a failed decrement.

-Kees

--
Kees Cook
ChromeOS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/