Re: [PATCH v2] creds: Convert cred.usage to refcount_t
From: Jann Horn
Date: Fri Aug 18 2023 - 16:55:48 EST
On Fri, Aug 18, 2023 at 9:31 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, 18 Aug 2023 11:48:16 -0700 Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> > On Fri, Aug 18, 2023 at 08:17:55PM +0200, Jann Horn wrote:
> > > Though really we don't *just* need refcount_t to catch bugs; on a
> > > system with enough RAM you can also overflow many 32-bit refcounts by
> > > simply creating 2^32 actual references to an object. Depending on the
> > > structure of objects that hold such refcounts, that can start
> > > happening at around 2^32 * 8 bytes = 32 GiB memory usage, and it
> > > becomes increasingly practical to do this with more objects if you
> > > have significantly more RAM. I suppose you could avoid such issues by
> > > putting a hard limit of 32 GiB on the amount of slab memory and
> > > requiring that kernel object references are stored as pointers in slab
> > > memory, or by making all the refcounts 64-bit.
> >
> > These problems are a different issue, and yes, the path out of it would
> > be to crank the size of refcount_t, etc.
>
> Is it possible for such overflows to occur in the cred code? If so,
> that's a bug. Can we fix that cred bug without all this overhead?

Dunno, probably depends on how much RAM you have and how the system is
configured? Like, it should get pretty easy to hit if you have around
44 TB of RAM, since I think the kernel will let you create around 2^32
instances of "struct file" at that point, and each file holds a
reference to the creator's "struct cred". If RLIMIT_NOFILE and
/proc/sys/kernel/pid_max are high enough, you could probably store
2^32 files in file descriptor table entries, spread out over a few tens
of thousands of processes but all pointing to the same struct cred, and
trigger an overflow of a cred refcount that way. But I haven't tried
that and there might be some other limit that prevents this somewhere.
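
As a rough check on the numbers, here is a minimal sketch of the
arithmetic. It assumes the default fs.file-max works out to about one
struct file per 10 KiB of RAM; that ratio is an assumption of this
sketch, not something stated above, and the helper names are made up
for illustration. Under that assumption, 2^32 files correspond to about
40 TiB, which is roughly 44 TB.

```c
#include <stdint.h>

/*
 * Assumption (not from the mail above): the default file limit allows
 * roughly one struct file per 10 KiB of RAM.
 */
#define ASSUMED_RAM_BYTES_PER_FILE (10ULL * 1024)

/* Number of references needed to wrap a 32-bit counter back to zero. */
#define REFS_TO_WRAP_32BIT (1ULL << 32)

/* RAM, in bytes, at which the assumed file limit reaches 2^32 files. */
static inline uint64_t ram_bytes_for_wrap(void)
{
	return REFS_TO_WRAP_32BIT * ASSUMED_RAM_BYTES_PER_FILE;
}

/*
 * File descriptors each process has to hold if the 2^32 references are
 * spread evenly over nprocs processes (rounded up). nprocs must be > 0.
 */
static inline uint64_t files_per_process(uint64_t nprocs)
{
	return (REFS_TO_WRAP_32BIT + nprocs - 1) / nprocs;
}
```

With 32768 processes, files_per_process() gives 131072 descriptors per
process, so RLIMIT_NOFILE would have to be at least that high.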

If you have less RAM, you'd have to try harder to find some data
structure where the kernel doesn't impose limits on allocation that
are as strict as those for files. io_uring requests can carry
references to creds, and I think you can probably make them block
indefinitely through dependencies; I don't know how many io_uring
requests you could have in flight at a time. Eyeballing the io_uring
code, it looks like this might work at somewhere around 1 TB of slab
memory usage if there isn't some limit somewhere?

My point is that it's really hard to figure out how many references
you can have to an object that can have references from all over the
kernel unless there is a hard cap on the amount of memory in which
such references are stored or you're able to just refuse incrementing
the refcount when it gets too high. And so in my opinion it makes
sense to use a refcount type that is able to warn and (depending on
configuration) continue execution safely (except for leaking a little
bit of memory) even if it reaches its limit.
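
To illustrate the "warn and continue safely" behavior, here is a
minimal userspace sketch of a saturating reference counter using C11
atomics. It is a simplified model, not the kernel's refcount_t
implementation; struct sat_ref and its helpers are hypothetical names,
and pinning at UINT_MAX is a choice made for this sketch.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical saturation value for this sketch. */
#define SAT_REF_SATURATED UINT_MAX

struct sat_ref {
	atomic_uint count;
};

static void sat_ref_init(struct sat_ref *r, unsigned int n)
{
	atomic_init(&r->count, n);
}

/* Take a reference; pin the counter at the limit instead of wrapping. */
static void sat_ref_get(struct sat_ref *r)
{
	unsigned int old = atomic_load(&r->count);

	do {
		if (old == SAT_REF_SATURATED)
			return;	/* already pinned, object stays leaked */
	} while (!atomic_compare_exchange_weak(&r->count, &old, old + 1));

	if (old + 1 == SAT_REF_SATURATED)
		fprintf(stderr, "sat_ref: saturated, leaking object\n");
}

/* Drop a reference; returns true if the caller should free the object. */
static bool sat_ref_put(struct sat_ref *r)
{
	unsigned int old = atomic_load(&r->count);

	do {
		if (old == SAT_REF_SATURATED)
			return false;	/* pinned, never reaches zero */
		if (old == 0) {
			fprintf(stderr, "sat_ref: underflow\n");
			return false;
		}
	} while (!atomic_compare_exchange_weak(&r->count, &old, old - 1));

	return old == 1;
}
```

Once the counter hits the limit, every later sat_ref_put() returns
false, so the object is never freed: a small leak instead of a
use-after-free caused by the counter wrapping to zero.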