Re: [PATCH v2 04/11] drivers/base/devcoredump: convert devcd_count to counter_atomic32
From: Johannes Berg
Date: Wed Oct 07 2020 - 15:39:01 EST
On Wed, 2020-10-07 at 13:33 -0600, Shuah Khan wrote:
> On 10/7/20 12:15 PM, Kees Cook wrote:
> > On Tue, Oct 06, 2020 at 02:44:35PM -0600, Shuah Khan wrote:
> > > counter_atomic* is introduced to be used when a variable is used as
> > > a simple counter and doesn't guard object lifetimes. This clearly
> > > differentiates atomic_t usages that guard object lifetimes.
> > >
> > > counter_atomic* variables will wrap around to 0 when they overflow and
> > > should not be used to guard resource lifetimes, device usage and
> > > open counts that control state changes, and pm states.
> > >
> > > devcd_count is used to track dev_coredumpm device count and used in
> > > device name string. It doesn't guard object lifetimes, device usage
> > > counts, device open counts, and pm states. There is very little chance
> > > of this counter overflowing. Convert it to use counter_atomic32.
> > >
> > > This conversion doesn't change the overflow wrap around behavior.
> > >
> > > Reviewed-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > Signed-off-by: Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx>
> >
> > I actually wonder if this should use refcount_t just because it is
> > designed to be an always-unique value. It is hard to imagine ever causing
> > this to overflow, but why not let it be protected?
> >
>
> This is one of the cases where devcd_count doesn't guard lifetimes,
> however if it ever overflows, refcount_t is a better choice.
>
> If we decide refcount_t is a better choice, I can drop this patch
> and send refcount_t conversion patch instead.
>
> Greg! Any thoughts on refcount_t for this being a better choice?

I'm not Greg, but ... there's a 5 minute timeout. So in order to cause a
clash you'd have to manage to overflow the counter within a 5 minute
interval; otherwise you can actually reuse the numbers starting again
from 0 without any ill effect.

And even if you *do* manage to overflow it quickly enough, device_add()
will just fail and error out, and nothing bad happens.
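
To make that concrete, here is a rough user-space model, not the driver
code. Everything in it is made up for illustration: model_add(),
expires_at[], TIMEOUT_SECS, and the 8-bit counter (chosen only so the wrap
is easy to reach). It assumes the ID is the post-increment counter value and
that an ID stays taken for the 5 minute timeout; the "-1" return stands in
for device_add() failing on a duplicate name.

```c
#include <assert.h>
#include <stdint.h>

#define TIMEOUT_SECS 300	/* models the 5 minute timeout */
#define SLOTS 256		/* one slot per possible 8-bit ID */

static uint8_t counter;		/* hypothetical 8-bit stand-in, wraps 255 -> 0 */
static long expires_at[SLOTS];	/* time at which each ID becomes free again */

/*
 * Take the next ID at time 'now' (in seconds).
 * Returns the ID, or -1 if that ID is still live, which models
 * device_add() failing because the name already exists.
 */
static int model_add(long now)
{
	uint8_t id = ++counter;	/* unsigned 8-bit increment wraps to 0 */

	assert(now >= 0);
	if (expires_at[id] > now)
		return -1;	/* clash: wrapped within the timeout window */

	expires_at[id] = now + TIMEOUT_SECS;
	return id;
}
```

In this model a clash only shows up if all 256 IDs are consumed inside one
timeout window, and even then the caller just gets an error back. Once the
window has passed, the reused numbers are fine again.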
So I think it's pretty much pointless to think about protecting against
overflow here. The code is just trying to get a "temporarily unique
ID"; it could be doing anything else instead, but most other things
would require bigger data structures and/or (higher level) locking.

OTOH, if you *do* somehow create that many core dumps (huge uptimes and
extremely frequent crashes?) it seems like refcount_t would be a bad
choice because it saturates, and then you can only do one more dump per
5 minutes? Or maybe that's a good thing in these pathological cases ...
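
The saturation case can be modelled the same way. Again this is only a
user-space sketch with invented names (sat_inc_return(), sat_model_add(),
an 8-bit SAT_MAX), and it is not refcount_t's real API or its real
saturation value. The only assumption carried over is that a saturated
counter stops changing, so every later dump asks for the same ID.

```c
#include <assert.h>
#include <stdint.h>

#define TIMEOUT_SECS 300	/* models the 5 minute timeout */
#define SAT_MAX UINT8_MAX	/* hypothetical saturation point */

static uint8_t sat_counter;
static long sat_expires_at[SAT_MAX + 1];	/* when each ID is free again */

/* Increment, but stick at SAT_MAX instead of wrapping. */
static uint8_t sat_inc_return(void)
{
	if (sat_counter < SAT_MAX)
		sat_counter++;
	return sat_counter;
}

/*
 * Take the next ID at time 'now' (in seconds).
 * Returns the ID, or -1 if it is still live (models device_add() failing).
 */
static int sat_model_add(long now)
{
	uint8_t id = sat_inc_return();

	assert(now >= 0);
	if (sat_expires_at[id] > now)
		return -1;	/* same ID still in use */

	sat_expires_at[id] = now + TIMEOUT_SECS;
	return id;
}
```

Once the model's counter is stuck at SAT_MAX, exactly one add succeeds per
timeout window and every other attempt fails, which is the "one more dump
per 5 minutes" behaviour.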
I don't think it'll really happen either way :)

johannes