Re: [PATCH v2 04/11] drivers/base/devcoredump: convert devcd_count to counter_atomic32

From: Shuah Khan
Date: Wed Oct 07 2020 - 15:59:15 EST


On 10/7/20 1:38 PM, Johannes Berg wrote:
On Wed, 2020-10-07 at 13:33 -0600, Shuah Khan wrote:
On 10/7/20 12:15 PM, Kees Cook wrote:
On Tue, Oct 06, 2020 at 02:44:35PM -0600, Shuah Khan wrote:
counter_atomic* is introduced to be used when a variable is used as
a simple counter and doesn't guard object lifetimes. This clearly
differentiates atomic_t usages that guard object lifetimes.

counter_atomic* variables will wrap around to 0 when they overflow and
should not be used to guard resource lifetimes, device usage and
open counts that control state changes, and pm states.

devcd_count is used to track the dev_coredumpm device count and is used in
the device name string. It doesn't guard object lifetimes, device usage
counts, device open counts, or pm states. There is very little chance
of this counter overflowing. Convert it to use counter_atomic32.

This conversion doesn't change the overflow wrap around behavior.
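
For illustration only, a minimal user-space sketch of the behavior described
above: a counter that is incremented, used to build a "devcd<N>" style name,
and allowed to wrap. It uses C11 atomics rather than the kernel API; the names
devcd_count_model, model_inc_return and model_name are hypothetical, and the
unsigned 32-bit type is an assumption chosen so that the wrap lands on 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel counter; not the kernel type. */
static _Atomic uint32_t devcd_count_model;

/* Increment and return the new value; unsigned arithmetic wraps modulo 2^32. */
static uint32_t model_inc_return(void)
{
	return atomic_fetch_add(&devcd_count_model, 1u) + 1u;
}

/* Format "devcd<N>" into buf using the next counter value. */
static int model_name(char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "devcd%u", (unsigned int)model_inc_return());
}
```

Nothing here guards a lifetime: the value is only an identifier, so wrapping
changes which name comes next and nothing else.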

Reviewed-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx>

I actually wonder if this should use refcount_t just because it is
designed to be an always-unique value. It is hard to imagine ever causing
this to overflow, but why not let it be protected?


This is one of the cases where devcd_count doesn't guard lifetimes,
however if it ever overflows, refcount_t is a better choice.

If we decide refcount_t is a better choice, I can drop this patch
and send refcount_t conversion patch instead.

Greg! Any thoughts on refcount_t for this being a better choice?

I'm not Greg, but ... there's a 5 minute timeout. So in order to cause a
clash you'd have to manage to overflow the counter within a 5 minute
interval, otherwise you can actually reuse the numbers starting again
from 0 without any ill effect.

And even if you *do* manage to overflow it quickly enough it'll just
fail device_add() and error out, and nothing happens.
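
The argument above can be sketched as a toy model (plain C, not kernel code).
Assumptions: an ID space of only 8 values so the wrap is easy to reach, time
passed in as seconds, and a boolean return standing in for device_add()
failing on a name clash. MODEL_SLOTS, expiry, next_id and model_add_dump are
all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_TIMEOUT_S 300u	/* the 5 minute timeout */
#define MODEL_SLOTS     8u	/* hypothetical tiny ID space */

/* expiry[id] is the time until which that ID is still in use; 0 means free. */
static uint64_t expiry[MODEL_SLOTS];
static uint32_t next_id;

/* Try to register a dump at time 'now'; false models the add failing. */
static bool model_add_dump(uint64_t now)
{
	uint32_t id = next_id++ % MODEL_SLOTS;	/* wraps back to 0 */

	if (expiry[id] > now)
		return false;	/* same name still exists: just error out */
	expiry[id] = now + MODEL_TIMEOUT_S;
	return true;
}
```

In this model a clash needs the whole ID space to be consumed inside one
timeout window; once the window has passed, reused numbers are accepted again.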

So I think it's fairly much pointless to think about protecting against
some kind of overflows. It's just trying to get a "temporarily unique
ID" here, could be doing anything else instead, but most other things
would require bigger data structures and/or (higher level) locking.

OTOH, if you *do* somehow create that many core dumps (huge uptimes and
extremely frequent crashes?) it seems like refcount_t would be a bad
choice because it saturates, and then you can only do one more dump per
5 minutes? Or maybe that's a good thing in these ill cases ...
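
To make the wrap versus saturate difference concrete, a simplified sketch
follows. It is not the kernel's refcount_t implementation: MODEL_MAX,
wrap_inc and sat_inc are hypothetical, and the saturation point is an
assumption picked for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX UINT32_MAX	/* assumed saturation point for this sketch */

/* Wrapping increment: the value after the maximum is 0. */
static uint32_t wrap_inc(uint32_t v)
{
	return v + 1u;
}

/* Saturating increment: once at the maximum, the value never moves again. */
static uint32_t sat_inc(uint32_t v)
{
	return v == MODEL_MAX ? MODEL_MAX : v + 1u;
}

/* Compare both behaviors at the limit. */
static void compare_at_limit(void)
{
	uint32_t w = wrap_inc(MODEL_MAX);
	uint32_t s = sat_inc(MODEL_MAX);

	assert(w == 0u);		/* fresh IDs keep coming after the wrap */
	assert(s == MODEL_MAX);		/* stuck at one value */
	assert(sat_inc(s) == s);	/* every later ID is the same */
}
```

With the saturating variant every dump after the limit gets the same ID, which
is why only one could exist per timeout window.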

I don't think it'll really happen either way :)


I didn't think this could overflow and if it does we might have other
problems.

Thank you for taking the time for this detailed analysis. This clarifies the
"very little chance of this counter overflowing and no ill effects".

thanks,
-- Shuah