Re: [PATCH] devcoredump : Serialize devcd_del work

From: Johannes Berg
Date: Fri Apr 22 2022 - 09:53:42 EST


On Fri, 2022-04-22 at 15:41 +0200, Johannes Berg wrote:
> On Tue, 2022-04-19 at 15:57 +0530, Mukesh Ojha wrote:
> > In the following scenario (see diagram), thread X running dev_coredumpm()
> > adds the devcd device to the framework, which sends a uevent notification
> > to userspace. Another thread Y reads this uevent and calls
> > devcd_data_write(), which eventually tries to delete the queued timer
> > before it has been initialized or queued.
> >
> > Debug objects therefore reports a warning. In the meantime, the timer is
> > initialized and queued from the X path; from the Y path it then gets
> > reinitialized, leaving timer->entry.pprev=NULL, and try_to_grab_pending()
> > gets stuck.
> >
> > To fix this, introduce a mutex to serialize the behaviour.
> >
> >         cpu0(X)                                 cpu1(Y)
> >
> >     dev_coredump() uevent sent to userspace
> >     device_add()  =========================> userspace process Y reads the uevents
> >                                              writes to devcd fd which
> >                                              results into writes to
> >
> >                                             devcd_data_write()
> >                                               mod_delayed_work()
> >                                                 try_to_grab_pending()
> >                                                   del_timer()
> >                                                     debug_assert_init()
> >     INIT_DELAYED_WORK
> >     schedule_delayed_work
> >
>
> Wouldn't it be easier to simply schedule this before adding it to sysfs
> and sending the uevent?
>

Hm. I think that would solve this problem, but not all of the problems
here ...

Even with your change, I believe it's still racy wrt. disabled_store(),
since that flushes the work but devcd_data_write() remains reachable
(and might in fact be waiting for the mutex after your change), so I
think we need an additional flag somewhere (in addition to the mutex) to
serialize all of these things against each other.
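
To make the "mutex plus flag" idea concrete, here is a rough userspace
sketch using pthreads. It is only a model, not kernel code and not a
proposed patch: struct devcd_model, its fields and the three helpers are
all hypothetical names made up for illustration. The point is just that
any caller that gets the lock late (for example one that was waiting on
the mutex) sees the flag and leaves the work alone.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the devcd entry; not the kernel struct. */
struct devcd_model {
	pthread_mutex_t lock;   /* serializes the setup, write and disable paths */
	bool work_armed;        /* delayed work has been initialized and queued */
	bool delete_requested;  /* the additional flag: deletion already requested */
	int delete_count;       /* how often deletion was actually triggered */
};

static void devcd_model_init(struct devcd_model *d)
{
	pthread_mutex_init(&d->lock, NULL);
	d->work_armed = false;
	d->delete_requested = false;
	d->delete_count = 0;
}

/* Models the X path: the work is set up under the lock. */
static void devcd_model_arm(struct devcd_model *d)
{
	pthread_mutex_lock(&d->lock);
	d->work_armed = true;
	pthread_mutex_unlock(&d->lock);
}

/*
 * Models a devcd_data_write()-like or disabled_store()-like caller.
 * Returns true only for the one caller that actually triggers deletion.
 * Simplification of this model: a request arriving before the work is
 * armed is just refused here.
 */
static bool devcd_model_request_delete(struct devcd_model *d)
{
	bool triggered = false;

	pthread_mutex_lock(&d->lock);
	if (d->work_armed && !d->delete_requested) {
		d->delete_requested = true;
		d->delete_count++;
		triggered = true;
	}
	pthread_mutex_unlock(&d->lock);

	return triggered;
}
```

With this shape, calling devcd_model_request_delete() from several paths
triggers the deletion at most once, and never before devcd_model_arm()
has run. How that would map onto the real delayed work is left open here.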

johannes