Re: Possible race in dev_coredumpm()-del_timer() path
From: Mukesh Ojha
Date: Wed Apr 13 2022 - 06:16:50 EST
On Wed, Apr 13, 2022 at 07:34:24AM +0200, Greg KH wrote:
> On Wed, Apr 13, 2022 at 10:59:22AM +0530, Mukesh Ojha wrote:
> > Hi All,
> >
> > We are hitting a race due to which try_to_grab_pending() gets stuck.
>
> What kernel version are you using?
5.10
Sorry for the formatting mess.
> > In the following scenario, while (p1) dev_coredumpm() is running, the
> > devcd device is added to the framework and a uevent notification is sent
> > to userspace. That results in a call to (p2) devcd_data_write(), which
> > eventually tries to delete the queued timer; in the racy scenario the
> > timer is not queued yet. So debug objects reports a warning, and in the
> > meantime the timer is initialized and queued from the p1 path. Then, from
> > the p2 path, it gets reinitialized with timer->entry.pprev=NULL, and
> > try_to_grab_pending() gets stuck.
        p1                              p2 (X)

dev_coredumpm()                     uevent sent to userspace
  device_add() ===================> userspace process X reads the uevent
                                    and writes to the devcd fd, which
                                    results in a call to

                                    devcd_data_write()
                                      mod_delayed_work()
                                        try_to_grab_pending()
                                          del_timer()
                                            debug_assert_init()
INIT_DELAYED_WORK()
schedule_delayed_work()
                                            debug_object_fixup()
                                              timer_fixup_assert_init()
                                                timer_setup()
                                                  do_init_timer()
                                                  ==> reinitializes the timer,
                                                      timer->entry.pprev=NULL
                                            timer_pending()
                                              !hlist_unhashed_lockless(&timer->entry)
                                                !h->pprev
                                                ==> del_timer() checks and
                                                    finds pprev to be NULL
                                        stuck in try_to_grab_pending()
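
To make the list-state part of the sequence above easier to follow, here is
a minimal userspace sketch. It is not kernel code: the model_* helpers and
replay_race() are hypothetical names made up for illustration, and they only
imitate the pprev handling that do_init_timer(), timer_pending() and
del_timer() are described as doing above. It replays the interleaving in a
single thread, so it shows the resulting state rather than the race itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's hlist_node. */
struct hlist_node {
	struct hlist_node *next;
	struct hlist_node **pprev;
};

/* Simplified stand-in for timer_list; only the list entry matters here. */
struct timer_model {
	struct hlist_node entry;
};

/* Imitates timer_pending(): "queued" means entry.pprev is non-NULL. */
static bool model_timer_pending(const struct timer_model *t)
{
	return t->entry.pprev != NULL;
}

/* Imitates do_init_timer(): marks the timer as not queued. */
static void model_init_timer(struct timer_model *t)
{
	t->entry.next = NULL;
	t->entry.pprev = NULL;
}

/* Imitates queueing the timer: link the entry at the head of a bucket. */
static void model_enqueue(struct timer_model *t, struct hlist_node **head)
{
	t->entry.next = *head;
	if (*head)
		(*head)->pprev = &t->entry.next;
	*head = &t->entry;
	t->entry.pprev = head;
}

/* Imitates del_timer(): returns 1 only if the timer looked pending. */
static int model_del_timer(struct timer_model *t)
{
	if (!model_timer_pending(t))
		return 0;
	*t->entry.pprev = t->entry.next;
	if (t->entry.next)
		t->entry.next->pprev = t->entry.pprev;
	t->entry.pprev = NULL;
	return 1;
}

/*
 * Replays the p1/p2 interleaving in one thread. Returns what the modelled
 * del_timer() returns in p2 and reports whether the bucket still points
 * at the timer afterwards.
 */
static int replay_race(bool *still_linked)
{
	struct hlist_node *bucket = NULL;
	struct timer_model timer;
	int ret;

	/* p1: INIT_DELAYED_WORK() followed by schedule_delayed_work() */
	model_init_timer(&timer);
	model_enqueue(&timer, &bucket);

	/* p2: debug_object_fixup() -> timer_setup() -> do_init_timer() */
	model_init_timer(&timer);

	/* p2: del_timer() now sees pprev == NULL and gives up */
	ret = model_del_timer(&timer);

	*still_linked = (bucket == &timer.entry);
	return ret;
}
```

In this model the second model_init_timer() wipes pprev while the bucket
still points at the entry, so model_del_timer() reports "not pending" for a
timer that is in fact linked. That matches the state in which
try_to_grab_pending() cannot make progress in the trace above.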
Thanks,
Mukesh
>
>
> > In the following scenario, while (p1) dev_coredumpm() is running, the
> > devcd device is added to the framework and a uevent notification is sent
> > to userspace. That results in a call to (p2) devcd_data_write(), which
> > eventually tries to delete the queued timer; in the racy scenario the
> > timer is not queued yet. So debug objects reports a warning, and in the
> > meantime the timer is initialized and queued from the p1 path. Then, from
> > the p2 path, it gets reinitialized with timer->entry.pprev=NULL, and
> > try_to_grab_pending() gets stuck, as del_timer() always returns 0
> > because timer_pending() returns false.
> >
> > P1 P2(X)
> >
> >
> > dev_coredumpm()
> >
> > Uevent notification sent to
> > userspace
> > for device addition
> >
> > device_add() ========================> Process X
> > reads this uevents
> > notification and do write call
> > that results in call to
> >
> > devcd_data_write()
> > mod_delayed_work()
> > try_to_grab_pending()
> > del_timer()
> > debug_assert_init()
> >
> > INIT_DELAYED_WORK
> > (&devcd->del_wk, devcd_del);
> > schedule_delayed_work(&devcd->del_wk,
> > DEVCD_TIMEOUT);
> >
> > debug_object_fixup()
> > timer_fixup_assert_init()
> > timer_setup()
> > do_init_timer() ==> reinitialized the timer to timer->entry.pprev=NULL
> >
> > timer_pending()
> > !hlist_unhashed_lockless(&timer->entry)
> > !h->pprev
>
> The above is confusing and not able to be understood due to the
> formatting mess. Care to fix this up and resend?
>
> thanks,
>
> greg k-h