Re: [BUG] kernel: rcu: a possible sleep-in-atomic-context bug in srcu_read_delay()

From: Paul E. McKenney
Date: Mon Aug 13 2018 - 08:42:40 EST


On Mon, Aug 13, 2018 at 05:26:49PM +0800, Jia-Ju Bai wrote:
>
>
> On 2018/8/13 12:18, Paul E. McKenney wrote:
> >On Mon, Aug 13, 2018 at 11:04:10AM +0800, Jia-Ju Bai wrote:
> >>The kernel may sleep with holding a spinlock.
> >>
> >>The function call paths (from bottom to top) in Linux-4.16 are:
> >>
> >>[FUNC] schedule_timeout_interruptible
> >>kernel/rcu/rcutorture.c, 523: schedule_timeout_interruptible in
> >>srcu_read_delay
> >>kernel/rcu/rcutorture.c, 1105: [FUNC_PTR]srcu_read_delay in
> >>rcu_torture_timer
> >>kernel/rcu/rcutorture.c, 1104: spin_lock in rcu_torture_timer
> >>
> >>Note that [FUNC_PTR] means a function pointer call is used.
> >>
> >>I do not find a good way to fix, so I only report.
> >>This is found by my static analysis tool (DSAC).
> >Interesting. I would have expected to have gotten a "scheduling while
> >atomic" error message, which I do not recall seeing. And I ran a great
> >deal of rcutorture on v4.16.
> >
> >So let's see... As you say, the rcu_torture_timer() function does in
> >fact acquire rand_lock in 4.16 and 4.17, in which case sleeping would
> >indeed be illegal. But let's take a look at srcu_read_delay():
> >
> >static void
> >srcu_read_delay(struct torture_random_state *rrsp, struct rt_read_seg *rtrsp)
> >{
> > long delay;
> > const long uspertick = 1000000 / HZ;
> > const long longdelay = 10;
> >
> > /* We want there to be long-running readers, but not all the time. */
> >
> > delay = torture_random(rrsp) %
> > (nrealreaders * 2 * longdelay * uspertick);
> > if (!delay && in_task()) {
> > schedule_timeout_interruptible(longdelay);
> > rtrsp->rt_delay_jiffies = longdelay;
> > } else {
> > rcu_read_delay(rrsp, rtrsp);
> > }
> >}
> >
> >The call to schedule_timeout_interruptible() cannot happen unless the
> >in_task() macro returns true, which it won't if the SOFTIRQ_OFFSET bit
> >is set:
> >
> >#define in_task() (!(preempt_count() & \
> > (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
> >
> >And the SOFTIRQ_OFFSET bit will be set if srcu_read_delay()
> >is invoked from a timer handler, which is the case for the
> >call from rcu_torture_timer(). So if that lock is held,
> >schedule_timeout_interruptible() won't ever be invoked.
>
> Thanks for your reply :)
> My tool does not track this bit...
> Sorry for this false report.

Not a problem, a few false positives are to be expected. And it looks
like you have some work to do on your tool -- which is good, because I
would not want you to be bored. ;-)

Thanx, Paul