Re: 2.6.21-mm1 hwsusp: BUG at workqueue.c:106

From: Oleg Nesterov
Date: Tue May 08 2007 - 10:21:29 EST


On 05/08, Oleg Nesterov wrote:
>
> On 05/08, Jiri Slaby wrote:
> >
> > vmstat_update+0x0/0x2b
>
> Thanks a lot.
>
> Right now,
>
> > +static void vmstat_update(struct work_struct *w)
> > +{
> > + refresh_cpu_vm_stats(smp_processor_id());
> > + schedule_delayed_work(&__get_cpu_var(vmstat_work),
> > + sysctl_stat_interval);
> > +}
>
> This is not precisely correct. We can schedule the wrong vmstat_work
> if this timer/work migrates to another CPU. I'd suggest
>
> schedule_delayed_work(container_of(w, struct delayed_work, work),
>                       sysctl_stat_interval);
>
> This should not happen because we are doing cancel_rearming_delayed_work()
> below, however:
>
> > + case CPU_DOWN_PREPARE:
> > + case CPU_DOWN_PREPARE_FROZEN:
> > + cancel_rearming_delayed_work(&per_cpu(vmstat_work, cpu));
> > + per_cpu(vmstat_work, cpu).work.func = NULL;
> > + case CPU_DOWN_FAILED:
> > + case CPU_DOWN_FAILED_FROZEN:
> > + start_cpu_timer(cpu);
>
> we need a "break;" before "case CPU_DOWN_FAILED", otherwise we re-start
> vmstat_update() immediately.
>
> This is a bug, but I am not sure it is the only problem.
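
To make the container_of() suggestion quoted above concrete, here is a
rough user-space sketch, not kernel code. The structures are cut-down
stand-ins, and owner_cpu, vmstat_work_model, dwork_of() and
rearm_target_cpu() are made-up names for illustration only. The point is
just that the work pointer passed to the callback identifies the item to
re-arm, whichever CPU the callback happens to run on.

```c
#include <stddef.h>

/* User-space stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Cut-down stand-ins; the real kernel structures have more fields. */
struct work_struct {
	int unused;
};

struct delayed_work {
	int owner_cpu;			/* hypothetical tag, illustration only */
	struct work_struct work;
};

/* Two "per-CPU" items, indexed by CPU number. */
static struct delayed_work vmstat_work_model[2] = {
	{ .owner_cpu = 0 },
	{ .owner_cpu = 1 },
};

/* Recover the enclosing delayed_work from the pointer the callback gets. */
static struct delayed_work *dwork_of(struct work_struct *w)
{
	return container_of(w, struct delayed_work, work);
}

/* CPU whose item would be re-armed, independent of where we are running. */
static int rearm_target_cpu(struct work_struct *w)
{
	return dwork_of(w)->owner_cpu;
}
```

With this, dwork_of(&vmstat_work_model[1].work) gives back
&vmstat_work_model[1] even if the caller is "on" CPU 0, which is what a
lookup keyed on the current CPU cannot guarantee after a migration.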

In case I was not clear: this _can_ explain the problem, because the extra
start_cpu_timer() (due to the missing "break;") re-initializes dwork and
clears _PENDING.
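
A rough user-space model of the fall-through, in case it helps. This is
only a sketch: struct dwork_model, start_cpu_timer_model(),
cancel_model(), notify_buggy() and notify_fixed() are made-up names, and
the model assumes that start_cpu_timer() re-initializes the item and
queues it again, as described above. It does not reproduce the real
workqueue state machine.

```c
#include <stdbool.h>

enum cpu_event { CPU_DOWN_PREPARE, CPU_DOWN_FAILED };

/* Simplified model of one per-CPU delayed work item. */
struct dwork_model {
	bool armed;		/* the item is queued to run again */
	bool has_func;		/* stands in for work.func != NULL */
	int reinit_count;	/* how often the item was re-initialized */
};

/* Assumed behaviour of start_cpu_timer(): re-initialize, then queue. */
static void start_cpu_timer_model(struct dwork_model *dw)
{
	dw->reinit_count++;
	dw->has_func = true;
	dw->armed = true;
}

/* Stands in for cancel_rearming_delayed_work(). */
static void cancel_model(struct dwork_model *dw)
{
	dw->armed = false;
}

/* Handler as quoted: no "break;" after the CPU_DOWN_PREPARE case. */
static void notify_buggy(struct dwork_model *dw, enum cpu_event ev)
{
	switch (ev) {
	case CPU_DOWN_PREPARE:
		cancel_model(dw);
		dw->has_func = false;
		/* the bug being modelled: execution continues below */
		/* fall through */
	case CPU_DOWN_FAILED:
		start_cpu_timer_model(dw);
		break;
	}
}

/* Same handler with the "break;" added. */
static void notify_fixed(struct dwork_model *dw, enum cpu_event ev)
{
	switch (ev) {
	case CPU_DOWN_PREPARE:
		cancel_model(dw);
		dw->has_func = false;
		break;
	case CPU_DOWN_FAILED:
		start_cpu_timer_model(dw);
		break;
	}
}
```

In this model, notify_buggy() on CPU_DOWN_PREPARE leaves the item
re-initialized and armed again right after it was cancelled, while
notify_fixed() leaves it cancelled until CPU_DOWN_FAILED arrives.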

Oleg.
