Re: Crashes with 874bbfe600a6 in 3.18.25

From: Mike Galbraith
Date: Wed Feb 03 2016 - 12:13:28 EST


On Wed, 2016-02-03 at 12:06 -0500, Tejun Heo wrote:
> On Wed, Feb 03, 2016 at 06:01:53PM +0100, Mike Galbraith wrote:
> > Hm, so it's ok to queue work to an offline CPU? What happens if it
> > doesn't come back for an eternity or two?
>
> Right now, it just loses affinity. A more interesting case is a cpu
> going offline while work items bound to the cpu are still running and
> the root problem is that we've never distinguished between affinity
> for correctness and optimization and thus can't flush or warn on the
> stragglers. The plan is to ensure that all correctness users specify
> the CPU explicitly. Once we're there, we can warn on illegal usages.

Ah, and the rest (the vast majority) can then be safely deflected away
from nohz_full cpus.

-Mike