Re: [PATCH 1/3] irq_work: Implement remote queueing

From: Frederic Weisbecker
Date: Wed May 14 2014 - 09:51:28 EST


On Wed, May 14, 2014 at 02:41:50PM +0200, Peter Zijlstra wrote:
> On Wed, May 14, 2014 at 02:11:25PM +0200, Frederic Weisbecker wrote:
> > > I don't think it is, most apic calls do apic_wait_icr_idle() then the
> > > apic op, if an NMI happens in between and writes to the APIC, the return
> > > context will see a !idle icr and fail.
> > >
> > > This is why arch_irq_work_raise() again idles the icr after sending the
> > > IPI.
> > >
> > > Also, I think, seeing what benh said earlier, it's unsafe for other archs
> > > too.
> >
> > Ah I don't know much about these archs' details, so I concede it.
>
> Yeah, I didn't either, had to figure it out when someone asked WTH there
> was a wait_icr_idle call in there.
>
> > > Then do the remote irq_work_raised thing. But it really stinks you broke
> > > this very nice and simple thing.
> >
> > I tried not to break boot with printk overhead. That said I've considered having
> > a very simple "tick work" that can rely on irq work when the tick is stopped
> > and use it for printk. That would restore the initial simplicity.
>
> But but but.. did you even try without the lazy thing?
>
> Don't fix what ain't broken, keep it simple, etc..

The problem is that I may well see no significant issues on my small and common hardware
but the problem may hit on boxes with specific configs or large numbers of CPUs.

"Don't fix what ain't broken" here clashes with "let's stay conservative/paranoid"
to avoid bringing new bugs. Before printk used irq_work we had printk_tick(),
I simply kept the old behaviour to avoid breaking boot time on other boxes.

>
> Anyway, if it turns out to really be needed, the split list doesn't
> sound bad.

Either that or we can remove LAZY stuff and wait to see if people complain :)

>
> > > > Also note that nohz is the only user for now and irq_work_claim() thus
> > > > prevents a double IPI. Of course if more users come up the issue arises
> > > > again.
> > >
> > > DANGER, half arsed engineering at work, seriously? Just write proper
> > > code already.
> > >
> > > There's no fucking way the next user will check the implementation to
> > > make sure it's 'sane'.
> >
> > Are you competing with tglx on grumpiness? You guys are free to treat us
> > like shit but don't be surprised if one day you'll be alone in kernel/*
>
> There's really only so much nonsense one can take on any one day before
> getting seriously grumpy.
>
> And arguing that because there's only one user so we can skimp a core
> function really tops the day.
>
> So maybe I need a holiday, but sheesh.
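
For reference, the "irq_work_claim() thus prevents a double IPI" point quoted
above can be sketched in userspace C11 roughly as follows. This is only an
illustration of the claim idea, not the kernel code: struct work, WORK_PENDING,
work_init(), work_claim(), work_queue() and work_run() are hypothetical names,
and a counter stands in for actually raising an IPI.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag: the work is queued and has not run yet. */
#define WORK_PENDING 1u

struct work {
	atomic_uint flags;
	atomic_int ipis_sent;	/* stand-in for raising an IPI (illustration only) */
};

static void work_init(struct work *w)
{
	atomic_init(&w->flags, 0);
	atomic_init(&w->ipis_sent, 0);
}

/* True only for the caller that flips PENDING from clear to set. */
static bool work_claim(struct work *w)
{
	unsigned int old = atomic_fetch_or(&w->flags, WORK_PENDING);

	return !(old & WORK_PENDING);
}

/* Queue the work; only a successful claim "sends" an IPI. */
static bool work_queue(struct work *w)
{
	if (!work_claim(w))
		return false;	/* already pending: no second IPI */

	atomic_fetch_add(&w->ipis_sent, 1);
	return true;
}

/* Target side: clear PENDING so the work can be claimed again. */
static void work_run(struct work *w)
{
	atomic_fetch_and(&w->flags, ~WORK_PENDING);
}
```

Note that in this sketch the claim only deduplicates per work item. Two
different work items queued to the same CPU would each pass their own claim,
which is the "if more users come up" case discussed above.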

