Re: [RFC] Deadlock via recursive wakeup via RCU with threadirqs

From: Paul E. McKenney
Date: Sat Jun 29 2019 - 15:16:18 EST


On Sat, Jun 29, 2019 at 08:09:10PM +0200, Andrea Parri wrote:
> On Sat, Jun 29, 2019 at 09:55:33AM -0700, Paul E. McKenney wrote:
> > On Sat, Jun 29, 2019 at 05:12:36PM +0200, Andrea Parri wrote:
> > > Hi Steve,
> > >
> > > > As Paul stated, interrupts are synchronization points. Archs can only
> > > > play games with ordering when dealing with entities outside the CPU
> > > > (devices and other CPUs). But if you have assembly that has two stores,
> > > > and an interrupt comes in, the arch must guarantee that the stores are
> > > > done in that order as the interrupt sees it.
> > >
> > > Hopefully I'm not derailing the conversation too much with my questions
> > > ... but I was wondering if we had any documentation (or inline comments)
> > > elaborating on this "interrupts are synchronization points"?
> >
> > I don't know of any, but I would suggest instead looking at something
> > like the Hennessy and Patterson computer-architecture textbook.
> >
> > Please keep in mind that the rather detailed documentation on RCU is a
> > bit of an outlier due to the fact that there are not so many textbooks
> > that cover RCU. If we tried to replicate all of the relevant textbooks
> > in the Documentation directory, it would be quite a large mess. ;-)
>
> You know some developers considered it worthwhile to develop formal specs in
> order to better understand concepts such as "synchronization" and "IRQs
> (processing)"! ... ;-) I still think that adding a few paragraphs (if
> only in informal prose) to explain that "interrupts are synchronization
> points" wouldn't hurt. And you're right, I guess we may well start from
> a reference to H&P...
>
> Remark: we do have code which (while acknowledging that "interrupts are
> synchronization points") doesn't quite seem to "believe it", cf., e.g.,
> kernel/sched/membarrier.c:ipi_mb(). So, I guess the follow-up question
> would be "Would we better be (more) paranoid? ..."

As Steve said that I said, they are synchronization points from the
viewpoint of code within the interrupted CPU. Unless the architecture
code does an smp_mb() on interrupt entry and exit (which perhaps some
do, for all I know, maybe all of them do by now), memory accesses could
still be reordered across the interrupt from the perspective of other
CPUs and devices on the system.
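
A rough userspace analogy might help here. This is only a sketch, not
kernel code: it assumes a POSIX system, uses a SIGALRM handler as a
stand-in for an interrupt taken on the same CPU, and every name in it
(x, y, on_tick(), count_order_violations()) is made up for illustration.
The main flow stores to x and then to y, with only a compiler-level
fence between the two stores. The handler, which runs on the interrupted
thread, should never see the y store without the matching x store.

```c
#define _XOPEN_SOURCE 700   /* for sigaction() and setitimer() */
#include <limits.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/time.h>

static atomic_int x, y;          /* main flow stores x first, then y */
static atomic_int handler_runs;  /* number of "interrupts" taken */
static atomic_int violations;    /* times the handler saw y ahead of x */

/* Stand-in for an interrupt handler: runs on the interrupted thread. */
static void on_tick(int sig)
{
	(void)sig;
	int ry = atomic_load_explicit(&y, memory_order_relaxed);
	atomic_signal_fence(memory_order_acquire); /* compiler-only fence */
	int rx = atomic_load_explicit(&x, memory_order_relaxed);

	if (rx < ry) /* would mean the later store was seen without the earlier */
		atomic_fetch_add_explicit(&violations, 1, memory_order_relaxed);
	atomic_fetch_add_explicit(&handler_runs, 1, memory_order_relaxed);
}

/*
 * Hypothetical helper: store x then y in a loop until the handler has
 * run wanted_runs times, then return the number of ordering violations
 * it observed (or -1 if the handler or timer could not be set up).
 */
int count_order_violations(int wanted_runs)
{
	struct sigaction sa;
	struct itimerval tv;

	atomic_store(&x, 0);
	atomic_store(&y, 0);
	atomic_store(&handler_runs, 0);
	atomic_store(&violations, 0);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_tick;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return -1;

	memset(&tv, 0, sizeof(tv));
	tv.it_interval.tv_usec = 1000;  /* fire roughly every millisecond */
	tv.it_value.tv_usec = 1000;
	if (setitimer(ITIMER_REAL, &tv, NULL) != 0)
		return -1;

	for (int i = 1; i < INT_MAX &&
	     atomic_load_explicit(&handler_runs, memory_order_relaxed) < wanted_runs;
	     i++) {
		atomic_store_explicit(&x, i, memory_order_relaxed);
		/* Orders the two stores for a same-thread handler only. */
		atomic_signal_fence(memory_order_release);
		atomic_store_explicit(&y, i, memory_order_relaxed);
	}

	memset(&tv, 0, sizeof(tv));     /* zeroed timer value disarms it */
	setitimer(ITIMER_REAL, &tv, NULL);
	return atomic_load(&violations);
}
```

Calling count_order_violations(5) should return 0. Note that
atomic_signal_fence() constrains only the compiler and emits no hardware
barrier, so it says nothing about what a second thread on another CPU
would observe. For that case C11 would need atomic_thread_fence() or
release/acquire accesses, which is loosely the userspace counterpart of
the smp_mb() mentioned above.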

Thanx, Paul