Re: [PATCH tip/core/rcu 1/2] rcu: Parallelize and economize NOCB kthread wakeups

From: Amit Shah
Date: Wed Aug 13 2014 - 01:46:33 EST


On (Tue) 12 Aug 2014 [14:41:51], Paul E. McKenney wrote:
> On Tue, Aug 12, 2014 at 02:39:36PM -0700, Paul E. McKenney wrote:
> > On Tue, Aug 12, 2014 at 09:06:21AM -0700, Paul E. McKenney wrote:
> > > On Tue, Aug 12, 2014 at 11:03:21AM +0530, Amit Shah wrote:
> >
> > [ . . . ]
> >
> > > > I know of only virtio-console doing this (via userspace only,
> > > > though).
> > >
> > > As in userspace within the guest? That would not work. The userspace
> > > that the qemu is running in might. There is a way to extract ftrace info
> > > from crash dumps, so one approach would be "sendkey alt-sysrq-c", then
> > > pull the buffer from the resulting dump. For all I know, there might also
> > > be some script that uses the qemu "x" command to get at the ftrace buffer.
> > >
> > > Again, I cannot reproduce this, and I have been through the code several
> > > times over the past few days, and am not seeing it. I could start
> > > sending you random diagnostic patches, but it would be much better if
> > > we could get the trace data from the failure.

I think the only recourse I now have is to dump the guest state from
qemu, and attempt to find the ftrace buffers by poking pages and
finding some ftrace-like struct... and then dumping the buffers.
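
A minimal sketch of that page-poking step, assuming the guest's physical
memory has already been saved to a raw file (for example with the qemu
monitor's "pmemsave" command) and that some byte string expected to appear
in the trace data is known. The marker and file name used here are
hypothetical, and this only narrows down candidate pages; it does not parse
the real ftrace ring-buffer layout.

```python
PAGE_SIZE = 4096  # assumption: x86_64 guest using 4 KiB pages


def pages_containing(dump, needle, page_size=PAGE_SIZE):
    """Return the page-aligned offsets of every page in which `needle` starts.

    `dump` is the raw guest memory image as bytes; `needle` is a byte string
    that we merely guess will show up inside the trace buffers.
    """
    if not needle:
        raise ValueError("needle must be non-empty")
    hits = []
    pos = dump.find(needle)
    while pos != -1:
        page = pos - (pos % page_size)
        # Offsets come back in increasing order, so comparing with the
        # last recorded page is enough to avoid duplicates.
        if not hits or hits[-1] != page:
            hits.append(page)
        pos = dump.find(needle, pos + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical file name and marker, for illustration only.
    with open("guest-memory.raw", "rb") as f:
        image = f.read()
    for offset in pages_containing(image, b"rcu_callback"):
        print(hex(offset))
```

The offsets printed are only starting points for inspecting the
neighbouring pages by hand.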

> > Hearing no objections, random patch #1. The compiler could in theory
> > cause trouble without this patch, so there is some possibility that
> > it is a fix.
>
> #2... This would have been a problem without the earlier patch, but
> who knows? (#1 moved from theoretically possible but not on x86 to
> maybe on x86 given a sufficiently malevolent compiler with the
> patch that you located with bisection.)

I tried all 3 patches individually, and all 3 together, with no success.

My gcc is gcc-4.8.3-1.fc20.x86_64. I'm using a fairly up-to-date Fedora
20 system on my laptop for these tests.

Curiously, patches 1 and 3 applied fine, but this one had a conflict.

> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 1dc72f523c4a..1da605740e8d 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -2137,6 +2137,17 @@ static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp,

I have this hunk at line 2161, and...

>  		trace_rcu_callback(rdp->rsp->name, rhp,
>  				   -atomic_long_read(&rdp->nocb_q_count_lazy),
>  				   -atomic_long_read(&rdp->nocb_q_count));
> +
> +	/*
> +	 * If called from an extended quiescent state with interrupts
> +	 * disabled, invoke the RCU core in order to allow the idle-entry
> +	 * deferred-wakeup check to function.
> +	 */
> +	if (irqs_disabled_flags(flags) &&
> +	    !rcu_is_watching() &&
> +	    cpu_online(smp_processor_id()))
> +		invoke_rcu_core();
> +
>  	return true;

I have return 1; here.

I'm on linux.git, c8d6637d0497d62093dbba0694c7b3a80b79bfe1.


Amit