On Sat, Apr 18, 2015 at 7:40 PM, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
I'll try. It must be very early in the boot process, prior to console
On Sat, Apr 18, 2015 at 04:23:25PM -0700, Guenter Roeck wrote:
my qemu test for arm:vexpress fails with the latest upstream kernel. It fails
hard - I don't get any output from the console. Bisect points to commit
8053871d0f7f ("smp: Fix smp_call_function_single_async() locking").
Reverting this commit fixes the problem.
Hmm. It being qemu, can you look at where it seems to lock?
I applied the above. No difference. Applying the same change for the cpu ==
Additional observation: The system boots if I add "-smp cpus=4" to the qemu
options. It does still hang, however, with "-smp cpus=2" and "-smp cpus=3".
Funky.
That patch still looks obviously correct to me after looking at it
again, but I guess we need to revert it if somebody can't see what's
wrong.
It does make async (wait=0) smp_call_function_single() possibly be
*really* asynchronous, ie the 'csd' ends up being released and can be
re-used even before the call-single function has completed. That
should be a good thing, but I wonder if that triggers some ARM bug.
Instead of doing a full revert, what happens if you replace this part:
+	/* Do we wait until *after* callback? */
+	if (csd->flags & CSD_FLAG_SYNCHRONOUS) {
+		func(info);
+		csd_unlock(csd);
+	} else {
+		csd_unlock(csd);
+		func(info);
+	}
with just
+	func(info);
+	csd_unlock(csd);
ie keeping the csd locked until the function has actually completed? I
guess for completeness, we should do the same thing for the cpu ==
smp_processor_id() case (see the "We can unlock early" comment).
Now, if that makes a difference, I think it implies a bug in the
caller, so it's not the right fix, but it would be an interesting
thing to test.
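
To make the difference between the two orderings concrete, here is a minimal single-threaded user-space sketch. It is not the kernel code: `struct csd_model`, the flag values, `flush_early_unlock()`, `flush_late_unlock()`, `probe_func()` and `locked_while_running()` are all hypothetical names made up for illustration, and only `csd_unlock()`, `CSD_FLAG_SYNCHRONOUS`, `func` and `info` mirror names from the hunk above. Copying `func` and `info` into locals before the unlock is an assumption about how the surrounding code is written. The callback simply records whether its own slot was still locked while it ran, which is the property a caller might be (wrongly) depending on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel flag bits; the values are illustrative. */
#define CSD_FLAG_LOCK        0x01u
#define CSD_FLAG_SYNCHRONOUS 0x02u

/* Simplified model of a call-single-data slot (not the kernel struct). */
struct csd_model {
	unsigned int flags;
	void (*func)(void *info);
	void *info;
};

static void csd_unlock(struct csd_model *csd)
{
	csd->flags &= ~CSD_FLAG_LOCK;
}

/* Ordering from the patch: an async slot is released before func runs. */
static void flush_early_unlock(struct csd_model *csd)
{
	/* Assumed: func/info are copied out first, since the slot may be reused. */
	void (*func)(void *) = csd->func;
	void *info = csd->info;

	if (csd->flags & CSD_FLAG_SYNCHRONOUS) {
		func(info);
		csd_unlock(csd);
	} else {
		csd_unlock(csd);
		func(info);
	}
}

/* Ordering of the suggested experiment: always unlock after func returns. */
static void flush_late_unlock(struct csd_model *csd)
{
	csd->func(csd->info);
	csd_unlock(csd);
}

/* Callback state: which slot we belong to, and what we observed. */
struct probe {
	struct csd_model *csd;
	bool locked_during_call;
};

/* Records whether the slot was still locked while the callback executed. */
static void probe_func(void *info)
{
	struct probe *p = info;

	p->locked_during_call = (p->csd->flags & CSD_FLAG_LOCK) != 0;
}

/* Runs one flush with the given ordering and returns what the callback saw. */
static bool locked_while_running(void (*flush)(struct csd_model *),
				 unsigned int extra_flags)
{
	struct csd_model csd = {
		.flags = CSD_FLAG_LOCK | extra_flags,
		.func = probe_func,
	};
	struct probe p = { .csd = &csd, .locked_during_call = false };

	csd.info = &p;
	flush(&csd);
	return p.locked_during_call;
}
```

With the patch's ordering, `locked_while_running(flush_early_unlock, 0)` reports that the slot was already unlocked inside the callback, whereas `flush_late_unlock` keeps it locked throughout. A caller that re-queues the same csd as soon as it sees it unlocked would behave differently under the two orderings, which is why the experiment is interesting even if it is not the right fix.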