Re: Another preempt folding issue?

From: Stefan Bader
Date: Fri Feb 14 2014 - 14:04:00 EST


On 14.02.2014 19:23, Stefan Bader wrote:
> On 14.02.2014 18:33, Borislav Petkov wrote:
>> On Fri, Feb 14, 2014 at 06:02:32PM +0100, Stefan Bader wrote:
>>> Okaaay, I think I did what you asked. So yes, there is sse2 in the cpu info. And
>>> there is a mfence in the disassembly:
>>
>> Btw, I just realized booting the kernel in the guest was a dumb idea,
>> because, doh, the guest is not baremetal. The only reliable thing we
>> can say is that sse2 is present and that MFENCE alternative replacement
>> works :)
>>
>> But for simplicity's sake let's just assume the machine can do MFENCE
>> just fine and it gets replaced by the alternatives code.
>>
>> Besides, if that weren't true, we'd have a whole lot of other problems
>> on those boxes.
>>
>>> Thinking about it, I guess Peter is quite right saying that I likely
>>> will end on the patch that converted preempt_count to percpu.
>>
>> Yeah, c2daa3bed53a81171cf8c1a36db798e82b91afe8 et al.
>>
>>> One thing I likely should do is to reinstall the exact same laptop
>>> with 64bit kernel and userspace... maybe only 64bit kernel first...
>>> and make sure on my side that this does not show up on 64bit, too. I
>>> took the word of reporters for that (and the impression that otherwise
>>> many more people would have complained).
>>
>> Yeah, that should be a prudent thing to do.
>>
>> Also, Paolo and I were wondering whether you can trigger this thing
>> without kvm, i.e. virtualization involved... do you have any data on
>> that?
>
> Unfortunately no hard evidence. Kvm just happens to be such a good way to notice
> this as it is using the reschedule interrupt itself and has this exit before
> running the guest vcpu to hadnle it in the outer loop by calling cond_resched()
> and repeat.
> I find running kvm seems to make that laptop quite sluggish in responding to
> other tasks (in that install) and I got some oddness going on when lightdm quite
> often refuses to take keyboard input without opening some menu with the mouse
> first... But I could not be sure whether that is the kernel or some new
> user-space ... errr "feature".
> At least Marcello (iirc that other report came from him directly or indirectly)
> has seen it, too. And he likely has complete different user-space.
>
> So I will go and do that different (64bit) kernel and kernel + user-space test.
> But like fo Peter, it likely is a Monday thing...
>

Ok, it is still Friday... So a quick test (2 boots of a 32bit guest, same as
before) on the 32bit user-space, with the same kernel source, but compiled as
64bit (obviously not 100% same config but close). While I see the false
inconsistency messages (I modified the in kernel-test to trigger the trace stop
only if the __vcpu_run loop encounters an inconsistent state three times in a
row), I do not see the final stop message. Also (but that is rather feeling) the
system seems to remain more responsive (switching to other windows, opening
terminal windows,...) compared to 32bit kernel.


Attachment: signature.asc
Description: OpenPGP digital signature