Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf

From: Andy Lutomirski
Date: Fri Sep 08 2017 - 12:13:25 EST


On Fri, Sep 8, 2017 at 4:30 AM, Markus Trippelsdorf
<markus@xxxxxxxxxxxxxxx> wrote:
> On 2017.09.08 at 12:39 +0200, Markus Trippelsdorf wrote:
>> On 2017.09.08 at 12:35 +0200, Ingo Molnar wrote:
>> >
>> > * Markus Trippelsdorf <markus@xxxxxxxxxxxxxxx> wrote:
>> >
>> > > On 2017.09.08 at 11:16 +0200, Borislav Petkov wrote:
>> > > > On Fri, Sep 08, 2017 at 10:05:36AM +0200, Borislav Petkov wrote:
>> > > > > On Fri, Sep 08, 2017 at 08:26:44AM +0200, Thomas Gleixner wrote:
>> > > > > > On Fri, 8 Sep 2017, Markus Trippelsdorf wrote:
>> > > > > >
>> > > > > > CC+ Borislav. He might have access to such a beast
>> > > > >
>> > > > > Can I have /proc/cpuinfo and dmesg pls, in order to see whether I have
>> > > > > something similar?
>> > > > >
>> > > > > Private mail's fine too.
>> > > >
>> > > > So I don't have exactly your model - mine is model 2, stepping 3 but I see
>> > > > something strange too, in dmesg:
>> > >
>> > > I'm pretty sure the bug is in the merged 'x86-mm-for-linus' branch:
>> > > Either Andy's "PCID optimized TLB flushing" (would be my guess) or
>> > > 'encrypted memory' support by Tom Lendacky.
>> > >
>> > > (Bisecting is hard, because sometimes I can compile stuff for over 15
>> > > minutes without hitting the bug. At other times the machine locks up
>> > > hard when starting X11 already.)
>> >
>> > Do you have the 72c0098d92ce fix?
>>
>> Yes. The bug still happens on the current git tree (which has the fix
>> already):
>
> The bug is definitely caused by Andy Lutomirski's "PCID optimized TLB
> flushing" patches. Tom is off the hook.

I'm pretty sure it can't be PCID per se, since these CPUs are way too
old and are very unlikely to have PCID.
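
Whether a given box advertises PCID can be checked from /proc/cpuinfo.
Below is a minimal sketch, assuming the usual x86 layout where a "flags"
line lists the CPU features and PCID shows up as "pcid". The helper
names cpu_flags() and has_pcid() are made up for illustration and are
not part of any kernel tool.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags of the first CPU found in
    /proc/cpuinfo-style text, as a set of strings.

    Assumes the x86 format: a line whose key is "flags", followed by
    a colon and a whitespace-separated list of feature names.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No "flags" line found (e.g. non-x86 format or empty input).
    return set()


def has_pcid(cpuinfo_text):
    """True if the CPU described by cpuinfo_text advertises PCID."""
    return "pcid" in cpu_flags(cpuinfo_text)
```

To use it on a live system, pass in the contents of the file, e.g.
has_pcid(open("/proc/cpuinfo").read()). A False result only says the
kernel does not report the feature; it says nothing about which of the
TLB flushing changes is at fault.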

It could plausibly be the lazy TLB flushing changes.