Re: PROBLEM: consolidated IDT invalidation causes kexec to reboot

From: hpa
Date: Tue Dec 26 2017 - 18:51:28 EST


On December 26, 2017 3:19:00 PM PST, Alexandru Chirvasitu <achirvasub@xxxxxxxxx> wrote:
>On Tue, Dec 26, 2017 at 10:51:12AM -0800, Linus Torvalds wrote:
>> [ Sorry, I was off-line on Christmas Eve due to festivities, and then
>> yesterday because I've apparently caught a cold.
>>
>> Still not back to normal, but at least I can sit in front of the
>> computer again ]
>>
>> On Mon, Dec 25, 2017 at 1:29 PM, Alexandru Chirvasitu
>> <achirvasub@xxxxxxxxx> wrote:
>> >
>> > On Mon, Dec 25, 2017 at 06:40:14AM -0800, Andy Lutomirski wrote:
>> >>
>> >> This is presumably the same call-tracing-without-TLS-working
>problem.
>> >> idt_invalidate() is out-of-line and is compiled with full tracing
>on,
>>
>> Yeah. The difference literally seems to be that in the old case it
>was
>> accidentally inlined.
>>
>> I say "accidentally", because "load_idt()" itself is explicitly
>> inlined, but the "set_idt()" in machine_kexec_32.c was not.
>>
>> But before that commit ("e802a51ede91 x86/idt: Consolidate IDT
>> invalidation") the compiler inlined it anyway because it was a small
>> static function.
>>
>> Afterwards, not so much (different C file), and then the stack
>tracing
>> code blew up because of the incomplete CPU state.
>>
>> > This works.. I went back to the troublesome commit e802a51 and
>> > modified it as follows:
>> >
>> > +/**
>> > + * idt_invalidate - Invalidate interrupt descriptor table
>> > + * @addr: The virtual address of the 'invalid' IDT
>> > + */
>> > +static inline void idt_invalidate(void *addr)
>> > +{
>> > + struct desc_ptr idt = { .address = (unsigned long) addr,
>.size = 0 };
>> > +
>> > + load_idt(&idt);
>> > +}
>>
>> Yes, I suspect that is the right thing to do. It's small enough that
>> inliningh it makes sense.
>>
>> HOWEVER. Would you mind testing a totally different fix instead?
>>
>> In particular, take the current top of tree (that doesn't work for
>> you), and try to just change the order of these two lines:
>>
>> set_gdt(phys_to_virt(0), 0);
>> idt_invalidate(phys_to_virt(0));
>>
>> in arch/x86/kernel/machine_kexec_32.c.
>>
>> I think it's a better idea to invalidate the IDT first, because that
>> is only used for exceptions. In contrast, invalidating the GDT will
>> obviously make any segment load do horrible things, _and_ any
>> exceptions would fail anyway (because exceptions need segments too).
>>
>> So in many ways, that "set_get()" that invalidates the GDT is the
>more
>> destructive thing, and should be done last.
>>
>> And if we do it last, maybe the whole "oops, we have tracing code
>> enabled" thing wouldn't have mattered.
>>
>> Does that trivial line switching make the old broken config work for
>you again?
>
>I've tried three more kernels just now:
>
>(1)
>
>I went back to the initial problematic commit e802a51 and modified it
>as you suggest:
>
>----------------------------------------------------------
>
> interchanged invalidation instructions in machine_kexec_32
>
>diff --git a/arch/x86/kernel/machine_kexec_32.c
>b/arch/x86/kernel/machine_kexec_32.c
>index 00bc751..4ebf6bf 100644
>--- a/arch/x86/kernel/machine_kexec_32.c
>+++ b/arch/x86/kernel/machine_kexec_32.c
>@@ -232,8 +232,8 @@ void machine_kexec(struct kimage *image)
> * The gdt & idt are now invalid.
> * If you want to load them you must set up your own idt & gdt.
> */
>- set_gdt(phys_to_virt(0), 0);
> idt_invalidate(phys_to_virt(0));
>+ set_gdt(phys_to_virt(0), 0);
>
> /* now call it */
> image->start = relocate_kernel_ptr((unsigned long)image->head,
>
>----------------------------------------------------------
>
>This did not work out for me, but now it fails differently. Both
>(kexec -l + kexec -e) and (kexec -p + echo c > /proc/sysrq-trigger)
>end in call traces and freezes.
>
>It does seem to be tied to idt_invalidate. One of the last things I
>see on the screen (which is ends up frozen with the computer inactive)
>is
>
>EIP: idt_invalidate+0x6/0x40 SS:ESP: 0068:f6c47cd0
>
>Well, that address at the end changes on different iterations of
>this. I also see the usual 'Kernel panic: not syncing', as well as
>
>Shuttind down CPUs with NMI
>
>and another worrisome line higher up:
>
>CPU:0 PID: 682 comm: kexec Tainted G
>
>None of this seems to register in logs I can send. For instance, I've
>grepped -r for 'invalidate' in /var/log/ with no hits.
>
>
>
>(2)
>
>In order to look into the argument-of-idt_invalidate issue, I took
>commit (1) above and changed it to
>
>----------------------------------------------------------
>
> call idt_invalidate(0) in machine_kexec_32
>
>diff --git a/arch/x86/kernel/machine_kexec_32.c
>b/arch/x86/kernel/machine_kexec_32.c
>index 4ebf6bf..71bd3c0 100644
>--- a/arch/x86/kernel/machine_kexec_32.c
>+++ b/arch/x86/kernel/machine_kexec_32.c
>@@ -232,7 +232,7 @@ void machine_kexec(struct kimage *image)
> * The gdt & idt are now invalid.
> * If you want to load them you must set up your own idt & gdt.
> */
>- idt_invalidate(phys_to_virt(0));
>+ idt_invalidate(0);
> set_gdt(phys_to_virt(0), 0);
>
> /* now call it */
>
>----------------------------------------------------------
>
>Same issues as noted in (1) above. I suppose we were expecting this,
>but since I'm doing this anyway, I figured might as well do it with
>some degree of thoroughness.
>
>
>
>(3)
>
>Also related to the argument issue: I went back to the commit I
>described in my previous message (making idt_invalidate static inline
>in the header) and made the identical argument-0 modification to *that*
>
>----------------------------------------------------------
>
> call idt_invalidate(0) in machine_kexec_32
>
>diff --git a/arch/x86/kernel/machine_kexec_32.c
>b/arch/x86/kernel/machine_kexec_32.c
>index 00bc751..36c1b27 100644
>--- a/arch/x86/kernel/machine_kexec_32.c
>+++ b/arch/x86/kernel/machine_kexec_32.c
>@@ -233,7 +233,7 @@ void machine_kexec(struct kimage *image)
> * If you want to load them you must set up your own idt & gdt.
> */
> set_gdt(phys_to_virt(0), 0);
>- idt_invalidate(phys_to_virt(0));
>+ idt_invalidate(0);
>
> /* now call it */
> image->start = relocate_kernel_ptr((unsigned long)image->head,
>
>----------------------------------------------------------
>
>All good here. So passing the argument 0 to idt_invalidate seems to
>make no difference, either to kexec or to reboot (which also works
>fine).
>
>
>
>
>>
>> > kexec now works as expected; tested repeatedly, both with direct
>> > execution and crash triggering.
>> >
>> > I had to google 'inline function' :)).
>>
>> We'll make a kernel developer out of you yet. You've already found
>the
>> most important development tool (I kid, I kid. Google is useful, but
>> "willingness to try things out" is actually the #1 thing).
>>
>> Mind googling "linux kernel patch submission" and adding the required
>> sign-off, and I suspect the x86 people will happily take your patch?
>>
>> That said, I do wonder about a few things:
>>
>> - the 'addr' argument is pointless, afaik. I *suspect* it used to be
>> 0, and then some mindless editing just changed it to that
>> "phys_to_virt(0)".
>>
>> With a zero length, it shouldn't matter what the actual IDT base
>> address actually is. Any access is going to trap regardless.
>>
>> - some people were clearly aware of just how critical that whole
>> "load_idt()" sequence were, because things were marked "inline" and
>> "NOKPROBE_SYMBOL()" etc, but there was no comment in the code that
>> actually did this about how the machine state is total garbage after
>> the "set_gdt()" in machine_kexec().
>>
>> - the above "I think we should invalidate GDT last" issue.
>>
>> Hmm?
>>
>> Linus

Disassembly please.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.