Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

From: Josh Poimboeuf
Date: Wed Oct 04 2017 - 18:15:46 EST


On Wed, Oct 04, 2017 at 02:30:42PM -0700, Linus Torvalds wrote:
> On Wed, Oct 4, 2017 at 2:06 PM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> >
> > I compiled the same kernel with a similar version of GCC. It turns out
> > that GCC *does* create unaligned stacks with frame pointers enabled:
>
> Christ. What a piece of crap.
>
> It doesn't even seem to make any sense. Spill room for the "u16
> item_count" and "u8 move_type"?

I didn't have the patience to try to figure out what it thought it was
trying to do. But it doesn't even use %esp in the function, so it's
obviously just pointless code. Pasting the full function below.

> That function is disgusting anyway (the switch really should be
> outside the loop, not inside it), but whatever. No excuse for that
> kind of garbage code generation.
>
> > This was a leaf function. For no apparent reason, GCC 4.8 decided to
> > subtract 3 from the stack pointer in the prologue.
>
> Can you make objtool warn about unaligned stack pointer additions like that?

That should be easy to add, and a good idea to check for. I don't look
forward to the continual 0-day bot complaints, so hopefully we can
figure out a way to fix them (or deprecate GCC 4!), or at least disable
the checking for older versions of GCC with the known issue.

> Maybe it only happens in very limited cases, and we can find a pattern
> to why gcc generates garbage code like that? And perhaps even how to
> just avoid it?

Just from grepping the objdump, I can tell it happened in several
different functions. For example, all of the following functions have
'sub $0x1,%esp':

match_wildcard()
pci_vpd_find_tag()
strspn()

sub $0x2,%esp:

rgb_foreground()
scsi_extd_sense_format()

sub $0x5,%esp:

iot2040_rs485_config()

sub $0x6,%esp:

x86_match_cpu()
clear_buffer_attributes()

The only thing I can see they have in common is that they're leaf
functions. In all cases the stack pointer is otherwise unused.
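
The grep described above could be scripted roughly as follows. This is
only a sketch, not objtool code: the helper name
find_unaligned_esp_adjustments(), the regexes, and the 4-byte word size
are assumptions made for illustration, and it expects `objdump -d` style
text for 32-bit x86 in AT&T syntax. The aligned_example function in the
sample is hypothetical.

```python
import re
from collections import defaultdict

# Function header line, e.g. "c124a388 <acpi_rs_move_data>:"
FUNC_RE = re.compile(r"^[0-9a-f]+ <([^>]+)>:\s*$")
# Immediate add/sub on the stack pointer, e.g. "sub $0x3,%esp"
ADJ_RE = re.compile(r"\b(sub|add)l?\s+\$0x([0-9a-f]+),%esp\b")


def find_unaligned_esp_adjustments(disassembly, word_size=4):
    """Map each unaligned immediate to the functions that apply it to %esp."""
    hits = defaultdict(set)
    current = None
    for line in disassembly.splitlines():
        header = FUNC_RE.match(line.strip())
        if header:
            current = header.group(1)
            continue
        adj = ADJ_RE.search(line)
        if adj and current is not None:
            imm = int(adj.group(2), 16)
            if imm % word_size:  # not a multiple of the word size
                hits[imm].add(current)
    return {imm: sorted(funcs) for imm, funcs in hits.items()}


# Two lines taken from the listing in this mail, plus a made-up aligned case.
SAMPLE = """
c124a388 <acpi_rs_move_data>:
c124a392: 83 ec 03 sub $0x3,%esp
c124a3e3: 83 c4 03 add $0x3,%esp
c1000000 <aligned_example>:
c1000003: 83 ec 08 sub $0x8,%esp
"""

if __name__ == "__main__":
    for imm, funcs in sorted(find_unaligned_esp_adjustments(SAMPLE).items()):
        print(f"sub/add $0x{imm:x},%esp: {', '.join(funcs)}")
```

A real objtool check would work on decoded instructions rather than
text, so treat this only as a way to reproduce the lists above.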

The problem seemed to go away when I built the same kernel with GCC 5.3.

FYI, here's the full function (4.14-rc3 compiled with RHEL GCC 4.8.3-9).
Config is attached.

c124a388 <acpi_rs_move_data>:
c124a388: 55 push %ebp
c124a389: 89 e5 mov %esp,%ebp
c124a38b: 57 push %edi
c124a38c: 56 push %esi
c124a38d: 89 d6 mov %edx,%esi
c124a38f: 53 push %ebx
c124a390: 31 db xor %ebx,%ebx
c124a392: 83 ec 03 sub $0x3,%esp
c124a395: 8a 55 08 mov 0x8(%ebp),%dl
c124a398: 66 89 4d f2 mov %cx,-0xe(%ebp)
c124a39c: 83 ea 15 sub $0x15,%edx
c124a39f: 88 55 f1 mov %dl,-0xf(%ebp)
c124a3a2: 0f b6 fa movzbl %dl,%edi
c124a3a5: 0f b7 4d f2 movzwl -0xe(%ebp),%ecx
c124a3a9: 39 cb cmp %ecx,%ebx
c124a3ab: 73 36 jae c124a3e3 <acpi_rs_move_data+0x5b>
c124a3ad: 80 7d f1 07 cmpb $0x7,-0xf(%ebp)
c124a3b1: 77 30 ja c124a3e3 <acpi_rs_move_data+0x5b>
c124a3b3: ff 24 bd cc e9 36 c1 jmp *-0x3ec91634(,%edi,4)
c124a3b6: R_386_32 .rodata
c124a3ba: 89 c7 mov %eax,%edi
c124a3bc: f3 a4 rep movsb %ds:(%esi),%es:(%edi)
c124a3be: eb 23 jmp c124a3e3 <acpi_rs_move_data+0x5b>
c124a3c0: 66 8b 0c 5e mov (%esi,%ebx,2),%cx
c124a3c4: 66 89 0c 58 mov %cx,(%eax,%ebx,2)
c124a3c8: eb 16 jmp c124a3e0 <acpi_rs_move_data+0x58>
c124a3ca: 8b 0c 9e mov (%esi,%ebx,4),%ecx
c124a3cd: 89 0c 98 mov %ecx,(%eax,%ebx,4)
c124a3d0: eb 0e jmp c124a3e0 <acpi_rs_move_data+0x58>
c124a3d2: 8b 14 de mov (%esi,%ebx,8),%edx
c124a3d5: 8b 4c de 04 mov 0x4(%esi,%ebx,8),%ecx
c124a3d9: 89 14 d8 mov %edx,(%eax,%ebx,8)
c124a3dc: 89 4c d8 04 mov %ecx,0x4(%eax,%ebx,8)
c124a3e0: 43 inc %ebx
c124a3e1: eb c2 jmp c124a3a5 <acpi_rs_move_data+0x1d>
c124a3e3: 83 c4 03 add $0x3,%esp
c124a3e6: 5b pop %ebx
c124a3e7: 5e pop %esi
c124a3e8: 5f pop %edi
c124a3e9: 5d pop %ebp
c124a3ea: c3 ret
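
For reference, the frame offsets in the listing above can be worked out
by hand. The sketch below is one reading of the disassembly, not
anything taken from GCC: it assumes 4-byte pushes, and the helper names
are made up for illustration. It counts the three pushes after
`mov %esp,%ebp`, applies the `sub $0x3`, and compares the reserved bytes
with the two %ebp-relative stores at -0xe (2 bytes, %cx) and -0xf
(1 byte, %dl).

```python
WORD = 4  # assumed i386 push size in bytes


def esp_offset_after_prologue(pushes, sub_imm):
    """%esp relative to %ebp after `pushes` register pushes and `sub $sub_imm,%esp`."""
    return -(pushes * WORD) - sub_imm


def bytes_covered(offset, size):
    """Set of %ebp-relative byte offsets touched by a store of `size` bytes."""
    return set(range(offset, offset + size))


# push %edi, push %esi, push %ebx, then sub $0x3,%esp
esp_off = esp_offset_after_prologue(pushes=3, sub_imm=3)
saved_regs_bottom = -3 * WORD

# Bytes reserved by the sub, below the saved registers
reserved = set(range(esp_off, saved_regs_bottom))

# mov %cx,-0xe(%ebp) is a 2-byte store; mov %dl,-0xf(%ebp) is a 1-byte store
spill = bytes_covered(-0xE, 2) | bytes_covered(-0xF, 1)

if __name__ == "__main__":
    print("esp offset from ebp:", esp_off)
    print("esp word-aligned relative to ebp:", esp_off % WORD == 0)
    print("reserved bytes match the two stores:", reserved == spill)
```

If that reading is right, the 3 bytes are exactly a 2-byte slot plus a
1-byte slot, both addressed through %ebp, which is consistent with %esp
itself never being used in the body.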

Attachment: .config.gz
Description: application/gunzip