Re: md-raid 6.11.8 page fault oops
From: Ojaswin Mujoo
Date: Wed Nov 20 2024 - 05:18:57 EST
On Tue, Nov 19, 2024 at 08:31:02AM -0500, Genes Lists wrote:
> On Tue, 2024-11-19 at 17:04 +0530, Ojaswin Mujoo wrote:
> > >
> ...
>
> > > (gdb) list *(rb_first+0x13)
> > > 0xffffffff81de1af3 is in rb_first (lib/rbtree.c:473).
> > > 468		struct rb_node	*n;
> > > 469
> > > 470		n = root->rb_node;
> > > 471		if (!n)
> > > 472			return NULL;
> > > 473		while (n->rb_left)
> >
> > Now this looks strange: we already make sure n is not NULL, and yet
> > this line somehow ends up in
> >
> > BUG: unable to handle page fault for address: 0000000000200010
> >
> > Now, decoding the code with an x86 vmlinux, I see the following at the
> > faulting instruction:
> >
> > Code starting with the faulting instruction
> > ===========================================
> >    0:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
> >    7:   90                      nop
> >    8:   90                      nop
> >    9:   90                      nop
> >    a:   90                      nop
> >    b:   90                      nop
> >    c:   90                      nop
> >    d:   90                      nop
> >    e:   90                      nop
> >    f:   90                      nop
> >
> > Now RAX is 0x200000, but I don't think the nopl instruction should have
> > resulted in a memory access, as far as my limited understanding of the
> > x86 ISA goes.
> >
> > I also don't see nopl in rb_first in my vmlinux, my binary being compiled
> > with gcc 8.5. Are you by chance using clang, a newer gcc version, or a
> > higher optimization level in gcc?
> >
> > Regards,
> > ojaswin
>
> I am using Arch toolchain with
>
> gcc 14.2.1+r134+gab884fffe3fc-1
>
> I do not set CFLAGS_KERNEL so compile options are the default.
Got it, I'm still not sure what might be causing this oops. Would you
happen to have a reproducer that I can play around with on my system?
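
One thing that may be worth noting about the numbers quoted above: the
fault address 0000000000200010 is exactly RAX (0x200000) plus 0x10, and
0x10 is where rb_left sits in struct rb_node on a 64-bit build. That
would be consistent with n being a non-NULL but bogus pointer at the
n->rb_left load, although I have not confirmed that this is what
happened here. Below is a minimal userspace sketch of the arithmetic.
The struct rb_node_mirror type and the rb_left_load_address() helper are
made up for illustration and are not kernel code. The sketch assumes
8-byte pointers and uses uintptr_t where the kernel uses unsigned long.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace mirror of the field order of the kernel's
 * struct rb_node. Assumes a 64-bit build where pointers are 8 bytes;
 * uintptr_t stands in for the kernel's unsigned long.
 */
struct rb_node_mirror {
	uintptr_t __rb_parent_color;
	struct rb_node_mirror *rb_right;
	struct rb_node_mirror *rb_left;
};

/*
 * Address that would be read when evaluating n->rb_left for a node
 * pointer with the given value. Nothing is dereferenced here, this
 * is only the address computation.
 */
static uint64_t rb_left_load_address(uint64_t n)
{
	return n + offsetof(struct rb_node_mirror, rb_left);
}
```

With n = 0x200000 (the RAX value from the oops), rb_left_load_address()
gives 0x200010, which matches the reported fault address.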
Regards,
ojaswin
>
> thanks
>
> gene
>
>