Re: general protection fault vs Oops

From: Subhashini Rao Beerisetty
Date: Sat May 16 2020 - 11:11:03 EST


On Sat, May 16, 2020 at 7:23 PM Valdis Klētnieks
<valdis.kletnieks@xxxxxx> wrote:
>
> On Sat, 16 May 2020 18:05:07 +0530, Subhashini Rao Beerisetty said:
>
> > In the first attempt when I run that test case I landed into "general
> > protection fault: 0000 [#1] SMP" .. Next I rebooted and ran the same
> > test , but now it resulted the "Oops: 0002 [#1] SMP".
>
> And the 0002 is telling you that there's been 2 previous bug/oops since the
> reboot, so you need to go back through your dmesg and find the *first* one.
I could not find "Oops: 0001" in kern.log.
I captured the crash call trace while running tail -f
/var/log/kern.log. After the reboot, however, I could not find the same
trace in the kern.log file. Why did the kernel fail to store it in
kern.log? And in that case, how did the tail command capture it? Could
you please clarify this?

$ strings /var/log/kern.log | grep -i oops
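
The same search can be written as a small script that also catches the
"general protection fault" header. This is only a sketch: it assumes the
header lines look like the two quoted in this thread, and the names
HEADER_RE, find_crash_headers and scan_log, as well as the field names
"code" and "seq", are made up for illustration.

```python
import re

# Matches header lines such as
#   "general protection fault: 0000 [#1] SMP"
#   "Oops: 0002 [#1] SMP"
# "code" is the four-digit field, "seq" is the number inside "[#N]".
HEADER_RE = re.compile(
    r"(?P<kind>general protection fault|Oops): "
    r"(?P<code>[0-9a-fA-F]{4}) \[#(?P<seq>\d+)\]",
    re.IGNORECASE,
)


def find_crash_headers(lines):
    """Return (line_number, kind, code, seq) for every header line found."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = HEADER_RE.search(line)
        if match:
            hits.append(
                (number, match.group("kind"), match.group("code"),
                 int(match.group("seq")))
            )
    return hits


def scan_log(path="/var/log/kern.log"):
    # errors="replace" keeps the scan going over bytes that are not valid
    # UTF-8, which is roughly what piping through strings(1) works around.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_crash_headers(handle)
```

The path is a parameter, so any other log file on the system can be
scanned the same way.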

>
> > In both cases the call trace looks exactly same and RIP points to
> > "native_queued_spin_lock_slowpath+0xfe/0x170"..
>
> The first few entries in the call trace are the oops handler itself. So...
>
>
> > May 16 12:06:17 test-pc kernel: [96934.567347] Call Trace:
> > May 16 12:06:17 test-pc kernel: [96934.569475] [<ffffffff8183c427>] _raw_spin_lock_irqsave+0x37/0x40
> > May 16 12:06:17 test-pc kernel: [96934.571686] [<ffffffffc0606812>] event_raise+0x22/0x60 [osa]
> > May 16 12:06:17 test-pc kernel: [96934.573935] [<ffffffffc06aa2a4>] multi_q_completed_one_buffer+0x34/0x40 [mcore]
>
> The above line is the one where you hit the wall.
>
> > May 16 12:59:22 test-pc kernel: [ 3011.405602] Call Trace:
> > May 16 12:59:22 test-pc kernel: [ 3011.407892] [<ffffffff8183c427>] _raw_spin_lock_irqsave+0x37/0x40
> > May 16 12:59:22 test-pc kernel: [ 3011.410256] [<ffffffffc0604812>] event_raise+0x22/0x60 [osa]
> > May 16 12:59:22 test-pc kernel: [ 3011.412652] [<ffffffffc06b72a4>] multi_q_completed_one_buffer+0x34/0x40 [mcore]
>
> And again.
>
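
To compare the two traces, the quoted entries can be split into address,
symbol, offset/size and module tag. This is a rough sketch, assuming every
entry has the shape "[<address>] symbol+0xOFF/0xSIZE [module]" seen in the
quoted lines; Frame, parse_frame and module_frames are hypothetical names.
Entries without a trailing [module] tag are treated as not belonging to a
loadable module.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    address: int
    symbol: str
    offset: int
    size: int
    module: Optional[str]  # None when no "[module]" tag follows the entry


# The leading "[<" keeps the pattern from matching the "[96934.569475]"
# timestamp that precedes each entry in the log line.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s*"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>[^\]]+)\])?"
)


def parse_frame(line):
    """Parse one call-trace line; return None if it holds no trace entry."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return Frame(
        address=int(match.group("addr"), 16),
        symbol=match.group("sym"),
        offset=int(match.group("off"), 16),
        size=int(match.group("size"), 16),
        module=match.group("mod"),
    )


def module_frames(lines):
    """Return only the entries that carry a "[module]" tag."""
    frames = (parse_frame(line) for line in lines)
    return [f for f in frames if f is not None and f.module is not None]
```

Run over the entries quoted above, module_frames() keeps the event_raise
[osa] and multi_q_completed_one_buffer [mcore] entries and drops the
_raw_spin_lock_irqsave one.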
> However, given that it's a 4.4 kernel from 4 years ago, it's going to be
> hard to find anybody who really cares.
>
> In fact, I'm wondering if this is from some out-of-tree or vendor patch,
> because I'm not finding any sign of that function in either the 5.7 or 4.4
> tree. Not even a sign of ## catenation abuse - no relevant hits for
> "completed_one_buffer" or "multi_q" either
>
> I don't think anybody's going to be able to help unless somebody first
> identifies where that function is....
>
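
The tree search described above can be sketched as follows. This is an
illustration only, not the command that was actually run: find_identifier
and its parameters are hypothetical names, and the set of file suffixes is
an assumption. Searching for fragments such as "completed_one_buffer" or
"multi_q" rather than the full name is what would also catch a function
name assembled with ## token pasting.

```python
import os


def find_identifier(root, needle, suffixes=(".c", ".h", ".S")):
    """Yield (path, line_number, line) for each source line containing needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only look at files that are likely to be source code.
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        yield path, number, line.rstrip("\n")
```

For example, list(find_identifier("/path/to/tree", "multi_q")) returns an
empty list when the fragment appears nowhere in the tree; the path here is
a placeholder.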