Re: Weird Oops on 2.2.13 (how to debug this one?)

Richard B. Johnson (root@chaos.analogic.com)
Mon, 22 Nov 1999 18:17:05 -0500 (EST)


On Mon, 22 Nov 1999, Linux Lists wrote:

>
> Hello,
>
> I'm debugging a new synchronous card driver, and even though the driver
> seems stable, sometimes I get kernel oopsen like the one below when doing
> an FTP reception transfer:
>
> Unable to handle kernel paging request at virtual address c4834824
> current->tss.cr3 = 00101000, %cr3 = 00101000
> *pde = 00000000
> Oops: 0000
> CPU: 0
> EIP: 0010:[<c4834824>]
> EFLAGS: 00010292
> eax: 00000009 ebx: 00000002 ecx: 00000001 edx: c281908c
> esi: c2819070 edi: 00000001 ebp: c289e048 esp: c01dff1c
> ds: 0018 es: 0018 ss: 0018
> Process swapper (pid: 0, process nr: 0, stackpage=c01df000)
> Stack: f1e85200 83fffb77 c8e904c4 90000000 28244c8b 8308798b 387501ff
> 448b016a
> e8504424 fffb8264 4824748b f008c483 187eab0f c085c019 016a1a75
> 76bee856
> c483fffb 287e8308 56097402 fb76efe8 04c483ff 4024548b 0974d285
> 7796e852
> Call Trace:
> Code:
>
> The weird thing about it is that there is no call trace nor code to be
> used in the ksymoops. If I try to use all I got in ksymoops (i.e., the
> output shown above), it doesn't return anything, i.e., it stays there
> waiting for actually useful data to come in.
>
> Is there any way to figure out what is causing this kernel oops !?!?
> Without the call trace / code, it's useless to me, so if somebody could
> shed some light on it for me, I'd really appreciate it.
>

You probably don't have any global symbols that ksymoops can use in your
driver. You can always load your driver with debugging info, i.e.,

#ifdef DEBUG
#define DEB(f) f
#else
#define DEB(f)
#endif

DEB(printk("address was %p\n", foo));

In this case, if DEBUG is not defined, DEB(x) goes away entirely.

Since the paging address looks reasonable, it looks as though there
is some reason why a page-fault could not be done. This is usually
caused by somebody trying to write to user-space while holding a lock.

This cannot be done. Also space dynamically allocated in an interrupt
has to be GFP_ATOMIC. If you need to copy this data to user-space
under a lock, it has to be done in two steps:

spinlock_t device_lock;
long flags;
spin_lock_irqsave(&device_lock, flags); /* No interrupts */
memcpy(tmp_buffer, interrupt_buffer, length); /* Get interrupt data */
spin_unlock_irqrestore(&device_lock, flags); /* Allow interrupts */
copy_to_user(user_buf, tmp_buffer, length); /* Copy to user */

Cheers,
Dick Johnson

Penguin : Linux version 2.3.13 on an i686 machine (400.59 BogoMips).
Warning : It's hard to remain at the trailing edge of technology.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/