Re: 2.6.17-mm6

From: Andrew Morton
Date: Mon Jul 03 2006 - 19:29:47 EST


On Mon, 3 Jul 2006 22:50:02 +0100
Alistair John Strachan <s0348365@xxxxxxxxxxxx> wrote:

> On Monday 03 July 2006 21:54, Andrew Morton wrote:
> > On Mon, 3 Jul 2006 21:36:55 +0100
> > Alistair John Strachan <s0348365@xxxxxxxxxxxx> wrote:
> > > On Monday 03 July 2006 21:17, Andrew Morton wrote:
> > > > On Mon, 3 Jul 2006 20:56:28 +0100
> > > > Alistair John Strachan <s0348365@xxxxxxxxxxxx> wrote:
> > > > > On Monday 03 July 2006 20:39, Andrew Morton wrote:
> [snip]
> > > > > > Try adding `pause_on_oops=100000' to the kernel boot command line.
> > > > >
> > > > > (Trimmed Nathan)
> > > > >
> > > > > Helped somewhat, but I'm still missing a bit at the top.
> > > > >
> > > > > http://devzero.co.uk/~alistair/oops-20060703/
> > > >
> > > > That is irritating. This should get us more info:
> > >
> > > Indeed, thanks.
> > >
> > > Try the same URL again, I've uploaded 3,4,5 from a couple of reboots. I
> > > still think I'm missing something at the top, but 3 is the earliest I
> > > could snap.
> >
> > Getting better.
> >
> > It would kinda help if pause_on_oops() was actually implemented on x86_64..
>
> Doesn't help (work?).
>
> [alistair] 22:47 [~] strings /boot/vmlinuz-2.6.17-mm6 | grep 2.6.17-mm6
> 2.6.17-mm6 (alistair@damocles) #3 SMP PREEMPT Mon Jul 3 22:39:54 BST 2006
>
> [alistair] 22:48 [~] cat /boot/grub/menu.lst | grep -C1 mm6
> # testing
> title Linux 2.6.17-mm6
> root (hd0,0)
> kernel /boot/vmlinuz-2.6.17-mm6 vga=extended root=/dev/sda1
> pause_on_oops=100000
>
> I'm fairly sure I booted a kernel with your patch and that should be the right
> cmdline flag.
>

OK, x86_64 is significantly different from x86 in that area (better). Have
a tested version...


diff -puN arch/x86_64/kernel/traps.c~x86_64-wire-up-oops_enter-oops_exit arch/x86_64/kernel/traps.c
--- a/arch/x86_64/kernel/traps.c~x86_64-wire-up-oops_enter-oops_exit
+++ a/arch/x86_64/kernel/traps.c
@@ -551,11 +551,14 @@ void __kprobes __die(const char * str, s

void die(const char * str, struct pt_regs * regs, long err)
{
- unsigned long flags = oops_begin();
+ unsigned long flags;

+ oops_enter();
+ flags = oops_begin();
handle_BUG(regs);
__die(str, regs, err);
oops_end(flags);
+ oops_exit();
do_exit(SIGSEGV);
}

diff -puN arch/x86_64/mm/fault.c~x86_64-wire-up-oops_enter-oops_exit arch/x86_64/mm/fault.c
--- a/arch/x86_64/mm/fault.c~x86_64-wire-up-oops_enter-oops_exit
+++ a/arch/x86_64/mm/fault.c
@@ -261,9 +261,11 @@ int unhandled_signal(struct task_struct
static noinline void pgtable_bad(unsigned long address, struct pt_regs *regs,
unsigned long error_code)
{
- unsigned long flags = oops_begin();
+ unsigned long flags;
struct task_struct *tsk;

+ oops_enter();
+ flags = oops_begin();
printk(KERN_ALERT "%s: Corrupted page table at address %lx\n",
current->comm, address);
dump_pagetable(address);
@@ -273,6 +275,7 @@ static noinline void pgtable_bad(unsigne
tsk->thread.error_code = error_code;
__die("Bad pagetable", regs, error_code);
oops_end(flags);
+ oops_exit();
do_exit(SIGKILL);
}

@@ -562,6 +565,7 @@ no_context:
* terminate things with extreme prejudice.
*/

+ oops_enter();
flags = oops_begin();

if (address < PAGE_SIZE)
@@ -578,6 +582,7 @@ no_context:
/* Executive summary in case the body of the oops scrolled away */
printk(KERN_EMERG "CR2: %016lx\n", address);
oops_end(flags);
+ oops_exit();
do_exit(SIGKILL);

/*
_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/