Re: Rootfs in eMMC: Kernel panic ...Attempted to kill init!

From: Catalin Marinas
Date: Mon Jun 22 2009 - 12:13:31 EST


On Mon, 2009-06-22 at 16:56 +0100, Russell King - ARM Linux wrote:
> On Mon, Jun 22, 2009 at 04:50:46PM +0100, Catalin Marinas wrote:
> > On Mon, 2009-06-22 at 16:43 +0100, Russell King - ARM Linux wrote:
> > > On Mon, Jun 22, 2009 at 07:43:40PM +0530, Sudeep K N wrote:
> > > > Thanks for the suggestion.
> > > > With the logs it is clear that the crash is in userspace.
> > > > I am getting one of the 2 logs (below) randomly.
> > > > From trial#2,
> > > > pgd = c60bc000
> > > > [00000000] *pgd=061ee031, *pte=00000000, *ppte=00000000
> > > > I understand from this that the page tables are not set up properly.
> > > > I do not understand how to proceed.
> > > >
> > > > Trial#1:
> > > > VFS: Mounted root (ext2 filesystem).
> > > > Freeing init memory: 108K
> > > > linuxrc (1): undefined instruction: pc=40008100
> > > > Code: e08e3003 eb002842 e2801008 e58c217c (e0812103)
> > >
> > > Your processor is misbehaving; none of the above hex codes are undefined
> > > instructions, so you shouldn't be taking an undefined instruction trap.
> >
> > The undefined instruction aborts are possible in this situation since
> > instructions are fetched via the I-cache while the abort handler shows
> > the code via the D-cache.
>
> However, you're missing a very important point.

Well, I get this kind of error (with /sbin/init) every time I try ext2
on CompactFlash (with pata_platform). You could try with USB as well on
a RealView/EB+ARM11MPCore board.

> This early on, the I-cache for the non-kernel pages won't contain any
> entries except those placed there by this first binary - it's the very
> first user process which is receiving these exceptions.

The problem is not the I-cache itself; it is simply filled from main
memory.

> Second point is that the page concerned has only recently been mapped
> into that page. I would be very very surprised if speculative
> instruction prefetch managed to dirty the exact right page via the
> kernel mapping to always cause the first process to fail in some way.

This has nothing to do with speculative prefetching. It's just that the
I-cache is being filled with data from main memory but the D-cache
wasn't flushed (on ARM SMP systems, the D-cache is write-allocate, which
makes this more visible).

Could you or Sudeep clarify whether the driver uses DMA or PIO?

In my case (ext2 over pata_platform), there is no flush_dcache_page()
call after the page has been written with data from the CompactFlash
(neither the driver nor the VFS layer does this; we used hardware
tracing to double-check). When the page is mapped into user space,
update_mmu_cache() is called, but the page hasn't been marked as dirty,
so no D-cache flushing occurs. Calling flush_dcache_page() in
mpage_end_io_read() works around this issue.
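
As a rough illustration of the sequence above, here is a toy user-space
model, not kernel code. It assumes a single cache line, a write-back,
write-allocate D-cache and an I-cache that is filled only from main
memory. All toy_* names are made up for this example; toy_flush_dcache()
merely stands in for what flush_dcache_page() achieves on real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_WORDS 8  /* one toy cache line = 8 32-bit words */

/* Hypothetical model: one line of RAM plus one line of D-cache. */
struct toy_system {
    uint32_t ram[LINE_WORDS];     /* main memory contents */
    uint32_t dcache[LINE_WORDS];  /* D-cache copy of the same line */
    bool dcache_dirty;            /* D-cache holds data not yet in RAM */
};

/* PIO read from the device: the CPU stores the data itself, so with a
 * write-allocate D-cache it lands in the cache and RAM stays stale. */
void toy_pio_fill(struct toy_system *s, const uint32_t *src)
{
    memcpy(s->dcache, src, sizeof(s->dcache));
    s->dcache_dirty = true;
}

/* Stand-in for the effect of flush_dcache_page(): write the dirty
 * line back to main memory. */
void toy_flush_dcache(struct toy_system *s)
{
    if (s->dcache_dirty) {
        memcpy(s->ram, s->dcache, sizeof(s->ram));
        s->dcache_dirty = false;
    }
}

/* Instruction fetch: filled from main memory, never from the D-cache. */
uint32_t toy_ifetch(const struct toy_system *s, unsigned idx)
{
    assert(idx < LINE_WORDS);
    return s->ram[idx];
}

/* Data load (what the abort handler's "Code:" dump would use): hits
 * the D-cache when the line is dirty there. */
uint32_t toy_dload(const struct toy_system *s, unsigned idx)
{
    assert(idx < LINE_WORDS);
    return s->dcache_dirty ? s->dcache[idx] : s->ram[idx];
}
```

In this model, after toy_pio_fill() a call to toy_dload() returns the
new instruction word while toy_ifetch() still returns whatever was in
RAM before, which is how the "Code:" dump can look valid while the CPU
executes something else. After toy_flush_dcache() both views agree.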

--
Catalin
