Re: Possible dcache BUG

From: Gene Heskett
Date: Thu Aug 19 2004 - 04:45:39 EST


On Tuesday 17 August 2004 07:57, Nick Piggin wrote:
>Gene Heskett wrote:
>> On Tuesday 17 August 2004 00:58, Nick Piggin wrote:
>>>Gene Heskett wrote:
>>>>Reboot time I guess :(((
>>>
>>>All your low memory has been used by dentry and inode caches. This
>>>isn't very
>>>interesting because this would be no doubt caused by something
>>>oopsing while holding the shrinker semaphore as Andrew pointed
>>> out.
>>>
>>>What is interesting is that first Oops message (I wonder if you
>>>don't have bad hardware though, I don't think anyone else is
>>> seeing it).
>>
>> What 'first Oops message'? One I posted before?
>
>Well, the first Oops that your running kernel raises. Usually you
>don't bother about subsequent oopses and misbehaviour because the
>first one can cause the system to go into a funny state - this is
>a prime example.
>
>> That comment caused me to go back in the log to well above where I
>> had been channel surfing with tvtime, and I did find an Oops:
>>
>> Aug 16 21:15:46 coyote kernel: Unable to handle kernel NULL
>> pointer dereference at virtual address 00000000 Aug 16 21:15:46
>> coyote kernel: printing eip:
>> Aug 16 21:15:46 coyote kernel: c015c8db
>> Aug 16 21:15:46 coyote kernel: *pde = 00000000
>> Aug 16 21:15:46 coyote kernel: Oops: 0002 [#1]
>> Aug 16 21:15:46 coyote kernel: Modules linked in: tuner tvaudio
>> bttv video_buf btcx_risc eeprom snd_seq_oss snd_seq _midi_event
>> snd_seq snd_pcm_oss snd_mixer_oss snd_bt87x snd_intel8x0
>> snd_ac97_codec snd_pcm snd_timer snd_page_allo c snd_mpu401_uart
>> snd_rawmidi snd_seq_device snd forcedeth sg Aug 16 21:15:46 coyote
>> kernel: CPU: 0
>> Aug 16 21:15:46 coyote kernel: EIP: 0060:[<c015c8db>] Not
>> tainted Aug 16 21:15:46 coyote kernel: EFLAGS: 00210206
>> (2.6.8-rc4) Aug 16 21:15:46 coyote kernel: EIP is at
>> prune_icache+0x6b/0x1b0 Aug 16 21:15:46 coyote kernel: eax:
>> 00000000 ebx: dffe0fd0 ecx: d3eb8b80 edx: c0341660 Aug 16
>> 21:15:46 coyote kernel: esi: dffe0fc8 edi: 0000005a ebp:
>> d3eb8b94 esp: d3eb8b74 Aug 16 21:15:46 coyote kernel: ds: 007b
>> es: 007b ss: 0068 Aug 16 21:15:46 coyote kernel: Process yum
>> (pid: 30892, threadinfo=d3eb8000 task=cf6bf7b0) Aug 16 21:15:46
>> coyote kernel: Stack: dffe0448 00000000 00000059 dffe0450 df58d0d0
>> 00000080 00000000 d3eb8000 Aug 16 21:15:46 coyote kernel:
>> d3eb8ba0 c015ca5f 00000080 d3eb8bd4 c0135b14 00000080 000000d2
>> 0108bf00 Aug 16 21:15:46 coyote kernel: 00000000 00021087
>> 00000080 00000000 f7ffea20 0000000a d3eb8c50 00000000 Aug 16
>> 21:15:46 coyote kernel: Call Trace:
>> Aug 16 21:15:46 coyote kernel: [<c01044ef>] show_stack+0x7f/0xa0
>> Aug 16 21:15:46 coyote kernel: [<c0104688>]
>> show_registers+0x158/0x1b0 Aug 16 21:15:46 coyote kernel:
>> [<c01047e6>] die+0x66/0xd0 Aug 16 21:15:46 coyote kernel:
>> [<c01109de>] do_page_fault+0x28e/0x548 Aug 16 21:15:46 coyote
>> kernel: [<c010415d>] error_code+0x2d/0x38 Aug 16 21:15:46 coyote
>> kernel: [<c015ca5f>] shrink_icache_memory+0x3f/0x50 Aug 16
>> 21:15:46 coyote kernel: [<c0135b14>] shrink_slab+0x134/0x170 Aug
>> 16 21:15:46 coyote kernel: [<c0136954>]
>> try_to_free_pages+0xa4/0x160 Aug 16 21:15:46 coyote kernel:
>> [<c012fc23>] __alloc_pages+0x1b3/0x320 Aug 16 21:15:46 coyote
>> kernel: [<c0139a8f>] do_anonymous_page+0x5f/0x180 Aug 16 21:15:46
>> coyote kernel: [<c0139c11>] do_no_page+0x61/0x310 Aug 16 21:15:46
>> coyote kernel: [<c013a097>] handle_mm_fault+0xd7/0x160 Aug 16
>> 21:15:46 coyote kernel: [<c01108a0>] do_page_fault+0x150/0x548
>> Aug 16 21:15:46 coyote kernel: [<c010415d>] error_code+0x2d/0x38
>> Aug 16 21:15:46 coyote kernel: [<c012c279>]
>> do_generic_mapping_read+0x129/0x430 Aug 16 21:15:46 coyote kernel:
>> [<c012c836>] __generic_file_aio_read+0x1b6/0x1f0 Aug 16 21:15:46
>> coyote kernel: [<c012c8c2>] generic_file_aio_read+0x52/0x70 Aug
>> 16 21:15:46 coyote kernel: [<c0145898>] do_sync_read+0x78/0xa0
>> Aug 16 21:15:46 coyote kernel: [<c014598a>] vfs_read+0xca/0x140
>> Aug 16 21:15:46 coyote kernel: [<c0145c2b>] sys_read+0x4b/0x80
>> Aug 16 21:15:46 coyote kernel: [<c0103f61>]
>> sysenter_past_esp+0x52/0x71 Aug 16 21:15:46 coyote kernel: Code:
>> 89 10 a1 60 16 34 c0 89 58 04 89 03 c7 43 04 60 16 34 c0 89
>>
>> yum did a segfault about that time. yum is nice code, when
>> it fscking works, which is maybe half the time on 2 different
>> FC2 machines here now.
>
>Although an Oops is always the kernel's (or bad hardware's) fault.
>So in this case you can let yum off the hook :)
>
>> So we're back to the dentry_cache thing... Duh, NO!, this is in
>> prune_icache, not prune_dcache, presumably slightly different.
>
>Yeah, both are going to cause cache shrinking to stop working.
>
>> As far as bad hardware is concerned, warranty time is running out.
>> I need something plausible to take back to tcwo as a good reason
>> for requesting a 'blanket rma' on the whole thing, would they
>> please send me another.
>
>Not too sure really. At this stage keep trying patches that you get
>sent :P

I just had another but this ones a bit different:

Aug 19 04:22:11 coyote kernel: ------------[ cut here ]------------
Aug 19 04:22:11 coyote kernel: kernel BUG at fs/buffer.c:805!
Aug 19 04:22:11 coyote kernel: invalid operand: 0000 [#1]
Aug 19 04:22:11 coyote kernel: Modules linked in: eeprom snd_seq_oss
snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_bt87x
snd_intel8x0 snd_ac97_codec snd_pcm snd_timer snd_page_alloc
snd_mpu401_uart snd_rawmidi snd_seq_device snd forcedeth sg
Aug 19 04:22:11 coyote kernel: CPU: 0
Aug 19 04:22:11 coyote kernel: EIP: 0060:[<c0147d77>] Not
tainted
Aug 19 04:22:11 coyote kernel: EFLAGS: 00010246 (2.6.8-rc4)
Aug 19 04:22:11 coyote kernel: EIP is at
remove_inode_buffers+0x77/0x90
Aug 19 04:22:11 coyote kernel: eax: 00000000 ebx: d7de519c ecx:
d7deb99c edx: d7deb974
Aug 19 04:22:11 coyote kernel: esi: d7de50c8 edi: 00000001 ebp:
c198bedc esp: c198becc
Aug 19 04:22:11 coyote kernel: ds: 007b es: 007b ss: 0068
Aug 19 04:22:11 coyote kernel: Process kswapd0 (pid: 66,
threadinfo=c198b000 task=c1978050)
Aug 19 04:22:11 coyote kernel: Stack: d7de50c8 d7de50d0 d7de50c8
00000057 c198bf04 c015c985 d7de50c8 00000000
Aug 19 04:22:11 coyote kernel: 00000057 d7de5290 e50ac0d0
00000080 00000000 c198b000 c198bf10 c015ca5f
Aug 19 04:22:11 coyote kernel: 00000080 c198bf44 c0135b14
00000080 000000d0 01779600 00000000 0002d1f3
Aug 19 04:22:11 coyote kernel: Call Trace:
Aug 19 04:22:11 coyote kernel: [<c01044ef>] show_stack+0x7f/0xa0
Aug 19 04:22:11 coyote kernel: [<c0104688>]
show_registers+0x158/0x1b0
Aug 19 04:22:11 coyote kernel: [<c01047e6>] die+0x66/0xd0
Aug 19 04:22:12 coyote kernel: [<c0104bc3>] do_invalid_op+0xb3/0xc0
Aug 19 04:22:12 coyote kernel: [<c010415d>] error_code+0x2d/0x38
Aug 19 04:22:12 coyote kernel: [<c015c985>] prune_icache+0x115/0x1b0
Aug 19 04:22:12 coyote kernel: [<c015ca5f>]
shrink_icache_memory+0x3f/0x50
Aug 19 04:22:12 coyote kernel: [<c0135b14>] shrink_slab+0x134/0x170
Aug 19 04:22:12 coyote kernel: [<c0136bb9>] balance_pgdat+0x1a9/0x1f0
Aug 19 04:22:12 coyote kernel: [<c0136cbf>] kswapd+0xbf/0xd0
Aug 19 04:22:12 coyote kernel: [<c01023f1>]
kernel_thread_helper+0x5/0x14
Aug 19 04:22:12 coyote kernel: Code: 0f 0b 25 03 e5 0b 30 c0 eb c4 31
ff eb de 0f 0b 36 04 e5 0b

The system is still up but its 100 megs into swap so I'm going to
reboot without changing anything. Is this one traceable?

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.24% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message
by Gene Heskett are:
Copyright 2004 by Maurice Eugene Heskett, all rights reserved.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/