Re: [Bugme-new] [Bug 15709] New: swapper page allocation failure

From: kernel
Date: Mon May 03 2010 - 02:11:27 EST


Anything we can do to investigate this further?

Thanks!
Robert


On Wed, 28 Apr 2010 00:56:01 +0200, Robert Wimmer <kernel@xxxxxxxxxxx>
wrote:
> I've applied the patch against the kernel I got from
> "git clone ....", which resulted in a 2.6.34-rc5 kernel.
>
> The stack trace after mounting NFS is here:
> https://bugzilla.kernel.org/attachment.cgi?id=26166
> /var/log/messages after soft lockup:
> https://bugzilla.kernel.org/attachment.cgi?id=26167
>
> I hope there is some useful information in there.
>
> Thanks!
> Robert
>
> On 04/27/10 01:28, Trond Myklebust wrote:
>> On Tue, 2010-04-27 at 00:18 +0200, Robert Wimmer wrote:
>>
>>>> Sure. In addition to what you did above, please do
>>>>
>>>> mount -t debugfs none /sys/kernel/debug
>>>>
>>>> and then cat the contents of the pseudofile at
>>>>
>>>> /sys/kernel/debug/tracing/stack_trace
>>>>
>>>> Please do this more or less immediately after you've finished
>>>> mounting the NFSv4 client.
>>>>
>>>>
>>> I've uploaded the stack trace. It was generated
>>> directly after mounting. Here are the stacks:
>>>
>>> After mounting:
>>> https://bugzilla.kernel.org/attachment.cgi?id=26153
>>> After the soft lockup:
>>> https://bugzilla.kernel.org/attachment.cgi?id=26154
>>> The dmesg output of the soft lockup:
>>> https://bugzilla.kernel.org/attachment.cgi?id=26155
>>>
>>>
>>>> Does your server have the 'crossmnt' or 'nohide' flags set, or does it
>>>> use the 'refer' export option anywhere? If so, then we might have to
>>>> test further, since those may trigger the NFSv4 submount feature.
>>>>
>>>>
>>> The server has the following settings:
>>> rw,nohide,insecure,async,no_subtree_check,no_root_squash
>>>
>>> Thanks!
>>> Robert
>>>
>>>
>>>
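[Editor's sketch] The export-option question above can be checked mechanically. The following is a minimal Python sketch, assuming the options arrive as a single comma-separated string like the one Robert posted; the helper name `submount_options` is hypothetical and is not part of any NFS tooling.

```python
# Options named in the thread as possibly triggering the NFSv4 submount feature.
SUBMOUNT_OPTIONS = ("crossmnt", "nohide", "refer")


def submount_options(option_string):
    """Return the options in option_string that appear in SUBMOUNT_OPTIONS."""
    found = []
    for opt in option_string.split(","):
        # Strip any value part, e.g. "refer=path@host" -> "refer".
        name = opt.strip().split("=", 1)[0]
        if name in SUBMOUNT_OPTIONS:
            found.append(name)
    return found


# The option string reported in the thread.
reported = "rw,nohide,insecure,async,no_subtree_check,no_root_squash"
print(submount_options(reported))  # prints ['nohide']
```

For the settings quoted above, only 'nohide' is flagged, which is the case the question was asking about.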
>> That second trace is more than 5.5K deep, more than half of which is
>> socket overhead :-(((.
>>
>> The process stack does not appear to have overflowed; however, that
>> trace doesn't include any IRQ stack overhead.
>>
>> OK... So what happens if we get rid of half of that trace by forcing
>> asynchronous tasks such as this to run entirely in rpciod instead of
>> first trying to run in the process context?
>>
>> See the attachment...
>>
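[Editor's sketch] The observation that more than half of the trace is socket overhead can be estimated from the stack tracer output. The Python sketch below assumes the usual `stack_trace` table layout (index, Depth, Size, Location columns). The sample text is made up for illustration and is NOT taken from the bugzilla attachments, and the choice of symbol prefixes counted as "socket" frames is also an assumption.

```python
import re

# Matches one table entry, e.g.
#   "  0)     2928     224   update_sd_lb_stats+0xbc/0x4ac"
ENTRY = re.compile(r"^\s*(\d+)\)\s+(\d+)\s+(\d+)\s+(\S+)")


def parse_stack_trace(text):
    """Return a list of (depth, size, symbol) tuples, deepest entry first."""
    entries = []
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m:
            depth, size = int(m.group(2)), int(m.group(3))
            symbol = m.group(4).split("+")[0]  # drop the +offset/length suffix
            entries.append((depth, size, symbol))
    return entries


def share_of_stack(entries, prefixes):
    """Fraction of the summed frame sizes used by symbols with the given prefixes."""
    total = sum(size for _, size, _ in entries)
    matched = sum(size for _, size, sym in entries if sym.startswith(prefixes))
    return matched / total if total else 0.0


# Made-up sample (hypothetical sizes and offsets), not the reporter's trace.
SAMPLE = """\
        Depth    Size   Location    (4 entries)
        -----    ----   --------
  0)     1000     400   tcp_sendmsg+0x10/0x200
  1)      600     200   sock_sendmsg+0x20/0x100
  2)      400     300   xs_tcp_send_request+0x30/0x180
  3)      100     100   rpc_async_schedule+0x12/0x20
"""

entries = parse_stack_trace(SAMPLE)
# Assumption: frames whose symbols start with these prefixes count as socket code.
socket_share = share_of_stack(entries, ("tcp_", "sock_"))
print(f"deepest: {entries[0][0]} bytes, socket share: {socket_share:.0%}")
```

Running the same parser on the contents of /sys/kernel/debug/tracing/stack_trace, with prefixes chosen to fit the frames actually present, would give a rough figure to compare before and after a patch.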