Re: NILFS2 get stuck after bio_alloc() fail

From: Ryusuke Konishi
Date: Sun Jun 14 2009 - 14:03:06 EST


Hi Leandro,
On Sun, 14 Jun 2009 12:32:56 -0300, Leandro Lucarella wrote:
> Ryusuke Konishi, on June 14 at 12:45 you wrote:
>> Hi,
>> On Sat, 13 Jun 2009 22:32:11 -0300, Leandro Lucarella wrote:
>> > Hi!
>> >
>> > While testing nilfs2 (using 2.6.30) doing some "cp"s and "rm"s, I noticed
>> > that they sometimes got stuck in D state, and the kernel printed the
>> > following message:
>> >
>> > NILFS: IO error writing segment
>> >
>> > A friend gave me a hand and after adding some printk()s we found out that
>> > the problem seems to occur when bio_alloc() inside nilfs_alloc_seg_bio()
>> > fails, making it return NULL; but we don't know how that causes the
>> > processes to get stuck.
>>
>> Thank you for reporting this issue.
>>
>> Could you get a stack dump of the stuck nilfs task?
>> You can obtain it as follows if the magic SysRq feature is enabled:
>>
>> # echo t > /proc/sysrq-trigger
>>
>> I will dig into how the process got stuck.
>
> Here is what I think is the important stuff:
<snip>

> 'rm' is the "original" stuck process, 'umount' got stuck after that, when I
> tried to umount the nilfs (it was mounted in a loop device).
>
> Here is the complete trace:
> http://pastebin.lugmen.org.ar/4931

Thank you for your help.

According to your log, it seems that the writeback flag on some pages
is never cleared. I will review the error path of the log writer to
narrow down the cause.
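
To illustrate the kind of leak I mean, here is a minimal userspace
sketch. It is not the nilfs2 code: struct fake_page, write_segment()
and count_stuck() are made-up names, and the "writeback" field only
models the real page flag. The assumption is that pages are marked as
under writeback before the bios are built, so if an allocation fails
partway through and the error path does not clear the flag on the
remaining pages, anything that later waits on those pages would sleep
forever.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct fake_page {
	bool writeback;	/* models the page writeback flag */
	bool error;	/* models an I/O error mark on the page */
};

/*
 * Mark every page as under writeback, then "submit" them one by one.
 * fail_at is the index at which the simulated bio allocation fails
 * (pass npages or more for no failure).
 * Returns 0 on success, -1 on failure.
 */
static int write_segment(struct fake_page *pages, size_t npages,
			 size_t fail_at, bool fixed_error_path)
{
	size_t i;

	for (i = 0; i < npages; i++)
		pages[i].writeback = true;

	for (i = 0; i < npages; i++) {
		if (i == fail_at)
			goto failed;		/* simulated allocation failure */
		pages[i].writeback = false;	/* simulated I/O completion */
	}
	return 0;

failed:
	if (fixed_error_path) {
		/* Clear the flag on every page that was never submitted. */
		for (; i < npages; i++) {
			pages[i].writeback = false;
			pages[i].error = true;
		}
	}
	/* Without the cleanup above, pages i..npages-1 stay flagged. */
	return -1;
}

/* Count pages still flagged; a waiter on any of them would block. */
static size_t count_stuck(const struct fake_page *pages, size_t npages)
{
	size_t i, n = 0;

	for (i = 0; i < npages; i++)
		if (pages[i].writeback)
			n++;
	return n;
}
```

With four pages and a failure at index 2, the unfixed path leaves two
pages flagged, while the fixed path leaves none. Whether the real error
path behaves like the unfixed variant is what I still need to confirm.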

Regards,
Ryusuke Konishi