Re: multi-second application stall in open()

From: Jens Axboe
Date: Wed Mar 07 2012 - 13:56:34 EST


On 03/07/2012 05:28 PM, Vivek Goyal wrote:
> On Wed, Mar 07, 2012 at 07:43:42AM -0600, Josh Hunt wrote:
>> On Tue, Mar 6, 2012 at 3:56 PM, Josh Hunt <joshhunt00@xxxxxxxxx> wrote:
>>> I am trying to diagnose a stall in an application when making a call
>>> to open() with O_TRUNC. The issue was originally seen on 2.6.38 and
>>> I've since reproduced on 2.6.35. I believe the last kernel that I did
>>> not see the stall was 2.6.32. The system is under low to moderate cpu
>>> and memory load when the issue occurs. The filesystem is ext2, but I
>>> mounted it as ext4 and still saw the issue.
>>>
>>> To me it appears the O_TRUNC needs to truncate pages in the page
>>> cache, but maybe those are already out for writeback? This causes the
>>> open() to wait until that operation completes at which time it is then
>>> able to proceed. If that's true then it looks like the writeback is
>>> sitting on the queue forever waiting (tens of seconds) to be
>>> dispatched to the device.
>>>
>>> Here's some more info/data I collected:
>>> (Unfortunately the data here and attached are from separate runs so
>>> the #s will not match up file to file.)
>>>
>>> Snippet of an strace when it is hanging:
>>> 1329323286.589182 close(37) = 0 <0.000010>
>>> 1329323286.589212 clock_gettime(0x5 /* CLOCK_??? */, {1329323286,
>>> 587579583}) = 0 <0.000007>
>>> 1329323286.589241 open("/usr/local/etc/abcd/known/test808",
>>> O_WRONLY|O_CREAT|O_TRUNC, 0644) = 37 <76.841586>
>>> 1329323363.441841 fchmod(37, 0644) = 0 <0.000012>
>>> 1329323363.441882 fcntl64(37, F_GETFL) = 0x1 (flags O_WRONLY) <0.000011>
>>> 1329323363.441915 fcntl64(37, F_SETFL, O_WRONLY|O_NONBLOCK) = 0 <0.000010>
>>>
>>> To debug this more I enabled the hang check timer to dump stack when
>>> anything hung for longer than 10s. There I see the the hanging
>>> application appearing to be stuck in sync_page(). See attached dmesg
>>> for traces.
>>>
>>> I've also captured "iostat -xm 1" while the application was running
>>> and have attached a snippet of that when the problem is occurring. I
>>> see 100% utilization for many seconds and then eventually the
>>> operations complete, but the await time is extremely high. To me there
>>> doesn't look like much IO going on which could be starving out any
>>> requests.
>>>
>>> I have not been able to perform a full bisection to better understand
>>> what may be causing this, but do see this on 2.6.38, a little less
>>> frequently but still occurs on 2.6.35, and no longer see the issue on
>>> 2.6.32. If I switch slice_idle to 0 or change to the deadline
>>> scheduler the issue appears to go away in both 2.6.35 and 38.
>
> Is it happening on latest kernel (3.3-rc6) ?
>
> Can you change the IO scheduler to deadline on sda and see if the issue
> still happens. (echo deadline > /sys/block/sda/queue/scheduler).
>
> blktrace of cfq look odd. I see that some IO (async writes) are being
> submitted but CFQ did not dispatch it for a long time. Even some unplugs
> came in still nothing happened. Also no completions are happening during
> that window. Not sure why CFQ refuses to dispatch queued writes.
>
> Request added by flusher.
>
> 8,0 1 36926 5028.546000122 2846 A W 20147012 + 8 <- (8,3)
> 3375152
> 8,0 1 36927 5028.546001798 2846 Q W 20147012 + 8 [flush-8:0]
> 8,0 1 36928 5028.546009900 2846 G W 20147012 + 8 [flush-8:0]
> 8,0 1 36929 5028.546014649 2846 I W 20147012 + 8 ( 4749)
> [flush-8:0]
>
> And this request is dispatched after 22 seconds.
>
> 8,0 1 37056 5050.117337221 162 D W 20147012 + 16 (21571322572) [sync_supers]
>
>
> And it completes fairly fast.
>
> 8,0 0 36522 5050.117686149 9657 C W 20147012 + 16 ( 348928)
> [0]
>
> So not sure why CFQ will hold that request for so long when other IO is
> not happening.
>
> Please try latest kernels and see if deadline has the same issue. If not,
> then we know somehow CFQ is related. If it still happens on latest
> kernels, can you try capturing blktrace again when you are experiencing
> the delays.

I'm seeing something very similar here. While testing the gtk fio
client, I ran a job that issued a lot of random reads to my primary
disk. 64 ios in flight, direct, libaio, 512b random reads. Firefox
essentially froze, windows starting freezing up around me.

I'll try and reproduce, but a quick guess would be that things starting
piling up in fsync() or stalling on writes in general, since we are
heavily starving those.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/