Re: hanging aio process

From: Sebastian Ott
Date: Wed May 21 2014 - 10:12:33 EST


Hi,

On Wed, 21 May 2014, Benjamin LaHaise wrote:
> On Wed, May 21, 2014 at 10:48:55AM +0200, Sebastian Ott wrote:
> > >
> > > I already did that and it didn't change, always 1 + (1UL << 31) in all
> > > cases, before and after percpu_ref_kill(&ctx->reqs). I'm not really
> > > familiar with this percpu_ref stuff but it looks like the initial
> > > reference is dropped asynchronous.
> >
> >
> > cat /sys/kernel/debug/tracing/trace
>
> Your trace still isn't monitoring aio_complete(). You need to check if
> aio_complete() is getting called in order to determine if the bug is in
> the core aio code or not.

Yes, sorry about that, there were just too many of them. But I was able to
reproduce the problem with fio writing a little less data. Sadly it's
still a lot of tracing data, so a compressed archive is attached.

The number of aio_complete invocations is the same for the good and the
bad case:

for T in trace.bad trace.good ;do wc $T ;done
49156 294920 3735651 trace.bad
49159 294939 3735901 trace.good

for T in trace.bad trace.good ;do grep aio_complete $T | wc ;done
49120 294720 3733120
49120 294720 3733120
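
The comparison above could also be scripted roughly as follows. This is
only a sketch: count_calls and compare_calls are hypothetical helper
names, and it assumes each matching line in the trace corresponds to one
invocation (a plain substring match would also count any longer symbol
that contains the name).

```shell
# Count the lines in a trace file that mention a given function name.
# count_calls is a hypothetical helper, not part of any existing tool.
count_calls() {
    # -c prints the number of matching lines, -F matches the name literally.
    # grep exits non-zero when the count is 0, so mask that exit status.
    grep -cF -- "$1" "$2" || true
}

# Report whether two trace files contain the same number of matches.
compare_calls() {
    bad=$(count_calls "$1" "$2")
    good=$(count_calls "$1" "$3")
    if [ "$bad" -eq "$good" ]; then
        echo "same ($bad)"
    else
        echo "differ (bad=$bad good=$good)"
    fi
}
```

It would be called as: compare_calls aio_complete trace.bad trace.good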

Regards,
Sebastian

Attachment: traces.tar.xz
Description: application/xz