Re: PROBLEM: Silent data corruption when using sendfile()
From: Eric Dumazet
Date: Sat Jul 14 2012 - 07:06:08 EST
On Sat, 2012-07-14 at 12:44 +0200, Willy Tarreau wrote:
> On Sat, Jul 14, 2012 at 12:33:24PM +0200, Eric Dumazet wrote:
> > On Sat, 2012-07-14 at 12:13 +0200, Johannes Truschnigg wrote:
> > > On Sat, Jul 14, 2012 at 10:31:36AM +0200, Willy Tarreau wrote:
> > > > > Please Johannes could you try latest kernel tree ?
> > > >
> > > > It would be useful, especially given the amount of changes you performed
> > > > in this area in latest version, it could be very possible that this new
> > > > bug got fixed as a side effect !
> > >
> > > I upgraded to 3.4.4 (identical config as the 3.4.0 build I've been running)
> > > and what can I say - the problem really seems to have disappeared. I performed
> > > about 3700 iterations of my previous tests over the night, and the data always
> > > turned out to be OK, not a single byte turned out kaput!
> > >
> > > I wish I had tested that earlier, and spared you the noise... well,
> > > maybe someone who runs into a similar problem in the future will have this
> > > discovery save her/him some time and headaches and make her/him just upgrade
> > > kernels :)
> > >
> > > Thanks a lot for your polite and quick responses!
> > >
> >
> > Nice to hear. Now we should make sure we have all needed fixes for prior
> > stable kernels as well !
> >
> > Still trying to understand the issue, since I thought I only did
> > optimizations, not bug fixes. So maybe the real bug is still there but its
> > probability of occurrence is now low enough not to hit your workload.
>
> Please note that Johannes tested 3.4.4 while your changes are in 3.5-rc.
>
> I'm wondering whether this patch, merged into 3.4.2, has an impact on
> sendfile :
>
> commit b642cb6a143da812f188307c2661c0357776a9d0
> Author: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxx>
> Date: Tue Jun 5 21:36:33 2012 +0400
>
> radix-tree: fix contiguous iterator
>
> commit fffaee365fded09f9ebf2db19066065fa54323c3 upstream.
>
> This patch fixes bug in macro radix_tree_for_each_contig().
>
> If radix_tree_next_slot() sees NULL in next slot it returns NULL, but following
> radix_tree_next_chunk() switches iterating into next chunk. As result iterating
> becomes non-contiguous and breaks vfs "splice" and all its users.
>
> Willy
>
Hmmm, this is supposed to fix a bug introduced in 3.4, no ?
So 3.3 kernel should work well ?
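
For anyone trying to picture the failure mode described in the quoted
commit message, here is a toy model in Python. It is only a sketch and
not kernel code: the names contig_walk, CHUNK_SIZE and stop_at_hole are
made up for illustration, and the chunk size of 4 is an arbitrary toy
value. It walks a list of slots chunk by chunk, where None stands for an
empty slot, and shows how a walk that moves on to the next chunk after
seeing an empty slot stops being contiguous.

```python
CHUNK_SIZE = 4  # toy value chosen for the example, not the kernel's chunk size


def contig_walk(slots, start, stop_at_hole):
    """Return the indices visited by a 'contiguous' walk beginning at start.

    slots is a list in which None marks an empty slot.
    stop_at_hole=False models the behaviour described in the commit message:
    after an empty slot, the walk switches to the next chunk and carries on.
    stop_at_hole=True models the intended behaviour: the first empty slot
    ends the walk.
    """
    visited = []
    index = start
    while index < len(slots):
        # Entering a chunk: an empty first slot always ends a contiguous walk.
        if slots[index] is None:
            break
        chunk_end = min((index // CHUNK_SIZE + 1) * CHUNK_SIZE, len(slots))
        hit_hole = False
        # Step through the slots of the current chunk.
        while index < chunk_end:
            if slots[index] is None:
                hit_hole = True
                break
            visited.append(index)
            index += 1
        if hit_hole:
            if stop_at_hole:
                break
            # Faulty variant: skip the rest of this chunk and keep going.
            index = chunk_end
    return visited


pages = ["p0", "p1", None, "p3", "p4", "p5", "p6", "p7"]
print(contig_walk(pages, 0, stop_at_hole=False))  # [0, 1, 4, 5, 6, 7]
print(contig_walk(pages, 0, stop_at_hole=True))   # [0, 1]
```

In the faulty variant the caller receives entries 4..7 as if they
directly followed entry 1. If a splice-style consumer assumed the
returned entries were consecutive, it would pair data with the wrong
offsets, which would be consistent with the silent corruption reported
in this thread, although the model says nothing about which kernel
versions are affected.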