Re: [rfc][patch] remove racy sync_page?

From: Jens Axboe
Date: Wed May 31 2006 - 13:47:23 EST


On Wed, May 31 2006, Nick Piggin wrote:
> Hi Jens,
> Sorry, I don't think I gave you any reply to this...

No worries :)

> Jens Axboe wrote:
> >On Mon, May 29 2006, Andrew Morton wrote:
> >
> >>
> >>Mysterious question, that. A few years ago I think Jens tried pulling
> >>unplugging out, but some devices still want it (magneto-optical
> >>storage iirc). And I think we did try removing it, and it caused
> >>hurt.
> >
> >
> >I did, back when we had problems due to the blk_plug_lock being a global
> >one. I first wanted to investigate if plugging still made a difference,
> >otherwise we could've just ripped it out back than and the problem would
> >be solved. But it did get us about a 10% boost on normal SCSI drives
> >(don't think I tested MO drives at all), so it was fixed up.
>
> Interesting. I'd like to know where from. I wonder if my idea of a
> process context plug/unplug would solve it...

As far as I recall, just doing a simple diff between source directories
on the same drive was a pretty good example of where the plugging gained
you some. A little on the benchmarks as well, iirc. I would have _loved_
to rip plugging out at that point, so while I may not remember the
details yet, don't think I did that work if I could have avoided it :-)

> >>>With splice, the mapping can change, so you can have the wrong
> >>>sync_page callback run against the page.
> >>
> >>Oh.
> >
> >
> >Maybe I'm being dense, but I don't see a problem there. You _should_
> >call the new mapping sync page if it has been migrated.
>
> But can some other thread calling lock_page first find the old mapping,
> and then run its ->sync_page which finds the new mapping? While it may
> not matter for anyone in-tree, it does break the API so it would be
> better to either fix it or rip it out than be silently buggy.

It looks ok to me, you can inspect fs/splice.c:pipe_to_file() and see if
you can spot anything.

> >>>The ->pin() calls in pipe_to_file and pipe_to_sendpage?
> >>
> >>One for Jens...
> >
> >
> >splice never incs/decs any inode related reference counts, so if it
> >needs to then yes it's broken. Any references to kernel code that deals
> >with that?
>
> Most code in the VM that has an inode/mapping gets called from the VFS,
> which already does its thing somehow (I guess something like the file
> pins the dentry which pins the inode). An iget might solve it. Or you
> could use the lock_page_nosync() if/when the patch goes in (although I
> don't want that to spread too far just yet).

I'll be sure to look over that, thanks!

--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/