How to determine if a page can be spliced into an skbuff, or if it should be copied/rejected?
From: David Howells
Date: Thu Apr 13 2023 - 17:27:51 EST
Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Apr 11, 2023 at 05:08:50PM +0100, David Howells wrote:
> > Add a function to handle MSG_SPLICE_PAGES being passed internally to
> > sendmsg(). Pages are spliced into the given socket buffer if possible and
> > copied in if not (ie. they're slab pages or have a zero refcount).
>
> That "ie." would better be "e.g." - that condition is *not* enough to
> tell the unsafe ones from the rest.
>
> sendpage_ok() would be better off called "might_be_ok_to_sendpage()".
> If it's false, we'd better not grab a reference to the page and expect the
> sucker to stay safe until the reference is dropped. However, AFAICS
> it might return true on a page that is not safe in that respect.
>
> What rules do you propose for sendpage users? "Pass whatever page reference
> you want, it'll do the right thing"? Anything short of that would better
> be documented as explicitly as possible...
Hmmm... Fair point. Is everything passed through splice guaranteed to be
safe, I wonder? Probably not, because of vmsplice(). Does that mean the
existing callers of sendpage_ok() are also making unviable assumptions?
So there are the following 'classes' of memory that I can immediately think
of:
 - Zero page                             Splice (no ref?)
 - Kernel core data                      Splice
 - Module core data (vmalloc'd)          Splice
 - Supervisor stack                      Copy
 - Slab objects                          Copy
 - Page frags                            Splice
 - Other skbuff frags                    Splice
 - Arbitrary pages (eg. sunrpc xdr buf)  Splice (probably)
 - Ordinary pipe buffers                 Splice
 - Spliced tmpfs                         Splice
 - Spliced pagecache (file/block)        Splice
 - Spliced DIO file/block                Splice
 - Vmspliced mmap'd anon                 Splice (with pin?)
 - Vmspliced MAP_SHARED pagecache        Splice (with pin?)
 - Vmspliced MAP_SHARED DAX              Splice?
 - Vmspliced MAP_SHARED MTD              Splice?
 - Vmspliced MAP_SHARED other device     Reject? (e.g. graphics card mem)
 - Vmspliced /dev/{mem,kmem}             Reject?
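
To make the list easier to argue about, here is a rough sketch of it as a lookup table. This is illustration only: the enum, struct and function names are all made up, nothing like this exists in the kernel, and the real problem is that a struct page doesn't tell you which of these classes it belongs to. The "settled" flag is false for every entry that carries a '?' or a "probably" in the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; nothing here is an existing kernel interface. */
enum splice_action { ACT_SPLICE, ACT_COPY, ACT_REJECT };

struct mem_class_policy {
	const char *mem_class;      /* label taken from the list above */
	enum splice_action action;  /* proposed handling */
	bool settled;               /* false where the entry has a '?' or "probably" */
};

static const struct mem_class_policy policy_table[] = {
	{ "Zero page",                            ACT_SPLICE, false },
	{ "Kernel core data",                     ACT_SPLICE, true  },
	{ "Module core data (vmalloc'd)",         ACT_SPLICE, true  },
	{ "Supervisor stack",                     ACT_COPY,   true  },
	{ "Slab objects",                         ACT_COPY,   true  },
	{ "Page frags",                           ACT_SPLICE, true  },
	{ "Other skbuff frags",                   ACT_SPLICE, true  },
	{ "Arbitrary pages (eg. sunrpc xdr buf)", ACT_SPLICE, false },
	{ "Ordinary pipe buffers",                ACT_SPLICE, true  },
	{ "Spliced tmpfs",                        ACT_SPLICE, true  },
	{ "Spliced pagecache (file/block)",       ACT_SPLICE, true  },
	{ "Spliced DIO file/block",               ACT_SPLICE, true  },
	{ "Vmspliced mmap'd anon",                ACT_SPLICE, false },
	{ "Vmspliced MAP_SHARED pagecache",       ACT_SPLICE, false },
	{ "Vmspliced MAP_SHARED DAX",             ACT_SPLICE, false },
	{ "Vmspliced MAP_SHARED MTD",             ACT_SPLICE, false },
	{ "Vmspliced MAP_SHARED other device",    ACT_REJECT, false },
	{ "Vmspliced /dev/{mem,kmem}",            ACT_REJECT, false },
};

/* Linear search by label; returns NULL for a class the table does not list. */
static const struct mem_class_policy *lookup_policy(const char *mem_class)
{
	size_t n = sizeof(policy_table) / sizeof(policy_table[0]);

	assert(mem_class != NULL);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(policy_table[i].mem_class, mem_class) == 0)
			return &policy_table[i];
	}
	return NULL;
}
```

Only two of the classes are down as Copy, and eight of the eighteen are still unsettled.
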
The question is how to tell that we're looking at something that must be
copied or rejected. sendpage_ok(), for example, checks only the PG_slab bit
and the page refcount (page_count()).
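
For reference, a userspace model of that check. struct mock_page and mock_sendpage_ok() are stand-ins invented for this sketch, not kernel types; the predicate just mirrors "not a slab page and refcount of at least one".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the two bits of struct page state the check looks at. */
struct mock_page {
	bool slab;      /* models the PG_slab flag */
	int refcount;   /* models what page_count() would return */
};

/* True if the page is not a slab page and holds at least one reference. */
static bool mock_sendpage_ok(const struct mock_page *page)
{
	assert(page != NULL);
	return !page->slab && page->refcount >= 1;
}
```

Note that a mock page with slab == false and refcount == 1 passes whichever of the above classes it is meant to represent, so the check can only rule pages out; it cannot say that splicing is safe.
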
David