Re: [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
From: Linus Torvalds
Date: Thu Jun 04 2026 - 10:45:45 EST
On Wed, 3 Jun 2026 at 23:32, Willy Tarreau <w@xxxxxx> wrote:
>
> I'm using vmsplice() + tee() + splice() in high-performance applications,
> load generators to be precise, and soon a cache. This is super convenient
> and extremely efficient:
>
> - vmsplice() is used to prepare a "master" pipe with data to be sent
> over TCP or kTLS
> - then for each request, we do tee() from this master pipe to per-request
> pipes.
> - the per-request pipes are those that are used to deliver the data to
> the socket via splice().
So most of those would actually not be affected by any of the existing
patches: the pipe->socket splice would remain, the tee() code would
still just take a ref to the page count.
The vmsplice() would change, but looking at your haterm.c sources, it
looks like it's mostly a fairly small thing ("common_response[]" being
16kB).
That is typically *faster* to just copy than look up pages.
HOWEVER.
It looks like you're actually doing exactly the thing that I thought
was crazy and wouldn't even work reliably: you change the
common_response[] contents dynamically *after* the vmsplice, and
depend on the fact that changing it in user space changes the buffer
in the pipe too.
So that would break *entirely* with the vmsplice() changes if I read
the code right (which I might not do) simply because that looks like
it really does require that "wrutably shared buffer after the fact".
Interesting. Because the vmsplice() code uses get_user_pages_fast(),
and honestly, it never pinned the page reliably to the original source
- it breaks COW randomly in one direction or the other after fork()
(and I thouht even after a page-out, but thinking more about it the
swap cache may have made it work for that case).
Uhhuh. That does look like it makes the vmsplice() changes untenable.
But I may be reading your haproxy code entirely wrong.
Linus