Re: [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers
From: Pavel Begunkov
Date: Tue May 26 2026 - 10:51:13 EST
On 5/25/26 16:31, Willem de Bruijn wrote:
Willem de Bruijn wrote:
Willem de Bruijn wrote:
Willem de Bruijn wrote:
lazyming wrote:
pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
the old skb_shared_info header into a new buffer via memcpy(), which
includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.
These functions are not supposed to maintain zerocopy frags.
Both call skb_orphan_frags.
I think what may need to happen is to invert the order of that call
and the memcpy. Current code:
	memcpy((struct skb_shared_info *)(data + size),
	       skb_shinfo(skb), offsetof(struct skb_shared_info, frags[0]));
	if (skb_orphan_frags(skb, gfp_mask)) {
		skb_kfree_head(data);
		return -ENOMEM;
	}
Never mind. This actually corresponds to the first Sashiko report you
mentioned: if zerocopy skbs are converted, then the memcpy prior to
that call will have stale state.
For skbs where skb_orphan_frags does not do a deep copy, we do need to
take this extra reference.
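
To illustrate the point about the extra reference, here is a minimal userspace sketch. It is only a model: struct model_uarg, struct model_shinfo and the model_* helpers are hypothetical stand-ins, not the kernel's ubuf_info or skb_shared_info API. It shows that a header memcpy duplicates the destructor_arg pointer without bumping its refcount, so releasing both copies underflows unless a reference is taken for the new one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a refcounted ubuf_info. */
struct model_uarg {
	int refcnt;
};

/* Hypothetical stand-in for skb_shared_info: header fields, then frags. */
struct model_shinfo {
	unsigned char flags;
	unsigned char nr_frags;
	struct model_uarg *destructor_arg;
	void *frags[4];		/* not covered by the header copy */
};

/* Copy only the header, like the memcpy() up to frags[0] in the carve helpers. */
void model_copy_header(struct model_shinfo *dst, const struct model_shinfo *src)
{
	memcpy(dst, src, offsetof(struct model_shinfo, frags));
}

/* Take the reference that the copied pointer needs. */
void model_get_uarg(struct model_shinfo *shinfo)
{
	if (shinfo->destructor_arg)
		shinfo->destructor_arg->refcnt++;
}

/* Drop one reference and return the remaining count. */
int model_put_uarg(struct model_shinfo *shinfo)
{
	struct model_uarg *uarg = shinfo->destructor_arg;

	shinfo->destructor_arg = NULL;
	return uarg ? --uarg->refcnt : 0;
}

/* Returns the final refcount: 0 is balanced, negative means an underflow. */
int model_carve_refcount_demo(int take_extra_ref)
{
	struct model_uarg uarg = { .refcnt = 1 };
	struct model_shinfo old_info = { .flags = 1, .destructor_arg = &uarg };
	struct model_shinfo new_info;

	memset(&new_info, 0, sizeof(new_info));
	model_copy_header(&new_info, &old_info);
	if (take_extra_ref)
		model_get_uarg(&new_info);

	model_put_uarg(&old_info);		/* old shared info is released */
	return model_put_uarg(&new_info);	/* new shared info is released later */
}
```

In this model, model_carve_refcount_demo(1) ends balanced, while model_carve_refcount_demo(0) drops the single reference twice.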
Reviewed-by: Willem de Bruijn <willemb@xxxxxxxxxx>
Not sure the potential preexisting issue is reachable.
Vhost-net and other zerocopy that predates MSG_ZEROCOPY does not
refcount ubuf_info. Instead it calls skb_copy_ubufs on skb_clone.
So if such an skb reaches pskb_expand_head, it should be guaranteed to
not be a clone. Same for the carve methods added later.
But, the commit that added zerocopy, commit a6686f2f382b
("skbuff: skb supports zero-copy buffers"), included this
pskb_expand_head call to skb_copy_ubufs from the start. That implies
that was expected to be reachable. I just don't see how yet.
If it is reachable, then all that is needed is to clear shinfo->flags.
Or more neatly,
skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
Also, I'm not the expert on the more recent managed frags
(SKBFL_MANAGED_FRAG_REFS).
For that one, pages are guaranteed to be alive as long as the
ubuf_info is not destroyed, hence we don't hold per shinfo
refs. IOW, the lifetime of the pages is bound to the ubuf_info.
That calls skb_zcopy_downgrade_managed in pskb_expand_head, but not in
the two other functions with memcpy before skb_copy_ubufs:
pskb_carve_inside_header and pskb_carve_inside_nonlinear.
I assume because those shorten the skb, so no risk of getting mixed
mode refcounted and non-refcounted frags?
From a quick glance, if reachable, they should "downgrade", otherwise
they leak pages. The new data inherits SKBFL_MANAGED_FRAG_REFS and
ubuf_info but takes additional references with skb_frag_ref(). I'll
take a closer look.
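
A rough userspace sketch of the leak described here, under heavy simplification. All model_* names and the flag value are hypothetical, and the release logic only mimics the idea that managed frags hold no per-frag references. It is not the kernel code path, just the refcount arithmetic.

```c
#include <string.h>

#define MODEL_MANAGED_FRAG_REFS	0x10	/* hypothetical flag value */
#define MODEL_MAX_FRAGS		4

struct model_page {
	int refcnt;
};

struct model_shinfo {
	unsigned char flags;
	unsigned char nr_frags;
	struct model_page *frags[MODEL_MAX_FRAGS];
};

/* Switch from ubuf_info-bound page lifetime to per-frag references. */
void model_downgrade_managed(struct model_shinfo *sh)
{
	int i;

	if (!(sh->flags & MODEL_MANAGED_FRAG_REFS))
		return;
	sh->flags &= ~MODEL_MANAGED_FRAG_REFS;
	for (i = 0; i < sh->nr_frags; i++)
		sh->frags[i]->refcnt++;
}

/* Carve-like copy: inherit flags and frags, then take a ref per frag. */
void model_carve(struct model_shinfo *dst, const struct model_shinfo *src)
{
	int i;

	memcpy(dst, src, sizeof(*dst));
	for (i = 0; i < dst->nr_frags; i++)
		dst->frags[i]->refcnt++;
}

/* Release: with the managed flag set, no per-frag refs are dropped. */
void model_release(struct model_shinfo *sh)
{
	int i;

	if (sh->flags & MODEL_MANAGED_FRAG_REFS)
		return;
	for (i = 0; i < sh->nr_frags; i++)
		sh->frags[i]->refcnt--;
}

/* Returns page references left over beyond the ubuf_info's own one. */
int model_leaked_refs(int downgrade_first)
{
	struct model_page page = { .refcnt = 1 };	/* held via ubuf_info */
	struct model_shinfo old_info = {
		.flags = MODEL_MANAGED_FRAG_REFS,
		.nr_frags = 1,
		.frags = { &page },
	};
	struct model_shinfo new_info;

	if (downgrade_first)
		model_downgrade_managed(&old_info);
	model_carve(&new_info, &old_info);
	model_release(&old_info);
	model_release(&new_info);
	return page.refcnt - 1;
}
```

Without the downgrade, the copy inherits the managed flag, takes a page reference anyway, and never drops it. With the downgrade first, every reference taken is also released.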
In general zerocopy can be split in refcounted and non-refcounted.
Refcounted zerocopy will not downgrade in these cases, so will not
modify shinfo->flags after memcpy.
Non-refcounted should always get converted to copy in skb_clone,
so will not enter the skb_cloned() branch here.
If in doubt, this may warrant a WARN_ON_ONCE patch.
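
As a sketch of what such a check could assert, here is a userspace model. Everything in it is an assumption for illustration: struct model_skb and its fields are hypothetical, and model_warn_on_once() only imitates the once-only behaviour with a single shared flag, whereas the kernel macro tracks this per call site.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical summary of the skb state relevant to the check. */
struct model_skb {
	bool cloned;		/* would enter the skb_cloned() branch */
	bool zerocopy;		/* carries a zerocopy ubuf_info */
	bool refcounted_uarg;	/* MSG_ZEROCOPY-style refcounted ubuf_info */
};

/* Counts how many times the warning was actually printed. */
int model_warn_count;

/* Print at most one warning, but always report whether cond was true. */
bool model_warn_on_once(bool cond)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		model_warn_count++;
		fprintf(stderr, "model: unexpected cloned non-refcounted zerocopy skb\n");
	}
	return cond;
}

/*
 * Non-refcounted zerocopy is expected to be converted to a copy on clone,
 * so a cloned, zerocopy, non-refcounted skb should never get here.
 */
bool model_carve_entry_ok(const struct model_skb *skb)
{
	bool unexpected = skb->cloned && skb->zerocopy && !skb->refcounted_uarg;

	return !model_warn_on_once(unexpected);
}
```

The check fires only for the combination that the reasoning above says cannot happen, so it stays silent on refcounted zerocopy and on uncloned skbs.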
--
Pavel Begunkov