Re: [PATCH net] net: skbuff: fix missing zerocopy reference in pskb_carve helpers

From: Pavel Begunkov

Date: Tue May 26 2026 - 10:51:13 EST


On 5/25/26 16:31, Willem de Bruijn wrote:
lazyming wrote:
pskb_carve_inside_header() and pskb_carve_inside_nonlinear() both copy
the old skb_shared_info header into a new buffer via memcpy(), which
includes the destructor_arg pointer (uarg) for MSG_ZEROCOPY skbs.

These functions are not supposed to maintain zerocopy frags.

Both call skb_orphan_frags.

I think what may need to happen is to invert the order of that call
and the memcpy. Current code:

	memcpy((struct skb_shared_info *)(data + size),
	       skb_shinfo(skb), offsetof(struct skb_shared_info, frags[0]));
	if (skb_orphan_frags(skb, gfp_mask)) {
		skb_kfree_head(data);
		return -ENOMEM;
	}

Never mind. This actually corresponds to the first Sashiko report you
mentioned: if zerocopy skbs are converted, then the memcpy prior to
that call will have stale state.
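
The ordering concern above can be sketched in a small userspace model.
This is only an illustration under stated assumptions: struct
model_shinfo, MODEL_ZEROCOPY and model_orphan_frags() are hypothetical
stand-ins, not the kernel's skb_shared_info, SKBFL_* flags or
skb_orphan_frags(). The model only assumes that the conversion clears
zerocopy state in the source after the header has been copied.

```c
#include <string.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
#define MODEL_ZEROCOPY 0x1u

struct model_shinfo {
	unsigned int flags;
	void *destructor_arg;	/* models the uarg pointer */
};

/* Models a conversion that drops the zerocopy flag from the source. */
static void model_orphan_frags(struct model_shinfo *sh)
{
	sh->flags &= ~MODEL_ZEROCOPY;
}

/* Copy first, convert second: dst keeps the pre-conversion flags. */
static void copy_then_orphan(struct model_shinfo *dst,
			     struct model_shinfo *src)
{
	memcpy(dst, src, sizeof(*dst));
	model_orphan_frags(src);
}

/* Convert first, copy second: dst sees the converted flags. */
static void orphan_then_copy(struct model_shinfo *dst,
			     struct model_shinfo *src)
{
	model_orphan_frags(src);
	memcpy(dst, src, sizeof(*dst));
}
```

In this model, copy_then_orphan() leaves MODEL_ZEROCOPY set in dst even
though src has been converted, while orphan_then_copy() does not.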

For skbs where skb_orphan_frags does not do a deep copy, we do need to
take this extra reference.

Reviewed-by: Willem de Bruijn <willemb@xxxxxxxxxx>

Not sure the potential preexisting issue is reachable.

Vhost-net and other zerocopy that predates MSG_ZEROCOPY does not
refcount ubuf_info. Instead it calls skb_copy_ubufs on skb_clone.

So if such an skb reaches pskb_expand_head, it should be guaranteed to
not be a clone. Same for the carve methods added later.

But, the commit that added zerocopy, commit a6686f2f382b
("skbuff: skb supports zero-copy buffers"), included this
pskb_expand_head call to skb_copy_ubufs from the start. That implies
it was expected to be reachable. I just don't see how yet.

If it is reachable, then all that is needed is to clear shinfo->flags.
Or more neatly,

skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;

Also, I'm not the expert on more recent managed frags
(SKBFL_MANAGED_FRAG_REFS).

For that one, pages are guaranteed to be alive as long as the
ubuf_info is not destroyed, hence we don't hold per shinfo
refs. IOW, the lifetime of the pages is bound to the ubuf_info.

That calls skb_zcopy_downgrade_managed in pskb_expand_head, but not in
the two other functions with memcpy before skb_copy_ubufs:
pskb_carve_inside_header and pskb_carve_inside_nonlinear.

I assume because those shorten the skb, so no risk of getting mixed
mode refcounted and non-refcounted frags?

From a quick glance, if reachable, they should "downgrade", otherwise
they leak pages. The new data inherits SKBFL_MANAGED_FRAG_REFS and
ubuf_info but takes additional references with skb_frag_ref(). I'll
take a closer look.
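
A toy userspace model of the suspected leak, as a sketch only. All
names here (model_page, model_shinfo, MODEL_MANAGED_FRAG_REFS,
model_carve(), model_release()) are hypothetical and the flag value is
arbitrary; the ordering and details differ from the kernel. The model
assumes only what is described above: a release path that drops no
page references while the managed flag is set, and a carve-like copy
that inherits the flag yet takes a reference per frag.

```c
#include <stddef.h>

/* Hypothetical flag value and limits; not the kernel definitions. */
#define MODEL_MANAGED_FRAG_REFS 0x10u
#define MODEL_MAX_FRAGS 4

struct model_page {
	int refcount;
};

struct model_shinfo {
	unsigned int flags;
	size_t nr_frags;
	struct model_page *frags[MODEL_MAX_FRAGS];
};

/* Managed frags hold no per-shinfo page refs, so none are dropped. */
static void model_release(struct model_shinfo *sh)
{
	size_t i;

	if (sh->flags & MODEL_MANAGED_FRAG_REFS)
		return;
	for (i = 0; i < sh->nr_frags; i++)
		sh->frags[i]->refcount--;
}

/*
 * Carve-like copy: inherit the flags and take one ref per frag.
 * With downgrade != 0 the managed flag is cleared on the copy, so
 * model_release() will later drop the refs taken here.
 */
static void model_carve(struct model_shinfo *dst,
			const struct model_shinfo *src, int downgrade)
{
	size_t i;

	*dst = *src;
	for (i = 0; i < dst->nr_frags; i++)
		dst->frags[i]->refcount++;
	if (downgrade)
		dst->flags &= ~MODEL_MANAGED_FRAG_REFS;
}
```

Without the downgrade, releasing the copy leaves the extra reference
in place, which is the leak; with it, the reference count returns to
its starting value.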

In general zerocopy can be split in refcounted and non-refcounted.

Refcounted zerocopy will not downgrade in these cases, so will not
modify shinfo->flags after memcpy.

Non-refcounted should always get converted to copy in skb_clone,
so will not enter the skb_cloned() branch here.

If in doubt, this may warrant a WARN_ON_ONCE patch.

--
Pavel Begunkov