Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig_size is enabled

From: Feng Tang
Date: Mon Oct 14 2024 - 10:21:11 EST


On Mon, Oct 14, 2024 at 03:12:09PM +0200, Vlastimil Babka wrote:
> On 10/14/24 14:52, Feng Tang wrote:
> > On Mon, Oct 14, 2024 at 10:53:32AM +0200, Vlastimil Babka wrote:
> >> On 10/14/24 09:52, Feng Tang wrote:
> >> > On Fri, Oct 04, 2024 at 05:52:10PM +0800, Vlastimil Babka wrote:
> >> > Thanks for the suggestion!
> >> >
> >> > As there were error reports about the NULL slab for big kmalloc objects, how
> >> > about the following code for
> >> >
> >> > __do_krealloc(const void *p, size_t new_size, gfp_t flags)
> >> > {
> >> >         void *ret;
> >> >         size_t ks = 0;
> >> >         int orig_size = 0;
> >> >         struct kmem_cache *s = NULL;
> >> >
> >> >         /* Check for double-free. */
> >> >         if (likely(!ZERO_OR_NULL_PTR(p))) {
> >> >                 if (!kasan_check_byte(p))
> >> >                         return NULL;
> >> >
> >> >                 ks = ksize(p);
> >>
> >> I think this will result in __ksize() doing
> >> skip_orig_size_check(folio_slab(folio)->slab_cache, object);
> >> and we don't want that?
> >
> > I think that's fine, as the later code will re-set the orig_size anyway.
>
> But you also read it first.
>
> >> >                 /* Some objects have no orig_size, like big kmalloc case */
> >> >                 if (is_kfence_address(p)) {
> >> >                         orig_size = kfence_ksize(p);
> >> >                 } else if (virt_to_slab(p)) {
> >> >                         s = virt_to_cache(p);
> >> >                         orig_size = get_orig_size(s, (void *)p);
>
> here.

Aha, you are right!
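
For the record, the reason it matters: as I read the current mm/slub.c,
skip_orig_size_check() is roughly the following (a simplified sketch), i.e.
it overwrites the stored orig_size with the full object_size, so the
get_orig_size() call above would no longer see the original request size:

/* simplified sketch of the current mm/slub.c helper */
void skip_orig_size_check(struct kmem_cache *s, const void *object)
{
        set_orig_size(s, (void *)object, s->object_size);
}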

>
> >> >                 }
>
> >> Also the checks below repeat some of the checks of ksize().
> >
> > Yes, there is some redundancy, mostly the virt_to_slab()
> >
> >> So I think in __do_krealloc() we should do things manually to determine ks
> >> and not call ksize(). Just not break any of the cases ksize() handles
> >> (kfence, large kmalloc).
> >
> > OK, originally I tried not to expose the internals of __ksize(). Let me
> > try it this way.
>
> ksize() assumes that a user outside of slab itself is calling it.
>
> But we (well mostly Kees) also introduced kmalloc_size_roundup() to avoid
> querying ksize() for the purposes of writing beyond the original
> kmalloc(size) up to the bucket size. So maybe we can also investigate if the
> skip_orig_size_check() mechanism can be removed now?

I did a quick grep, and fortunately it seems there are far fewer ksize()
users than before. We used to see some trouble in the network code, which
is now very clean and no longer needs to skip the orig_size check. Will
check the other call sites later.
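
For reference, the converted call sites follow roughly this pattern (a
generic sketch with a hypothetical example_alloc(), not any particular
caller): the bucket size is asked for up front, instead of probing with
ksize() after the fact.

/* a hypothetical caller, just to illustrate the pattern */
static int example_alloc(size_t size, void **out)
{
        void *buf;

        /* round the request up to the bucket size before allocating */
        size = kmalloc_size_roundup(size);
        buf = kmalloc(size, GFP_KERNEL);
        if (!buf)
                return -ENOMEM;

        /* all 'size' bytes may now be used without a later ksize() probe */
        *out = buf;
        return 0;
}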

> Still I think __do_krealloc() should rather do its own thing and not call
> ksize().

Yes. I made some changes:

static __always_inline __realloc_size(2) void *
__do_krealloc(const void *p, size_t new_size, gfp_t flags)
{
        void *ret;
        size_t ks = 0;
        int orig_size = 0;
        struct kmem_cache *s = NULL;

        if (unlikely(ZERO_OR_NULL_PTR(p)))
                goto alloc_new;

        /* Check for double-free. */
        if (!kasan_check_byte(p))
                return NULL;

        if (is_kfence_address(p)) {
                ks = orig_size = kfence_ksize(p);
        } else {
                struct folio *folio;

                folio = virt_to_folio(p);
                if (unlikely(!folio_test_slab(folio))) {
                        /* Big kmalloc object */
                        WARN_ON(folio_size(folio) <= KMALLOC_MAX_CACHE_SIZE);
                        WARN_ON(p != folio_address(folio));
                        ks = folio_size(folio);
                } else {
                        s = folio_slab(folio)->slab_cache;
                        orig_size = get_orig_size(s, (void *)p);
                        ks = s->object_size;
                }
        }

        /* If the old object doesn't fit, allocate a bigger one */
        if (new_size > ks)
                goto alloc_new;

        /* Zero out spare memory. */
        if (want_init_on_alloc(flags)) {
                kasan_disable_current();
                if (orig_size && orig_size < new_size)
                        memset((void *)p + orig_size, 0, new_size - orig_size);
                else
                        memset((void *)p + new_size, 0, ks - new_size);
                kasan_enable_current();
        }

        /* Setup kmalloc redzone when needed */
        if (s && slub_debug_orig_size(s)) {
                set_orig_size(s, (void *)p, new_size);
                if (s->flags & SLAB_RED_ZONE && new_size < ks)
                        memset_no_sanitize_memory((void *)p + new_size,
                                                  SLUB_RED_ACTIVE, ks - new_size);
        }

        p = kasan_krealloc((void *)p, new_size, flags);
        return (void *)p;

alloc_new:
        ret = kmalloc_node_track_caller_noprof(new_size, flags, NUMA_NO_NODE, _RET_IP_);
        if (ret && p) {
                /* Disable KASAN checks as the object's redzone is accessed. */
                kasan_disable_current();
                memcpy(ret, kasan_reset_tag(p), orig_size ?: ks);
                kasan_enable_current();
        }

        return ret;
}
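
For context, the krealloc_noprof() entry point that ends up calling this is
not touched by the change; from my reading of the current tree it is roughly:

void *krealloc_noprof(const void *p, size_t new_size, gfp_t flags)
{
        void *ret;

        /* krealloc(p, 0, ...) behaves like kfree() plus ZERO_SIZE_PTR */
        if (unlikely(!new_size)) {
                kfree(p);
                return ZERO_SIZE_PTR;
        }

        ret = __do_krealloc(p, new_size, flags);
        /* free the old object only if a new one was actually allocated */
        if (ret && kasan_reset_tag(p) != kasan_reset_tag(ret))
                kfree(p);

        return ret;
}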

Thanks,
Feng