Re: kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting)

From: Matthew Wilcox
Date: Tue Jan 16 2018 - 12:43:25 EST


On Tue, Jan 16, 2018 at 10:54:27AM -0600, Christopher Lameter wrote:
> On Tue, 16 Jan 2018, Matthew Wilcox wrote:
>
> > I think that's a good thing! /proc/slabinfo really starts to get grotty
> > above 16 bytes. I'd like to chop off "_cache" from the name of every
> > single slab! If ext4_allocation_context has to become ext4_alloc_ctx,
> > I don't think we're going to lose any valuable information.
>
> Ok so we are going to cut off at 16 characters? Sounds good to me.

Excellent!

> > > struct kmem_cache_attr {
> > > 	char *name;
> > > 	size_t size;
> > > 	size_t align;
> > > 	slab_flags_t flags;
> > > 	unsigned int useroffset;
> > > 	unsigned int usersize;
> > > 	void (*ctor)(void *);
> > > 	kmem_isolate_func *isolate;
> > > 	kmem_migrate_func *migrate;
> > > 	...
> > > }
> >
> > In these slightly-more-security-conscious days, it's considered poor
> > practice to have function pointers in writable memory. That was why
> > I wanted to make the kmem_cache_attr const.
>
> Sure this data is never changed. It can be const.

It's changed at initialisation. Look:

kmem_cache_create(const char *name, size_t size, size_t align,
		  slab_flags_t flags, void (*ctor)(void *))
	s = create_cache(cache_name, size, size,
			 calculate_alignment(flags, align, size),
			 flags, ctor, NULL, NULL);

The 'align' that ends up in s->align is not the user-specified align.
It also depends on runtime information (cache_line_size()), so it
can't be calculated at compile time.
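
To make the runtime dependency concrete, here is a minimal userspace
sketch of this kind of alignment calculation. It is an illustration
under stated assumptions, not the kernel source: SKETCH_HWCACHE_ALIGN,
SKETCH_MINALIGN and sketch_calculate_alignment() are hypothetical names,
and the cache line size is passed in as a parameter standing in for
cache_line_size().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real flag value and minimum are arch-specific. */
#define SKETCH_HWCACHE_ALIGN 0x1u
#define SKETCH_MINALIGN      8u

/*
 * Compute the alignment a cache would actually use.
 * line_size plays the role of cache_line_size(), which is only known
 * once the code is running on a particular machine.
 */
static size_t sketch_calculate_alignment(unsigned int flags, size_t align,
                                         size_t size, size_t line_size)
{
    /* The halving loop assumes a non-empty object and a power-of-two line. */
    assert(size > 0);
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);

    if (flags & SKETCH_HWCACHE_ALIGN) {
        size_t ralign = line_size;

        /* Small objects do not need a whole cache line each. */
        while (size <= ralign / 2)
            ralign /= 2;
        if (ralign > align)
            align = ralign;
    }

    if (align < SKETCH_MINALIGN)
        align = SKETCH_MINALIGN;

    /* Round up to a multiple of the pointer size. */
    return (align + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
}
```

With the same flags, a requested align of 0 and a 100-byte object, this
sketch yields 64 when line_size is 64 and 128 when line_size is 128, so
the stored value cannot be baked into a compile-time constant.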

'flags' also gets mangled:
	flags &= CACHE_CREATE_MASK;
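
A rough userspace sketch of what this implies for a const attribute
struct: the caller's values can stay read-only only if the derived
values are stored somewhere else. This is one possible arrangement, not
something settled in this thread; struct sketch_attr, struct
sketch_cache, SKETCH_CREATE_MASK and sketch_cache_init() are all
hypothetical names, and the mask value is made up.

```c
#include <stddef.h>

/* Hypothetical stand-in for CACHE_CREATE_MASK; the value is invented. */
#define SKETCH_CREATE_MASK 0x0fu

/* What the caller asks for. Never written after it is defined. */
struct sketch_attr {
    const char *name;
    size_t size;
    size_t align;
    unsigned int flags;
};

/* What the allocator actually uses, derived at initialisation time. */
struct sketch_cache {
    const struct sketch_attr *attr; /* requested values, read-only */
    size_t align;                   /* effective alignment */
    unsigned int flags;             /* effective flags after masking */
};

static void sketch_cache_init(struct sketch_cache *s,
                              const struct sketch_attr *attr,
                              size_t effective_align)
{
    s->attr = attr;
    /* Masking can change the value, so it goes into the cache, not the attr. */
    s->flags = attr->flags & SKETCH_CREATE_MASK;
    s->align = effective_align;
}
```

In this sketch a request with flags 0xf3 ends up with effective flags
0x03 in the cache, while the const attr still reads 0xf3.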


> I am not married to either way of specifying the sizes. unsigned int would
> be fine with me. SLUB falls back to the page allocator anyways for
> anything above 2* PAGE_SIZE and I think we can do the same for the other
> allocators as well. Zeroing or initializing such a large memory chunk is
> much more expensive than the allocation so it does not make much sense to
> have that directly supported in the slab allocators.

The only slabs larger than 4kB on my system right now are:
kvm_vcpu               0      0  19136    1    8 : tunables    8    4    0 : slabdata      0      0      0
net_namespace          1      1   6080    1    2 : tunables    8    4    0 : slabdata      1      1      0

(other than the fake slabs for kmalloc)

> Some platforms support 64K page size and I could envision a 2M page size
> at some point. So I think we cannot use 16 bits there.
>
> If no one objects then I can use unsigned int there again.

unsigned int would be my preference.
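
The arithmetic behind ruling out 16 bits can be sketched as follows.
This is a userspace illustration only: the helper names are
hypothetical, the 2 * PAGE_SIZE limit is the SLUB fallback threshold
quoted above, and the unsigned int check assumes a 32-bit unsigned int.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest object size served from a slab, per the 2 * PAGE_SIZE rule above. */
static size_t sketch_max_slab_object(size_t page_size)
{
    return 2 * page_size;
}

/* Would the value survive being stored in a 16-bit field? */
static bool sketch_fits_u16(size_t v)
{
    return v <= UINT16_MAX;
}

/* Would the value survive being stored in an unsigned int? */
static bool sketch_fits_uint(size_t v)
{
    return v <= UINT_MAX;
}
```

With 4kB pages the limit is 8192 bytes, which fits in 16 bits. With 64K
pages it is 131072 bytes, which does not, and a 2M page size would give
4194304 bytes, which still fits comfortably in an unsigned int.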