Re: [PATCH] zsmalloc: fix a race with deferred_handles storing
From: Nhat Pham
Date: Tue Jan 31 2023 - 21:29:15 EST
On Tue, Jan 31, 2023 at 5:41 PM Sergey Senozhatsky
<senozhatsky@xxxxxxxxxxxx> wrote:
>
> On (23/01/10 15:17), Nhat Pham wrote:
> [..]
> > #ifdef CONFIG_ZPOOL
> > +static void restore_freelist(struct zs_pool *pool, struct size_class *class,
> > + struct zspage *zspage)
> > +{
> > + unsigned int obj_idx = 0;
> > + unsigned long handle, off = 0; /* off is within-page offset */
> > + struct page *page = get_first_page(zspage);
> > + struct link_free *prev_free = NULL;
> > + void *prev_page_vaddr = NULL;
> > +
> > + /* in case no free object found */
> > + set_freeobj(zspage, (unsigned int)(-1UL));
>
> I'm not following this. I see how -1UL works for link_free, but this
> cast of -1UL to 4 bytes looks suspicious.
(resending this since I forgot to forward it to the other recipients)
It is a bit convoluted indeed. But the idea is that for the last object,
the last link is given by:
link->next = -1UL << OBJ_TAG_BITS
And at malloc time, we update freeobj as follows:
set_freeobj(zspage, link->next >> OBJ_TAG_BITS);
Which means the freeobj value would be set to something like this:
(-1UL << OBJ_TAG_BITS) >> OBJ_TAG_BITS
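Concretely, on 64-bit (taking OBJ_TAG_BITS == 2 as an example here; the
low 32 bits come out the same for any tag width up to 32):

-1UL << 2        == 0xfffffffffffffffc  /* last object's link->next */
(-1UL << 2) >> 2 == 0x3fffffffffffffff  /* value handed to set_freeobj() */

and set_freeobj()'s unsigned int parameter truncates that to 0xffffffff.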
I want to emulate that here (i.e., in the case where no free object is
found).
As for the cast, I believe set_freeobj() takes an unsigned int as its
second argument.
Alternatively, to be 100% safe, we can do something like this:
(unsigned int)((-1UL << OBJ_TAG_BITS) >> OBJ_TAG_BITS)
But when I printed the two out, I got the same result as just
(unsigned int)(-1UL) - feel free to fact-check me on this, of course.
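For reference, a quick userspace sketch of that check (a standalone
test program, not kernel code; OBJ_TAG_BITS is hard-coded as an example
and a 64-bit unsigned long is assumed):

#include <stdio.h>

#define OBJ_TAG_BITS 2	/* example width; any value <= 32 behaves the same */

int main(void)
{
	/* what the last link->next round-trips to at malloc time */
	unsigned int a = (unsigned int)((-1UL << OBJ_TAG_BITS) >> OBJ_TAG_BITS);
	/* the shortcut used in restore_freelist() above */
	unsigned int b = (unsigned int)(-1UL);

	printf("%x %x\n", a, b);	/* both print ffffffff */
	return 0;
}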
Let me know what you think about this, or if you have a
cleaner/safer way to handle this edge case :)
>
> > + while (page) {
> > + void *vaddr = kmap_atomic(page);
> > + struct page *next_page;
> > +
> > + while (off < PAGE_SIZE) {
> > + void *obj_addr = vaddr + off;
> > +
> > + /* skip allocated object */
> > + if (obj_allocated(page, obj_addr, &handle)) {
> > + obj_idx++;
> > + off += class->size;
> > + continue;
> > + }
> > +
> > + /* free deferred handle from reclaim attempt */
> > + if (obj_stores_deferred_handle(page, obj_addr, &handle))
> > + cache_free_handle(pool, handle);
> > +
> > + if (prev_free)
> > + prev_free->next = obj_idx << OBJ_TAG_BITS;
> > + else /* first free object found */
> > + set_freeobj(zspage, obj_idx);
> > +
> > + prev_free = (struct link_free *)vaddr + off / sizeof(*prev_free);
> > + /* if last free object in a previous page, need to unmap */
> > + if (prev_page_vaddr) {
> > + kunmap_atomic(prev_page_vaddr);
> > + prev_page_vaddr = NULL;
> > + }
> > +
> > + obj_idx++;
> > + off += class->size;
> > + }
> > +
> > + /*
> > + * Handle the last (full or partial) object on this page.
> > + */
> > + next_page = get_next_page(page);
> > + if (next_page) {
> > + if (!prev_free || prev_page_vaddr) {
> > + /*
> > + * There is no free object in this page, so we can safely
> > + * unmap it.
> > + */
> > + kunmap_atomic(vaddr);
> > + } else {
> > + /* update prev_page_vaddr since prev_free is on this page */
> > + prev_page_vaddr = vaddr;
> > + }
>
> A polite and gentle nit: I'd appreciate it if we honored kernel coding
> styles in zsmalloc a little bit more. Comments, function declarations, etc.
> I'm personally very happy with https://github.com/vivien/vim-linux-coding-style