Re: [PATCH] zsmalloc: fix linking bug in init_zspage

From: Sergey Senozhatsky
Date: Mon Aug 13 2018 - 06:52:34 EST


On (08/13/18 15:05), Minchan Kim wrote:
> > From: zhouxianrong <zhouxianrong@xxxxxxxxxx>
> >
> > The last partial object in last subpage of zspage should not be linked
> > in allocation list. Otherwise it could trigger BUG_ON explicitly at
> > function zs_map_object. But it happened rarely.
>
> Could you be more specific? What case did you see the problem?
> Is it a real problem or one founded by review?
[..]
> > Signed-off-by: zhouxianrong <zhouxianrong@xxxxxxxxxx>
> > ---
> > mm/zsmalloc.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index 8d87e973a4f5..24dd8da0aa59 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -1040,6 +1040,8 @@ static void init_zspage(struct size_class *class, struct zspage *zspage)
> > * Reset OBJ_TAG_BITS bit to last link to tell
> > * whether it's allocated object or not.
> > */
> > + if (off > PAGE_SIZE)
> > + link -= class->size / sizeof(*link);
> > link->next = -1UL << OBJ_TAG_BITS;
> > }
> > kunmap_atomic(vaddr);

Hmm. This can be a real issue. Unless I'm missing something.

So... I might be wrong, but the way I see the bug report is:

When we link objects during zspage init, we do the following:

while ((off += class->size) < PAGE_SIZE) {
link->next = freeobj++ << OBJ_TAG_BITS;
link += class->size / sizeof(*link);
}

Note that we increment the link first, link += class->size / sizeof(*link),
and check for the offset only afterwards. So by the time we break out of
the while-loop the link *might* point to the partial object which starts at
the last page of zspage, but *never* ends, because we don't have next_page
in current zspage. So that's why that object should not be linked in,
because it's not a valid allocates object - we simply don't have space
for it anymore.

zspage [ page 1 ][ page 2 ]
...............................link
[..###]

therefore the last object must be "link - 1" for such cases.

I think, the following change can also do the trick:

while ((off + class->size) < PAGE_SIZE) {
link->next = freeobj++ << OBJ_TAG_BITS;
link += class->size / sizeof(*link);
off += class->size;
}

Once again, I might be wrong on this.
Any thoughts?

-ss