Re: [PATCH] zsmalloc: fix migrate_zspage-zs_free race condition

From: Russell Knize
Date: Wed Jan 20 2016 - 10:21:48 EST


Yes, I saw your v5 and have already started testing it. I suspect it
will be stable, as the key for us was to set that bit before the
store. We were only seeing it on ARM32, but those platforms tend to
perform compaction far more often due to memory pressure. We
don't see it at all anymore.

Honestly, at first I didn't think setting the bit would help that much
as I assumed it was the barrier in the clear_bit_unlock() that
mattered. Then I saw the same sort of race happening in the page
migration stuff I've been working on. I had done the same type of
"optimization" there and in fact did not call unpin_tag() at all after
updating the object handles with the bit dropped.
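
For anyone skimming the thread, here is a rough userspace sketch of the
ordering being discussed. It is not the zsmalloc code: it uses C11
atomics instead of the kernel bitops, assumes the pin bit is bit 0 of
the handle word, and trypin_tag(), record_obj_unpinned() and
record_obj_pinned() are names made up for illustration only.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption for this sketch: bit 0 of the handle word is the pin bit. */
#define HANDLE_PIN_BIT 0UL
#define PIN_MASK (1UL << HANDLE_PIN_BIT)

typedef _Atomic unsigned long handle_t;

/* Try to take the pin bit; returns true if we now own it. */
static bool trypin_tag(handle_t *h)
{
    unsigned long old =
        atomic_fetch_or_explicit(h, PIN_MASK, memory_order_acquire);
    return !(old & PIN_MASK);
}

/* Drop the pin bit with release semantics. */
static void unpin_tag(handle_t *h)
{
    atomic_fetch_and_explicit(h, ~PIN_MASK, memory_order_release);
}

/*
 * The "optimization": store the new object value with the bit dropped.
 * The store itself releases the pin, so another path can take it
 * before the migrating side has finished.
 */
static void record_obj_unpinned(handle_t *h, unsigned long obj)
{
    atomic_store_explicit(h, obj & ~PIN_MASK, memory_order_relaxed);
}

/*
 * Set the bit before the store: the handle stays pinned across the
 * update and is only released by the explicit unpin_tag().
 */
static void record_obj_pinned(handle_t *h, unsigned long obj)
{
    atomic_store_explicit(h, obj | PIN_MASK, memory_order_relaxed);
}
```

With the unpinned variant, a second trypin_tag() succeeds right after
the store even though the first owner never called unpin_tag(). With
the pinned variant it keeps failing until unpin_tag() runs.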

Russ

On Wed, Jan 20, 2016 at 1:00 AM, Minchan Kim <minchan@xxxxxxxxxx> wrote:
> Hello Russ,
>
> On Tue, Jan 19, 2016 at 09:47:12AM -0600, Russell Knize wrote:
>> Just wanted to ack this, as we have been seeing the same problem (weird
>> race conditions during compaction) and fixed it in the same way a few
>> weeks ago (resetting the pin bit before recording the obj).
>> Russ
>
> First of all, thanks for your comment.
>
> The patch you tested has a problem, although it's really subtle (i.e.,
> it doesn't do store tearing when I disassemble ARM{32|64}), but it
> could potentially be a problem for other architectures or future ARM.
> For the right fix, I sent v5 - https://lkml.org/lkml/2016/1/18/263.
> If you can prove it fixes your problem, please add your Tested-by to
> the thread.
> It's really valuable to do testing for stable material.
>
> Thanks!