Re: [BUG REPORT] ZSWAP: theoretical race condition issues

From: Minchan Kim
Date: Fri Sep 27 2013 - 00:58:33 EST


Hello Weijie,

On Thu, Sep 26, 2013 at 04:48:03PM +0800, Weijie Yang wrote:
> On Thu, Sep 26, 2013 at 3:57 PM, Minchan Kim <minchan@xxxxxxxxxx> wrote:
> > On Thu, Sep 26, 2013 at 03:26:33PM +0800, Weijie Yang wrote:
> >> On Thu, Sep 26, 2013 at 1:58 PM, Minchan Kim <minchan@xxxxxxxxxx> wrote:
> >> > Hello Weijie,
> >> >
> >> > On Wed, Sep 25, 2013 at 05:33:43PM +0800, Weijie Yang wrote:
> >> >> On Wed, Sep 25, 2013 at 4:31 PM, Bob Liu <lliubbo@xxxxxxxxx> wrote:
> >> >> > On Wed, Sep 25, 2013 at 4:09 PM, Weijie Yang <weijie.yang.kh@xxxxxxxxx> wrote:
> >> >> >> I think I found a new issue; to keep this mail thread intact, I am
> >> >> >> replying to this mail.
> >> >> >>
> >> >> >> It is also a concurrency issue, which occurs when a duplicate store and
> >> >> >> a reclaim run concurrently.
> >> >> >>
> >> >> >> A zswap entry x with offset A is already stored in the zswap backend.
> >> >> >> Consider the following scenario:
> >> >> >>
> >> >> >> thread 0: reclaim entry x (gets the refcount, but has not yet called
> >> >> >> zswap_get_swap_cache_page)
> >> >> >>
> >> >> >> thread 1: store a new page with the same offset A, allocating a new zswap
> >> >> >> entry y. The store finishes; shrink_page_list() calls __remove_mapping(),
> >> >> >> and now the page is not in the swap cache
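> >> >> >>
> >> >> >> A rough sketch of the interleaving (function names as I remember them
> >> >> >> from mm/zswap.c, simplified):
> >> >> >>
> >> >> >>   thread 0 (reclaim)              thread 1 (duplicate store)
> >> >> >>   ------------------              --------------------------
> >> >> >>   zswap_writeback_entry()
> >> >> >>     find entry x at offset A
> >> >> >>     zswap_entry_get(x)
> >> >> >>                                   zswap_frontswap_store()
> >> >> >>                                     alloc entry y, insert at A
> >> >> >>                                     (old entry x dropped from tree)
> >> >> >>                                   shrink_page_list()
> >> >> >>                                     __remove_mapping()
> >> >> >>                                     page leaves the swap cache
> >> >> >>   zswap_get_swap_cache_page(A)
> >> >> >>     but offset A now belongs to entry y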
> >> >> >>
> >> >> >
> >> >> > But I don't think the swap layer will call zswap with the same offset A.
> >> >>
> >> >> 1. store a page at offset A in zswap
> >> >> 2. some time later, a page fault occurs and the page data is loaded from
> >> >> zswap. But notice that zswap entry x is still in zswap, because
> >> >> frontswap_tmem_exclusive_gets is not enabled.
> >> >
> >> > frontswap_tmem_exclusive_gets_enabled is just an option that trades off
> >> > CPU burning from frequent swapout against the memory footprint of keeping
> >> > a duplicate copy in both the swap cache and the frontswap backend, so it
> >> > shouldn't affect stability.
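> >> >
> >> > For reference, the load side does something like this when the option is
> >> > on (simplified from mm/frontswap.c, from memory, so not verbatim):
> >> >
> >> > int __frontswap_load(struct page *page)
> >> > {
> >> >         swp_entry_t entry = { .val = page_private(page), };
> >> >         int type = swp_type(entry);
> >> >         pgoff_t offset = swp_offset(entry);
> >> >         struct swap_info_struct *sis = swap_info[type];
> >> >         int ret;
> >> >
> >> >         ret = frontswap_ops->load(type, offset, page);
> >> >         if (ret == 0 && frontswap_tmem_exclusive_gets_enabled) {
> >> >                 /*
> >> >                  * Exclusive get: drop the backend copy and keep only
> >> >                  * the (now dirty) page in the swap cache.
> >> >                  */
> >> >                 SetPageDirty(page);
> >> >                 __frontswap_clear(sis, offset);
> >> >         }
> >> >         return ret;
> >> > }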
> >>
> >> Thanks for explaining this.
> >> I don't mean to say this option affects stability, but that zswap only
> >> implements one option. Maybe it's better to implement both options for
> >> different workloads.
> >
> > "zswap only relize one option"
> > What does it mena? Sorry. I couldn't parse your intention. :)
> > You mean zswap should do something special to support frontswap_tmem_exclusive_gets?
>
> Yes. But I am not sure whether it is worth it.
>
> >>
> >> >> this page has PageSwapCache(page) set and page_private(page) == entry.val
> >> >> 3. the page data is changed, and it becomes dirty
> >> >
> >> > If a non-shared swapin page is redirtied, the page should be removed from
> >> > the swap cache. If a shared swapin page is redirtied, it should be CoWed,
> >> > so it's a new page that doesn't live in the swap cache. That means it gets
> >> > a new offset, different from the old one, at swap out.
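> >> >
> >> > That removal happens via reuse_swap_page() at write-fault time; roughly
> >> > (simplified from mm/swapfile.c, from memory, so not verbatim):
> >> >
> >> > int reuse_swap_page(struct page *page)
> >> > {
> >> >         int count = page_mapcount(page);
> >> >
> >> >         if (count <= 1 && PageSwapCache(page)) {
> >> >                 count += page_swapcount(page);
> >> >                 if (count == 1 && !PageWriteback(page)) {
> >> >                         /*
> >> >                          * Sole owner: drop the old swap slot together
> >> >                          * with the swap cache entry, so a later swapout
> >> >                          * allocates a fresh offset.
> >> >                          */
> >> >                         delete_from_swap_cache(page);
> >> >                         SetPageDirty(page);
> >> >                 }
> >> >         }
> >> >         return count <= 1;      /* shared page: caller does CoW instead */
> >> > }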
> >> >
> >> > What's wrong with that?
> >>
> >> It is really not a valid scenario for a duplicate store, and I cannot think
> >> of one. If a duplicate store is impossible, how about deleting the handling
> >> code in zswap? If it does exist, I think there is a potential issue, as I
> >> described.
> >
> > You mean "zswap_duplicate_entry"?
> > AFAIR, I already asked Seth about it when zswap was born but, IIRC,
> > he said that he didn't know the exact reason; he just saw that case during
> > an experiment, so he copied the code piece from zcache.
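> >
> > For context, the handling in zswap_frontswap_store() looks roughly like
> > this (from memory, so not verbatim):
> >
> >         /*
> >          * Map: if an entry already exists at this offset, it's a
> >          * duplicate store; drop the old entry and retry the insert.
> >          */
> >         spin_lock(&tree->lock);
> >         do {
> >                 ret = zswap_rb_insert(&tree->rbroot, entry, &dupentry);
> >                 if (ret == -EEXIST) {
> >                         zswap_duplicate_entry++;
> >                         rb_erase(&dupentry->rbnode, &tree->rbroot);
> >                         if (!zswap_entry_put(dupentry))
> >                                 zswap_free_entry(tree, dupentry);
> >                 }
> >         } while (ret == -EEXIST);
> >         spin_unlock(&tree->lock);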
> >
> > Do you see the case, too?
>
> Yes, I mean duplicate store.
> I checked Documentation/vm/frontswap.txt; it mentions "duplicate stores",
> but I am still confused.

It seems that there are two Minchans on LKML.
The other Minchan, not me with my horrible memory, was the first to figure
it out a few months ago:

https://lkml.org/lkml/2013/1/31/3

/me slaps self.
I'd like to look into that issue more, but I don't have time right now.
Just FYI. ;-)

--
Kind regards,
Minchan Kim