Re: [PATCH v5 3/4] zram: zram memory size limitation

From: Joonsoo Kim
Date: Tue Aug 26 2014 - 21:26:00 EST


Hello, Minchan and David.

On Tue, Aug 26, 2014 at 08:22:29AM -0400, David Horner wrote:
> On Tue, Aug 26, 2014 at 3:55 AM, Minchan Kim <minchan@xxxxxxxxxx> wrote:
> > Hey Joonsoo,
> >
> > On Tue, Aug 26, 2014 at 04:37:30PM +0900, Joonsoo Kim wrote:
> >> On Mon, Aug 25, 2014 at 09:05:55AM +0900, Minchan Kim wrote:
> >> > @@ -513,6 +540,14 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
> >> > ret = -ENOMEM;
> >> > goto out;
> >> > }
> >> > +
> >> > + if (zram->limit_pages &&
> >> > + zs_get_total_pages(meta->mem_pool) > zram->limit_pages) {
> >> > + zs_free(meta->mem_pool, handle);
> >> > + ret = -ENOMEM;
> >> > + goto out;
> >> > + }
> >> > +
> >> > cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
> >>
> >> Hello,
> >>
> >> I don't follow up previous discussion, so I could be wrong.
> >> Why this enforcement should be here?
> >>
> >> I think that this has two problems.
> >> 1) alloc/free happens unnecessarilly if we have used memory over the
> >> limitation.
> >
> > True but firstly, I implemented the logic in zsmalloc, not zram but
> > as I described in cover-letter, it's not a requirement of zsmalloc
> > but zram so it should be in there. If every user want it in future,
> > then we could move the function into zsmalloc. That's what we
> > concluded in previous discussion.

Hmm...
Problem is that we can't avoid these unnecessary overhead in this
implementation. If we can implement this feature in zram efficiently,
it's okay. But, I think that current form isn't.

> >
> > Another idea is we could call zs_get_total_pages right before zs_malloc
> > but the problem is we cannot know how many of pages are allocated
> > by zsmalloc in advance.
> > IOW, zram should be blind on zsmalloc's internal.
> >
>
> We did however suggest that we could check before hand to see if
> max was already exceeded as an optimization.
> (possibly with a guess on usage but at least using the minimum of 1 page)
> In the contested case, the max may already be exceeded transiently and
> therefore we know this one _could_ fail (it could also pass, but odds
> aren't good).
> As Minchan mentions this was discussed before - but not into great detail.
> Testing should be done to determine possible benefit. And as he also
> mentions, the better place for it may be in zsmalloc, but that
> requires an ABI change.

Why we hesitate to change zsmalloc API? It is in-kernel API and there
are just two users now, zswap and zram. We can change it easily.
I think that we just need following simple API change in zsmalloc.c.

zs_zpool_create(gfp_t gfp, struct zpool_ops *zpool_op)
=>
zs_zpool_create(unsigned long limit, gfp_t gfp, struct zpool_ops
*zpool_op)

It's pool allocator so there is no obstacle for us to limit maximum
memory usage in zsmalloc. It's a natural idea to limit memory usage
for pool allocator.

> Certainly a detailed suggestion could happen on this thread and I'm
> also interested
> in your thoughts, but this patchset should be able to go in as is.
> Memory exhaustion avoidance probably trumps the possible thrashing at
> threshold.
>
> > About alloc/free cost once if it is over the limit,
> > I don't think it's important to consider.
> > Do you have any scenario in your mind to consider alloc/free cost
> > when the limit is over?
> >
> >> 2) Even if this request doesn't do new allocation, it could be failed
> >> due to other's allocation. There is time gap between allocation and
> >> free, so legimate user who want to use preallocated zsmalloc memory
> >> could also see this condition true and then he will be failed.
> >
> > Yeb, we already discussed that. :)
> > Such false positive shouldn't be a severe problem if we can keep a
> > promise that zram user cannot exceed mem_limit.
> >

If we can keep such a promise, why we need to limit memory usage?
I guess that this limit feature is useful for user who can't keep such promise.
So, we should assume that this false positive happens frequently.

> And we cannot avoid the race, nor can we avoid in a low overhead competitive
> concurrent process transient inconsistent states.
> Different views for different observers.
> They are a consequence of the theory of "Special Computational Relativity".
> I am working on a String Unification Theory of Quantum and General CR in LISP.
> ;-)

If we move limit logic to zsmalloc, we can avoid the race by commiting
needed memory size before actual allocation attempt. This commiting makes
concurrent process serialized so there is no race here. There is
possibilty to fail to allocate, but I think this is better than alloc
and free blindlessly depending on inconsistent states.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/