Re: [PATCH 2/3] zram: support page-based parallel write

From: Minchan Kim
Date: Mon Oct 24 2016 - 01:58:31 EST


On Mon, Oct 24, 2016 at 02:20:44PM +0900, Sergey Senozhatsky wrote:
> Hi Minchan,
>
> On (10/24/16 13:47), Minchan Kim wrote:
> > Hi Sergey,
> >
> > > > +static void zram_unplug(struct blk_plug_cb *cb, bool from_schedule)
> > > > +{
> > > > +        spin_lock(&workers.req_lock);
> > > > +        if (workers.nr_req)
> > > > +                worker_wake_up();
> > > > +        spin_unlock(&workers.req_lock);
> > > > +        kfree(cb);
> > > > +}
> > > > +
> > > > +static int zram_check_plugged(void)
> > > > +{
> > > > +        return !!blk_check_plugged(zram_unplug, NULL,
> > > > +                                   sizeof(struct blk_plug_cb));
> > > > +}
> > >
> > > I'm having some trouble understanding the purpose of zram_check_plugged().
> > > it's a global symbol, can you just use it directly? otherwise we are
> > > doing additional kmalloc/kfree, spin_lock/unlock and so on.
> >
> > I don't understand it. Why would using zram_check_plugged directly reduce
> > the things you mentioned?
> > >
> > > what am I missing? current->plug? can it affect us? how?
> >
> > Sorry. I can't understand your point.
>
> I meant that every blk_check_plugged() is
>
> struct blk_plug_cb *blk_check_plugged(blk_plug_cb_fn unplug, void *data,
>                                       int size)
> {
>         struct blk_plug *plug = current->plug;
>         struct blk_plug_cb *cb;
>
>         if (!plug)
>                 return NULL;
>
>         list_for_each_entry(cb, &plug->cb_list, list)
>                 if (cb->callback == unplug && cb->data == data)
>                         return cb;

Normally this routine checks and bails out if the callback has already
been plugged, so there shouldn't be too many allocations there.

Having said that, there is no need to allocate the cb in the block layer.
The driver could allocate it once and reuse it by passing it to
blk_check_plugged. I was tempted to introduce such an API into the block
layer, but it's an easy optimization we can do once this patchset settles
down, so I didn't include it in this patchset.
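
Just to illustrate, a rough sketch (this helper does not exist in the
block layer; the name and signature are made up): the caller passes in
a cb it owns and the helper only links it, mirroring blk_check_plugged()
minus the kzalloc:

#include <linux/blkdev.h>
#include <linux/list.h>
#include <linux/sched.h>

/* hypothetical: attach a pre-allocated callback to the current plug */
static bool blk_check_plugged_prealloc(struct blk_plug_cb *cb,
                                       blk_plug_cb_fn unplug, void *data)
{
        struct blk_plug *plug = current->plug;
        struct blk_plug_cb *pos;

        if (!plug)
                return false;

        list_for_each_entry(pos, &plug->cb_list, list)
                if (pos->callback == unplug && pos->data == data)
                        return true;    /* already on the callback list */

        cb->data = data;
        cb->callback = unplug;
        list_add(&cb->list, &plug->cb_list);
        return true;
}

With something like that, zram_unplug() would simply drop the kfree()
since the cb is owned by the driver.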

>
>         /* Not currently on the callback list */
>         BUG_ON(size < sizeof(*cb));
>         cb = kzalloc(size, GFP_ATOMIC);
>         if (cb) {
>                 cb->data = data;
>                 cb->callback = unplug;
>                 list_add(&cb->list, &plug->cb_list);
>         }
>         return cb;
> }
>
> which is extra kzalloc/kfree/etc. do we really need to do it all the time?
> thus my question -- what am I missing (aka educate me)?
>
> > > hm... no real objection, but exporting this sysfs attr can be very hacky
> > > and difficult for people...
> >
> > We have been using sysfs to tune zram for a long time.
> > Please suggest ideas if you have better ones. :)
>
> yeah, but this one feels like a super-hacky knob. basically
>
> "enable when you can't tweak your usage patterns. this will tweak the driver".
>
> so I'd probably prefer to keep it hidden for now (maybe eventually
> we will come to some "out-of-zram" solution, but the opposition may
> be "fix your usage pattern").

Frankly speaking, I tend to agree.

As I mentioned in the cover letter (or somewhere), I don't want to add
this knob. One option is to admit it's a trade-off: if someone enables
this config, they lose random/direct IO performance for now but get a
big benefit for buffered sequential read/write.
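
To show what I mean, a sketch only (zram_submit() and zram_queue_bio()
are made-up names here; __zram_make_request() is the existing
synchronous path): the decision is made at build time and the use_aio
knob goes away.

static void zram_submit(struct zram *zram, struct bio *bio)
{
#ifdef CONFIG_ZRAM_ASYNC_IO
        /*
         * Queue to zram workers: buffered sequential IO wins,
         * random/direct IO pays extra latency.
         */
        zram_queue_bio(zram, bio);
#else
        /* old behaviour: compress in the caller's context */
        __zram_make_request(zram, bio);
#endif
}
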
What do you think?

>
> besides, you make this sysfs attr .config dependent
>
> > +#ifdef CONFIG_ZRAM_ASYNC_IO
> > +static DEVICE_ATTR_RW(use_aio);
> > +#endif
> >
> > static struct attribute *zram_disk_attrs[] = {
> >         &dev_attr_disksize.attr,
> > @@ -1231,6 +1666,9 @@ static struct attribute *zram_disk_attrs[] = {
> >         &dev_attr_mem_used_max.attr,
> >         &dev_attr_max_comp_streams.attr,
> >         &dev_attr_comp_algorithm.attr,
> > +#ifdef CONFIG_ZRAM_ASYNC_IO
> > +        &dev_attr_use_aio.attr,
> > +#endif
>
> so this knob is not even guaranteed to be there all the time.
>
> I wish I could suggest any sound alternative, but I don't have one
> at the moment. May be I'll have a chance to speak to block-dev people
> next week.

Okay. But I think it's not a good idea to hurt the wb context, as you
mentioned. IOW, IO queuing can be parallelized across multiple wb
contexts, but servicing (i.e., compression) should be done in zram
contexts, not in the wb context.
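
To make the split concrete, roughly what I have in mind (a loose sketch:
workers.*, worker_wake_up() and zram_check_plugged() are from the patch,
while struct zram_request, workers.req_list, zram_dequeue_req() and
zram_compress_and_store() are just placeholders):

/* submitter's (e.g. wb) context: queue only, no compression here */
static void zram_queue_req(struct zram *zram, struct zram_request *req)
{
        spin_lock(&workers.req_lock);
        list_add_tail(&req->list, &workers.req_list);
        workers.nr_req++;
        spin_unlock(&workers.req_lock);

        /* if plugged, the wake-up is deferred to zram_unplug() */
        if (!zram_check_plugged())
                worker_wake_up();
}

/* dedicated zram worker thread: does the heavy lifting */
static int zram_worker_fn(void *data)
{
        while (!kthread_should_stop()) {
                struct zram_request *req;

                set_current_state(TASK_INTERRUPTIBLE);
                req = zram_dequeue_req();
                if (!req) {
                        schedule();     /* woken by worker_wake_up() */
                        continue;
                }
                __set_current_state(TASK_RUNNING);
                zram_compress_and_store(req);
        }
        return 0;
}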

Thanks.