Re: [PATCH 2/2] f2fs: support data compression

From: Eric Biggers
Date: Tue Oct 29 2019 - 22:55:17 EST


On Tue, Oct 29, 2019 at 04:33:36PM +0800, Chao Yu wrote:
> On 2019/10/28 6:50, Eric Biggers wrote:
> >> +bool f2fs_is_compressed_page(struct page *page)
> >> +{
> >> +        if (!page_private(page))
> >> +                return false;
> >> +        if (IS_ATOMIC_WRITTEN_PAGE(page) || IS_DUMMY_WRITTEN_PAGE(page))
> >> +                return false;
> >> +        return *((u32 *)page_private(page)) == F2FS_COMPRESSED_PAGE_MAGIC;
> >> +}
> >
> > This code implies that there can be multiple page private structures each of
> > which has a different magic number. But I only see F2FS_COMPRESSED_PAGE_MAGIC.
> > Where in the code is the other one(s)?
>
> I'm not sure I understood you correctly. Did you mean we need to introduce
> f2fs_is_atomic_written_page() and f2fs_is_dummy_written_page(), like
> f2fs_is_compressed_page()?
>

No, I'm asking: in what case does the line

        *((u32 *)page_private(page)) == F2FS_COMPRESSED_PAGE_MAGIC

evaluate to false?
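
If compressed pages are the only ones that store an actual pointer in
page_private (the atomic/dummy cases store sentinel values), the comparison
can never be false after the first two checks, and it could be reduced to a
sanity check, e.g. (untested, just to illustrate):

bool f2fs_is_compressed_page(struct page *page)
{
        if (!page_private(page))
                return false;
        if (IS_ATOMIC_WRITTEN_PAGE(page) || IS_DUMMY_WRITTEN_PAGE(page))
                return false;
        /* Assumes nothing else stores a pointer in page_private. */
        WARN_ON_ONCE(*(u32 *)page_private(page) != F2FS_COMPRESSED_PAGE_MAGIC);
        return true;
}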

> >
> >> +
> >> +static void f2fs_set_compressed_page(struct page *page,
> >> +                struct inode *inode, pgoff_t index, void *data, refcount_t *r)
> >> +{
> >> +        SetPagePrivate(page);
> >> +        set_page_private(page, (unsigned long)data);
> >> +
> >> +        /* i_crypto_info and iv index */
> >> +        page->index = index;
> >> +        page->mapping = inode->i_mapping;
> >> +        if (r)
> >> +                refcount_inc(r);
> >> +}
> >
> > It isn't really appropriate to create fake pagecache pages like this. Did you
> > consider changing f2fs to use fscrypt_decrypt_block_inplace() instead?
>
> We need to store the i_crypto_info and the IV index somewhere in order to
> pass them to fscrypt_decrypt_block_inplace(). Where did you suggest we
> store them?
>

The same place where the pages are stored.
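
I.e., keep the inode and the starting logical block number in the
decompression context, next to the compressed pages themselves, and decrypt
each block in place. Roughly (untested; the struct and field names here are
guesses modeled on your patch, not an actual API):

static int f2fs_decrypt_compressed_cluster(struct decompress_io_ctx *dic)
{
        int i;

        for (i = 0; i < dic->nr_cpages; i++) {
                int err = fscrypt_decrypt_block_inplace(dic->inode,
                                dic->cpages[i], PAGE_SIZE, 0,
                                dic->cluster_idx + i);
                if (err)
                        return err;
        }
        return 0;
}

Then there's no need to fake up page->mapping and page->index on pages that
were never in the pagecache.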

> >> +
> >> +void f2fs_destroy_compress_ctx(struct compress_ctx *cc)
> >> +{
> >> +        kvfree(cc->rpages);
> >> +}
> >
> > The memory is allocated with kzalloc(), so why is it freed with kvfree() and not
> > just kfree()?
>
> It was allocated by f2fs_*alloc(), which falls back to kvmalloc() if
> kmalloc() fails.

This seems to be a bug in f2fs_kmalloc() -- it inappropriately falls back to
kvmalloc(). As per its name, it should only use kmalloc(). f2fs_kvmalloc()
already exists, so it can be used when the fallback is wanted.
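
I.e., something like this (sketch; the real helpers also contain the
fault-injection hook, omitted here):

static inline void *f2fs_kmalloc(struct f2fs_sb_info *sbi,
                                 size_t size, gfp_t flags)
{
        return kmalloc(size, flags);
}

static inline void *f2fs_kvmalloc(struct f2fs_sb_info *sbi,
                                  size_t size, gfp_t flags)
{
        return kvmalloc(size, flags);
}

Then memory from f2fs_kmalloc() is paired with kfree(), and only memory from
f2fs_kvmalloc() needs kvfree().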

>
> >> +static int lzo_compress_pages(struct compress_ctx *cc)
> >> +{
> >> +        int ret;
> >> +
> >> +        ret = lzo1x_1_compress(cc->rbuf, cc->rlen, cc->cbuf->cdata,
> >> +                                &cc->clen, cc->private);
> >> +        if (ret != LZO_E_OK) {
> >> +                printk_ratelimited("%sF2FS-fs: lzo compress failed, ret:%d\n",
> >> +                                KERN_ERR, ret);
> >> +                return -EIO;
> >> +        }
> >> +        return 0;
> >> +}
> >
> > Why not use f2fs_err()? Same in lots of other places.
>
> We use printk_ratelimited() at points where we can afford to lose logs;
> otherwise we use f2fs_{err,warn,...}() to record as much info as possible
> for troubleshooting.
>

It used to be the case that f2fs_msg() was ratelimited. What stops it from
spamming the logs now?

The problem with a bare printk is that it doesn't show which filesystem instance
the message is coming from.
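
If you really need the ratelimiting, you can still identify the instance,
e.g. (assuming the compress context keeps cc->inode, as in your patch):

        printk_ratelimited(KERN_ERR "F2FS-fs (%s): lzo compress failed, ret:%d\n",
                           F2FS_I_SB(cc->inode)->sb->s_id, ret);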

> >> +
> >> +        ret = cops->compress_pages(cc);
> >> +        if (ret)
> >> +                goto out_vunmap_cbuf;
> >> +
> >> +        max_len = PAGE_SIZE * (cc->cluster_size - 1) - COMPRESS_HEADER_SIZE;
> >> +
> >> +        if (cc->clen > max_len) {
> >> +                ret = -EAGAIN;
> >> +                goto out_vunmap_cbuf;
> >> +        }
> >
> > Since we already know the max length we're willing to compress to (the max
> > length for any space to be saved), why is more space than that being allocated?
> > LZ4_compress_default() will return an error if there isn't enough space, so that
> > error could just be used as the indication to store the data uncompressed.
>
> AFAIK, there is no common error code returned by all compression algorithms
> that indicates there is no room within the limited target size, but we need
> that information to fall back to writing raw pages. Any better idea?
>

"Not enough room" is the only reasonable way for compression to fail, so all
that's needed is the ability for compression to report errors at all. What
specifically prevents this approach from working?
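
For example, with LZ4 it could look like this (sketch; names follow your
patch, and it assumes cc->clen is initialized to the max_len bound before
the call):

static int lz4_compress_pages(struct compress_ctx *cc)
{
        int len;

        len = LZ4_compress_default(cc->rbuf, cc->cbuf->cdata, cc->rlen,
                                   cc->clen, cc->private);
        if (!len)
                return -EAGAIN; /* output didn't fit; write the cluster raw */
        cc->clen = len;
        return 0;
}

LZ4_compress_default() returns 0 whenever the output won't fit in the given
maxOutputSize, so no separate "clen > max_len" check is needed afterwards.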

> >>  static void bio_post_read_processing(struct bio_post_read_ctx *ctx)
> >>  {
> >> -        /*
> >> -         * We use different work queues for decryption and for verity because
> >> -         * verity may require reading metadata pages that need decryption, and
> >> -         * we shouldn't recurse to the same workqueue.
> >> -         */
> >
> > Why is it okay (i.e., no deadlocks) to no longer use different work queues for
> > decryption and for verity? See the comment above which is being deleted.
>
> Could you explain more about how the deadlock happens? Or share a link if
> you have described that case somewhere?
>

The verity work can read pages from the file that require decryption. I'm
concerned that it could deadlock if that work is scheduled on the same
workqueue. Granted, I'm not an expert in Linux workqueues, so if you've
investigated this and determined that it's safe, can you explain why?
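
To spell out the scenario I'm worried about: the verity work, while running
on the workqueue, issues reads of uncached Merkle tree pages; on an encrypted
file, completing those reads queues decryption work. If that decryption work
lands on the same workqueue the verity work occupies, and all of that queue's
workers are blocked in verity works, nothing can make progress. The old code
avoided this by dispatching to two different queues, roughly (sketch from
memory of the pre-patch logic, not your patched code):

        if (ctx->enabled_steps & (1 << STEP_DECRYPT)) {
                /* fscrypt's own workqueue */
                fscrypt_enqueue_decrypt_work(&ctx->work);
                return;
        }
        if (ctx->enabled_steps & (1 << STEP_VERITY)) {
                /* fsverity's workqueue, distinct from fscrypt's */
                fsverity_enqueue_verify_work(&ctx->work);
                return;
        }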

- Eric