Re: [PATCH v2] erofs: use kmap_local_page() only for erofs_bread()
From: Fabio M. De Francesco
Date: Wed Oct 19 2022 - 14:17:12 EST
On Wednesday, October 19, 2022 1:36:55 AM CEST Gao Xiang wrote:
> On Wed, Oct 19, 2022 at 01:21:27AM +0200, Fabio M. De Francesco wrote:
> > On Tuesday, October 18, 2022 11:29:21 PM CEST Gao Xiang wrote:
>
> ...
>
> >
> > > One of what I need to care is nested kmap() usage,
> > > some unmap/remap order cannot be simply converted to kmap_local()
> >
> > Correct about nesting. If we map A and then B, we must unmap B and then A.
> >
> > However, as you seem to convey, not always unmappings in right order
(stack
> > based) is possible, sometimes because very long functions have many loop's
> > breaks and many goto exit labels.
> >
> > > but I think
> > > it's not the case for erofs_bread(). Actually EROFS has one of such
nested
> > > kmap() usage, but I don't really care its performance on HIGHMEM
platforms,
> > > so I think kmap() is still somewhat useful compared to kmap_local() from
> > this
> > > point of view],
> >
fs/erofs conversions are in our (Ira's and my) list. So I'm am happy to see
that we can delete some entries because of your changes. :-)
> > In Btrfs I solved (thanks to David S.' advice) by mapping only one of two
> > pages, only the one coming from the page cache.
> >
> > The other page didn't need the use of kmap_local_page() because it was
> > allocated in the filesystem with "alloc_page(GFP_NOFS)". GFP_NOFS won't
ever
> > allocate from ZONE_HIGHMEM, therefore a direct page_address() could avoid
the
> > mapping and the nesting issues.
> >
> > Did you check if you may solve the same way?
>
> That is not simple. Currently we have compressed pages and decompressed
> pages (page cache or others), and they can be unmapped when either data
> is all consumed, so compressed pages can be unmapped first, or
> decompressed pages can be unmapped first. That quite depends on which
> pages goes first.
>
> I think such usage is a quite common pattern for decoder or encoder,
> you could take a look at z_erofs_lzma_decompress() in
> fs/erofs/decompressor_lzma.c.
I haven't yet read that code, however I may attempt to propose a pattern for
solving this kinds of issue, I mean where you don't know which page got mapped
last...
It's not elegant but it may work. You have compressed and decompressed pages
and you can't know in advance what page should be unmapped first because you
can't know in which order they where mapped, right?
I'd use a variable to save two different values, each representing the last
page mapped. When the code gets to the unmapping block (perhaps in an "out"
label), just check what that variable contains. Depending on that value, say
'c' or 'd', you will be able to know what must be unmapped for first. An "if /
else" can do the work.
What do you think of this?
> So kmap() is still useful for such cases
> since I don't really care the HIGHMEM performance but correctness, but
> other alternative could churn/complex the map/unmap/remap pattern.
Sooner or later someone will have to address this issue and remove those
kmap() call sites. We are working on this and hope to always figure out a way
to work it out.
I hope that what I wrote above may help, although I'm writing on a purely
theoretically bases, since, as said, I haven't yet seen that code.
If due to my poor English I've not been able to convey my thoughts please let
me know, so that I'll try to reword.
Thanks,
Fabio
> Thanks,
> Gao Xiang
>
> >
> > A little group of people are working to remove all kmap() and
kmap_atomic() we
> > meet across the whole kernel. I have not yet encountered conversions which
> > cannot be made. Sometimes we may refactor, if what I said above cannot
apply.
> >
> > > but in order to make it all work properly, I will try to do
> > > stress test with 32-bit platform later.
> >
> > I use QEMU/KVM x86_32 VM, 6GB RAM, and a kernel with HIGHMEM64G enabled
and an
> > openSUSE Tumbleweed 32 distro. I've heard that Debian too provides an
x86_32
> > distribution.
> >
> > > Since it targets for the next cycle
> > > 6.2, I will do a full stress test in the next following weeks.
> > >
> > > Thanks,
> > > Gao Xiang
> > >
> >
> > Thanks,
> >
> > Fabio
> >
>