Re: [PATCH] mm: cma: fix corruption cma_sysfs_alloc_pages_count

From: John Hubbard
Date: Wed Mar 24 2021 - 15:46:20 EST


On 3/24/21 12:20 PM, Minchan Kim wrote:
struct cma_stat's lifespan for cma_sysfs is different with
struct cma because kobject for sysfs requires dynamic object
while CMA is static object[1]. When CMA is initialized,
it couldn't use slab to allocate cma_stat since slab was not
initialized yet. Thus, it allocates the dynamic object
in subsys_initcall.

However, the cma allocation can happens before subsys_initcall
then, it goes crash.

Dmitry reported[2]:

..
[ 1.226190] [<c027762f>] (cma_sysfs_alloc_pages_count) from [<c027706f>] (cma_alloc+0x153/0x274)
[ 1.226720] [<c027706f>] (cma_alloc) from [<c01112ab>] (__alloc_from_contiguous+0x37/0x8c)
[ 1.227272] [<c01112ab>] (__alloc_from_contiguous) from [<c1104af9>] (atomic_pool_init+0x7b/0x126)
[ 1.233596] [<c1104af9>] (atomic_pool_init) from [<c0101d69>] (do_one_initcall+0x45/0x1e4)
[ 1.234188] [<c0101d69>] (do_one_initcall) from [<c1101141>] (kernel_init_freeable+0x157/0x1a6)
[ 1.234741] [<c1101141>] (kernel_init_freeable) from [<c0a27fd1>] (kernel_init+0xd/0xe0)
[ 1.235289] [<c0a27fd1>] (kernel_init) from [<c0100155>] (ret_from_fork+0x11/0x1c)

This patch moves those statistic fields of cma_stat into struct cma
and introduces cma_kobject wrapper to follow kobject's rule.

At the same time, it fixes other routines based on suggestions[3][4].

[1] https://lore.kernel.org/linux-mm/YCOAmXqt6dZkCQYs@xxxxxxxxx/
[2] https://lore.kernel.org/linux-mm/fead70a2-4330-79ff-e79a-d8511eab1256@xxxxxxxxx/
[3] https://lore.kernel.org/linux-mm/20210323195050.2577017-1-minchan@xxxxxxxxxx/
[4] https://lore.kernel.org/linux-mm/20210324010547.4134370-1-minchan@xxxxxxxxxx/

Reported-by: Dmitry Osipenko <digetx@xxxxxxxxx>
Tested-by: Dmitry Osipenko <digetx@xxxxxxxxx>
Suggested-by: Dmitry Osipenko <digetx@xxxxxxxxx>
Suggested-by: John Hubbard <jhubbard@xxxxxxxxxx>
Suggested-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
---
I belive it's worth to have separate patch rather than replacing
original patch. It will also help to merge without conflict
since we already filed other patch based on it.
Strictly speaking, separating fix part and readbility part
in this patch would be better but it's gray to separate them
since most code in this patch was done while we were fixing
the bug. Since we don't release it yet, I hope it will work.
Otherwise, I can send a replacement patch inclucing all of
changes happend until now with gathering SoB.

If we still have a choice, we should not merge a patch that has a known
serious problem, such as a crash. That's only done if the broken problematic
patch has already been committed to a tree that doesn't allow rebasing,
such as of course the main linux.git.

Here, I *think* it's just in linux-next and mmotm, so we still are allowed
to fix the original patch.

thanks,
--
John Hubbard
NVIDIA