Re: 32-bit Amlogic (ARM) SoC: kernel BUG in kfree()

From: Liang Yang
Date: Mon Mar 25 2019 - 06:03:15 EST


Hi Martin,

On 2019/3/23 5:07, Martin Blumenstingl wrote:
Hi Matthew,

On Thu, Mar 21, 2019 at 10:44 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

On Thu, Mar 21, 2019 at 09:17:34PM +0100, Martin Blumenstingl wrote:
Hello,

I am experiencing the following crash:
------------[ cut here ]------------
kernel BUG at mm/slub.c:3950!

if (unlikely(!PageSlab(page))) {
BUG_ON(!PageCompound(page));

You called kfree() on the address of a page which wasn't allocated by slab.

I have traced this crash to the kfree() in meson_nfc_read_buf().
my observation is as follows:
- meson_nfc_read_buf() is called 7 times without any crash, the
kzalloc() call returns 0xe9e6c600 (virtual address) / 0x29e6c600
(physical address)
- the eighth time meson_nfc_read_buf() is called, the kzalloc() call returns
0xee39a38b (virtual address) / 0x2e39a38b (physical address) and the
final kfree() crashes
- changing the size in the kzalloc() call from PER_INFO_BYTE (= 8) to
PAGE_SIZE works around that crash
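
One detail in the addresses above can be checked mechanically: 0xe9e6c600 is 8-byte aligned, while 0xee39a38b is not. A minimal user-space sketch of that check follows, assuming kmalloc() objects are at least 8-byte aligned. The helper name and the alignment constant are illustrative assumptions, not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed minimum alignment of a kmalloc() object; illustrative only. */
#define ASSUMED_KMALLOC_MINALIGN 8u

/*
 * Hypothetical helper: returns true if addr is a multiple of the
 * assumed minimum alignment, i.e. it could plausibly be a slab object.
 */
static bool looks_like_kmalloc_ptr(uint32_t addr)
{
	return (addr & (ASSUMED_KMALLOC_MINALIGN - 1u)) == 0;
}
```

With the two addresses from the report, the check passes for 0xe9e6c600 and fails for 0xee39a38b. That would be consistent with the allocator's own bookkeeping having been overwritten, rather than with a valid allocation.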

I suspect you're doing something which corrupts memory. Overrunning
the end of your allocation or something similar. Have you tried KASAN
or even the various slab debugging (eg redzones)?
KASAN is not available on 32-bit ARM. there was some progress last
year [0] but it didn't make it into mainline. I tried to make the
patches apply again and got it to compile (and my kernel is still
booting) but I have no idea if it's still working. for anyone
interested, my patches are here: [1] (I consider this a HACK because I
don't know anything about the code which is being touched in the
patches, I only made it compile)

SLAB debugging (redzones) was a great hint, thank you very much for
that Matthew! I enabled:
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y
and with that I now get "BUG kmalloc-64 (Not tainted): Redzone
overwritten" (a larger kernel log extract is attached).
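
For context, a redzone is a run of guard bytes with a known pattern placed next to an object and verified later. A minimal user-space sketch of that idea follows, assuming an 8-byte payload like PER_INFO_BYTE. The struct, the helper names, the guard size and the pattern bytes are all illustrative and do not mirror SLUB's actual object layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OBJ_SIZE     8     /* payload size, same value as PER_INFO_BYTE */
#define REDZONE_SIZE 8     /* guard bytes after the payload (assumed) */
#define REDZONE_BYTE 0xcc  /* guard pattern chosen for this sketch */

struct guarded_obj {
	uint8_t payload[OBJ_SIZE];
	uint8_t redzone[REDZONE_SIZE];
};

/* Zero the payload (like kzalloc) and arm the guard bytes. */
static void guarded_init(struct guarded_obj *o)
{
	memset(o->payload, 0, sizeof(o->payload));
	memset(o->redzone, REDZONE_BYTE, sizeof(o->redzone));
}

/* Simulate a device writing n bytes starting at the payload. */
static void fake_dma_write(struct guarded_obj *o, size_t n)
{
	if (n > sizeof(*o))
		n = sizeof(*o);	/* stay inside the simulated object */
	memset((uint8_t *)o, 0x5a, n);
}

/* Return true if every guard byte still holds the pattern. */
static bool redzone_intact(const struct guarded_obj *o)
{
	size_t i;

	for (i = 0; i < sizeof(o->redzone); i++)
		if (o->redzone[i] != REDZONE_BYTE)
			return false;
	return true;
}
```

In this sketch a write of exactly 8 bytes leaves the guard intact, while any longer write is detected, which is the kind of condition the "Redzone overwritten" report points at.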

I'm starting to wonder if the NAND controller (hardware) writes more
than 8 bytes.
some context: the "info" buffer allocated in meson_nfc_read_buf is
then passed to the NAND controller IP (after using dma_map_single).

Liang, how does the NAND controller know that it only has to send
PER_INFO_BYTE (= 8) bytes when called from meson_nfc_read_buf? all
other callers of meson_nfc_dma_buffer_setup (which passes the info
buffer to the hardware) are using (nand->ecc.steps * PER_INFO_BYTE)
bytes?

NFC_CMD_N2M and CMDRWGEN are different commands. CMDRWGEN needs the ECC page size (1KB or 512B) and the number of pages (2, 4, 8, ...) to be set, so it uses PER_INFO_BYTE (= 8) bytes for each ECC page.
I have never used NFC_CMD_N2M to transfer data before, because it is very inefficient. I ran an experiment with the attached patch and found no overwrite on my Meson AXG platform.
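
To make the CMDRWGEN sizes concrete, here is a minimal sketch of the arithmetic, assuming the info buffer holds PER_INFO_BYTE bytes per ECC page as described above. info_bytes_for() is a hypothetical helper for illustration, not a function in the driver.

```c
#include <stddef.h>

#define PER_INFO_BYTE 8	/* info bytes per ECC page, as in the driver */

/*
 * Hypothetical helper: size of the info buffer needed for a transfer
 * that covers ecc_pages ECC pages.
 */
static size_t info_bytes_for(unsigned int ecc_pages)
{
	return (size_t)ecc_pages * PER_INFO_BYTE;
}
```

For example, a 2KB NAND page with a 1KB ECC page size gives 2 ECC pages and therefore 16 info bytes, which matches the (nand->ecc.steps * PER_INFO_BYTE) used by the other callers.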

Martin, I would appreciate it very much if you would try the attachment on your meson m8b platform.


Regards
Martin


[0] https://lore.kernel.org/patchwork/cover/913212/
[1] https://github.com/xdarklight/linux/tree/arm-kasan-hack-v5.1-rc1

diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c
old mode 100644
new mode 100755
index e858d58..905ef39
--- a/drivers/mtd/nand/raw/meson_nand.c
+++ b/drivers/mtd/nand/raw/meson_nand.c
@@ -527,11 +527,12 @@ static void meson_nfc_dma_buffer_release(struct nand_chip *nand,
 static int meson_nfc_read_buf(struct nand_chip *nand, u8 *buf, int len)
 {
 	struct meson_nfc *nfc = nand_get_controller_data(nand);
-	int ret = 0;
+	int ret = 0, i;
 	u32 cmd;
 	u8 *info;
 
-	info = kzalloc(PER_INFO_BYTE, GFP_KERNEL);
+	info = kzalloc(2 * PER_INFO_BYTE, GFP_KERNEL);
+	memset(info, 0xFD, 2 * PER_INFO_BYTE);
 	ret = meson_nfc_dma_buffer_setup(nand, buf, len, info,
 					 PER_INFO_BYTE, DMA_FROM_DEVICE);
 	if (ret)
@@ -543,6 +544,12 @@ static int meson_nfc_read_buf(struct nand_chip *nand, u8 *buf, int len)
 	meson_nfc_drain_cmd(nfc);
 	meson_nfc_wait_cmd_finish(nfc, 1000);
 	meson_nfc_dma_buffer_release(nand, len, PER_INFO_BYTE, DMA_FROM_DEVICE);
+
+	for (i = 0; i < 2 * PER_INFO_BYTE; i++){
+		printk("0x%x ", info[i]);
+	}
+	printk("\n");
+
 	kfree(info);
 
 	return ret;