Re: [PATCH 6.10 000/809] 6.10.3-rc3 review

From: Guenter Roeck
Date: Tue Aug 06 2024 - 21:38:46 EST


On 8/6/24 17:49, James Bottomley wrote:
On Wed, 2024-08-07 at 01:24 +0200, Thomas Gleixner wrote:
Cc+: Helge, parisc ML

We're chasing a weird failure which has been tracked down to the
placement of the division library functions (I assume they are
imported
from libgcc).

See the thread starting at:

https://lore.kernel.org/all/718b8afe-222f-4b3a-96d3-93af0e4ceff1@xxxxxxxxxxxx

On Tue, Aug 06 2024 at 21:25, Vlastimil Babka wrote:
On 8/6/24 19:33, Thomas Gleixner wrote:

So this change adds 16 bytes to __softirq() which moves the
division
functions up by 16 bytes. That's all it takes to make the stupid
go
away....

Heh I was actually wondering if the division is somhow messed up
because
maxobj = order_objects() and order_objects() does a division. Now I
suspect
it even more.

check_slab() calls into that muck, but I checked the disassembly of a
working and a broken kernel and the only difference there is the
displacement offset when the code calculates the call address, but
that's as expected a difference of 16 bytes.

Now it becomes interesting.

I added a unused function after __do_softirq() into the softirq text
section and filled it with ASM nonsense so that it occupies exactly
one
page. That moves $$divoI, which is what check_slab() calls, exactly
one
page forward:

    -0000000041218c70 T $$divoI
    +0000000041219c70 T $$divoI

Guess what happens? If falls on it's nose again.

Now with that ASM gunk I can steer the size conveniently. It works up
to:

    0000000041219c50 T $$divoI

and fails for

    0000000041219c60 T $$divoI
    0000000041219c70 T $$divoI

and works again at

    0000000041219c80 T $$divoI

So just on this, you seem to have proved that only exact multiples of
48 work. In terms of how PA-RISC caching works that's completely nuts
... however, there may be something else at work, like stack frame
alignment.


So I added the following:

+extern void testme(void);
+extern unsigned int testsize;
+
+unsigned int testsize = 192;
+
+void __init testme(void)
+{
+       pr_info("TESTME: %lu\n", PAGE_SIZE / testsize);
+}

called that _before_ mm_core_init() from init/main.c and adjusted my
ASM hack to make $$divoI be at:

    0000000041219c70 T $$divoI

again and surprisingly the output is:

    [    0.000000] softirq: TESTME: 21

OK, why is that surprising? 4096/192 is 21 due to integer rounding.


Problem is that it sometimes wrongly returns 16. But not always.
The surprise may be that it is not consistently wrong, even if
divU is at the same location.

Now I went back to the hppa64 gcc version 12.2.0 again and did the
same ASM gunk adjustment so that $$divoI ends up at the offset 0xc70
in the page and the same happens.

So it's not a compiler dependent problem.

But then I added a testme() call to the error path and get:

[    0.000000] softirq: TESTME: 21
[    0.000000]
=====================================================================
========
[    0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16
size 192 sorder 0

Now what's wrong?

Adding more debug:

[    0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16
size 192 sorder 0 21

where the last '21' is the output of the same call which made maxobj
go
south:

 static int check_slab(struct kmem_cache *s, struct slab *slab)
 {
        int maxobj;
@@ -1386,8 +1388,10 @@ static int check_slab(struct kmem_cache
        maxobj = order_objects(slab_order(slab), s->size);
        if (slab->objects > maxobj) {
-               slab_err(s, slab, "objects %u > max %u",
-                       slab->objects, maxobj);
+               testme();
+               slab_err(s, slab, "objects %u > max %u size %u sorder
%u %u",
+                        slab->objects, maxobj, s->size,
slab_order(slab),
+                        order_objects(slab_order(slab), s->size));
                return 0;
        }
        if (slab->inuse > slab->objects) {

I don't know and I don't want to know TBH...

OK, so you're telling us we have a problem with slab_order on parisc
... that's folio_order, so it smells like a parisc bug with
folio_test_large? Unfortuntely I'm a bit pissed in an airport lounge
on my way to the UK, so I've lost access to my pa test rig and can't
test further for a while.


I think the problem is rather that divU, for some unknown reason, sometimes returns
bad results. The divU implementation in libgcc isn't exactly easy to understand.
There may be some context problem, such as the upper bits of some register not
being cleared under some circumstances. Look at my recent commits into
arch/parisc for examples how similar problems survived for a long time in the
Linux kernel.

I'd strongly suspect a qemu emulation problem if it wasn't for those other
problems. Still, I think it is very likely that the problem lies in qemu,
but we might need a confirmation one way or the other from a test on real
hardware.

Thanks,
Guenter