Re: [PATCH 1/1] sparc64: unify thread stack sizing and add explicit 32KB stack
From: David Laight
Date: Thu Jun 18 2026 - 05:01:58 EST
On Thu, 18 Jun 2026 00:29:59 -0700
Tony Rodriguez <unixpro1970@xxxxxxxxx> wrote:
> On 6/17/26 10:53 PM, Andreas Larsson wrote:
> > On 2026-06-16 21:58, David Laight wrote:
> >> On Tue, 16 Jun 2026 16:18:33 +0200
> >> Andreas Larsson <andreas.larsson@xxxxxxxxxxx> wrote:
> >>
> >>> On 2026-05-19 09:57, Tony Rodriguez wrote:
> >>>> This patch restructures the thread‑stack sizing logic into a single
> >>>> if / elif / else chain and introduces an explicit 32KB kernel stack
> >>>> for SPARC64. The previous implementation relied on nested conditionals
> >>>> and PAGE_SHIFT‑dependent behavior, which produced 8KB or 16KB stacks
> >>>> depending on configuration. SPARC64 requires a larger,
> >>>> architecture‑specific stack due to its trapframe size, register‑window
> >>>> behavior, and deeper call paths.
> >>>>
> >>>> A reproducible failure case occurs when usbcore is enabled: USB hub
> >>>> enumeration (usb_new_device(), hub_port_connect(), PM/QoS helpers)
> >>>> allocates large on‑stack structures and recurses through several
> >>>> layers of device‑model code. Combined with SPARC64’s trapframe and
> >>>> register‑window overhead, this reliably exhausts a 16KB stack and
> >>>> results in early‑boot panics. A 32KB stack eliminates these failures.
> >>>>
> >>>> The new logic is:
> >>>> SPARC64:
> >>>> THREAD_SIZE = 4 * PAGE_SIZE (32KB)
> >>>> THREAD_SHIFT = PAGE_SHIFT + 2 (log₂(32KB))
> >>>> THREAD_SIZE_ORDER = 2 (4 contiguous pages)
> >>> Yes
> >>>
> >>>> Non‑SPARC64 with PAGE_SHIFT == 13:
> >>>> Retains the existing 16KB stack behavior
> >>>> Fallback:
> >>>> Retains the existing 8KB stack behavior
> >>> No, not to my understanding, see comments below.
> >>>
> >>>> Signed-off-by: Tony Rodriguez <unixpro1970@xxxxxxxxx>
> >>>> ---
> >>>> arch/sparc/include/asm/thread_info_64.h | 28 ++++++++++++-------------
> >>>> 1 file changed, 14 insertions(+), 14 deletions(-)
> >>>>
> >>>> diff --git a/arch/sparc/include/asm/thread_info_64.h b/arch/sparc/include/asm/thread_info_64.h
> >>>> index c8a73dff27f8..6b12a2b66385 100644
> >>>> --- a/arch/sparc/include/asm/thread_info_64.h
> >>>> +++ b/arch/sparc/include/asm/thread_info_64.h
> >>>> @@ -99,13 +99,20 @@ struct thread_info {
> >>>> #define FAULT_CODE_BLKCOMMIT 0x10 /* Use blk-commit ASI in copy_page */
> >>>> #define FAULT_CODE_BAD_RA 0x20 /* Bad RA for sun4v */
> >>>>
> >>>> -#if PAGE_SHIFT == 13
> >>>> -#define THREAD_SIZE (2*PAGE_SIZE)
> >>>> -#define THREAD_SHIFT (PAGE_SHIFT + 1)
> >>>> -#else /* PAGE_SHIFT == 13 */
> >>>> -#define THREAD_SIZE PAGE_SIZE
> >>>> -#define THREAD_SHIFT PAGE_SHIFT
> >>>> -#endif /* PAGE_SHIFT == 13 */
> >>>> +/* thread information allocation */
> >>>> +#ifdef CONFIG_SPARC64
> >>>> + #define THREAD_SIZE (4 * PAGE_SIZE)
> >>>> + #define THREAD_SHIFT (PAGE_SHIFT + 2)
> >>>> + #define THREAD_SIZE_ORDER 2
> >>> As far as I can see, given that this header is included by
> >>>
> >>> #if defined(__sparc__) && defined(__arch64__)
> >>> #include <asm/thread_info_64.h>
> >>> #else
> >>> #include <asm/thread_info_32.h>
> >>> #endif
> >>>
> >>> the code above is the only code that will ever be compiled, while leaving...
> >>>
> >>>> +#elif PAGE_SHIFT == 13
> >>>> + #define THREAD_SIZE (2 * PAGE_SIZE)
> >>>> + #define THREAD_SHIFT (PAGE_SHIFT + 1)
> >>>> + #define THREAD_SIZE_ORDER 1
> >>>> +#else
> >>>> + #define THREAD_SIZE PAGE_SIZE
> >>>> + #define THREAD_SHIFT PAGE_SHIFT
> >>>> + #define THREAD_SIZE_ORDER 0
> >>>> +#endif
> >>> ...this code dead, where the else branch code already was dead (but then
> >>> in two separate else branches).
> >>>
> >>> I'd rather see the else branch here and the else branch below cleaned up
> >>> by a separate patch with a fixup tag for commit 15b9350a177b ("sparc64:
> >>> Only support 4MB huge pages and 8KB base pages.") that as far as I can
> >>> see should have removed the else branch. The else branches were to use
> >>> only one page when the page size was _larger_ than 8 KiB when that was
> >>> an option.
> >> That whole logic is impenetrable.
> >> Why not set the 'desired thread size' in kB, then work out how many
> >> pages that ends up being based on the page size, and finally get the actual
> >> stack size.
> >> I'm not sure, but with vmalloc()ed stacks and 8k pages can't you have 24kB?
> > No, the next step up is 32 KiB as the stack allocation is sized by
> > THREAD_SIZE_ORDER.
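OK, so with a power-of-two order the "desired size -> pages -> actual size"
calculation could look roughly like the userspace sketch below. It is
untested against the kernel, and stack_order() and stack_bytes() are names
made up for illustration, not existing kernel interfaces.

```c
/*
 * Illustration only: hypothetical helpers, not kernel API.
 * Round a wanted stack size up to a power-of-two number of pages,
 * which is how an allocation sized by THREAD_SIZE_ORDER behaves.
 */
static unsigned int stack_order(unsigned long want_bytes, unsigned int page_shift)
{
	unsigned long page_size = 1UL << page_shift;
	/* Whole pages needed to cover the wanted size. */
	unsigned long pages = (want_bytes + page_size - 1) / page_size;
	unsigned int order = 0;

	/* Smallest order with (1 << order) >= pages. */
	while ((1UL << order) < pages)
		order++;
	return order;
}

/* Actual stack size in bytes once the order has been rounded up. */
static unsigned long stack_bytes(unsigned long want_bytes, unsigned int page_shift)
{
	return 1UL << (page_shift + stack_order(want_bytes, page_shift));
}
```

With 8k pages (page_shift of 13) a wanted 24kB needs 3 pages, which rounds
up to order 2, so the actual stack is 32kB.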
> >
> > Cheers,
> > Andreas
> >
>
> After additional testing and debugging on a SPARC64 S7-2 system running
> kernel v7.1-mainline, I've made several important observations regarding
> the USB core stack overflow issue.
>
> 1. The Stack Overflow is Real and Consistent
>
> My initial patch (increasing kernel stack to 32KB) appears to work with
> v7.1-mainline as well. However, the underlying problem remains: the USB
> core's stack usage consistently exceeds the default 16KB limit during
> hub enumeration.
>
> 2. The "Static Analysis vs. Runtime Reality" Contradiction
>
> When I compile the kernel with -fstack-usage to generate .su files, the
> static analysis shows small stack frames for all USB core functions.
>
> For example:
>
> hub_event: 2457 bytes (static)
> hub_activate: 1892 bytes (static)
> usb_control_msg: 1248 bytes (static)
Those aren't that small.
The stack frame for a minimal function seems to be 176 bytes.
While there might be other places that allocate stack, most will be
allocated by the 'save %sp, -nnn, %sp' instruction that rotates the
register window (so the %sp it writes to is different from the one
it reads from).
Should be easy to find in the output of 'objdump -d vmlinux.o'.
(search for function_name.: which matches the '<function_name>:' label
at the start of a function)
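If you want to pull the numbers out mechanically, something like the
sketch below would do. It assumes each disassembly line carries the
instruction as 'save  %sp, -176, %sp' (the layout I'd expect from
objdump, but check your output), and save_frame_size() is a name made up
for this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustration only: hypothetical helper, not part of any tool.
 * Return the frame size in bytes if the line holds an immediate-form
 * "save %sp, -N, %sp", otherwise -1.
 * Frames too large for the 13-bit signed immediate use a register
 * operand instead (e.g. "save %sp, %g1, %sp"); those are reported
 * as -1 and need the preceding instructions to be decoded by hand.
 */
static long save_frame_size(const char *line)
{
	const char *p = strstr(line, "save");
	long n;

	if (!p)
		return -1;
	p += strlen("save");

	/* Expect "%sp" as the source, then a negative immediate. */
	if (sscanf(p, " %%sp , %ld", &n) != 1)
		return -1;
	if (n >= 0)
		return -1;
	return -n;
}
```

Feed it the lines that follow a function label and the first hit is that
function's frame size. It does not check the destination register, so
treat it as a quick filter rather than a disassembler.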
>
> However, my runtime stack tracing shows a dramatically different picture:
>
> STACKTRACE: hub_event():entry: 31856 bytes used
> STACKTRACE: hub_activate():entry: 31680 bytes used
> STACKTRACE: usb_control_msg():entry: 30768 bytes used
31856 - 31680 = 176
31680 - 30768 = 912
Those might match the code being run.
That makes it look like a lot of the problem is much earlier in the call stack.
David