RE: [RFC 00/14] Dynamic Kernel Stacks

From: David Laight
Date: Mon Mar 18 2024 - 11:53:50 EST


From: Pasha Tatashin
> Sent: 18 March 2024 15:31
>
> On Mon, Mar 18, 2024 at 11:19 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > On Mon, Mar 18, 2024 at 11:09:47AM -0400, Pasha Tatashin wrote:
> > > The TLB load is going to be exactly the same as today, we already use
> > > small pages for VMA mapped stacks. We won't need to have extra
> > > flushing either, the mappings are in the kernel space, and once pages
> > > are removed from the page table, no one is going to access that VA
> > > space until that thread enters the kernel again. We will need to
> > > invalidate the VA range only when the pages are mapped, and only on
> > > the local cpu.
> >
> > No; we can pass pointers to our kernel stack to other threads. The
> > obvious one is a mutex; we put a mutex_waiter on our own stack and
> > add its list_head to the mutex's waiter list. I'm sure you can
> > think of many other places we do this (eg wait queues, poll(), select(),
> > etc).
>
> Hm, it means that stack is sleeping in the kernel space, and has its
> stack pages mapped and invalidated on the local CPU, but access from
> the remote CPU to that stack pages would be problematic.
>
> I think we still won't need IPI, but VA-range invalidation is actually
> needed on unmaps, and should happen during context switch so every
> time we go off-cpu. Therefore, what Brian/Andy have suggested makes
> more sense instead of kernel/enter/exit paths.

I think you'll need to broadcast an invalidate.
Consider:
CPU A: task allocates extra pages and adds something to some list.
CPU B: accesses that data and maybe modifies it.
       Some page-table walk sets up the TLB.
CPU A: task detects the modification, removes the item from the list,
       collapses back the stack and sleeps.
       Stack pages freed.
CPU A: task wakes up (on the same cpu for simplicity).
       Goes down a deep stack and puts an item on a list.
       Different physical pages are allocated.
CPU B: accesses the associated KVA.
       It had better not have a cached TLB entry.

Doesn't that need an IPI?

Freeing the pages is much harder than allocating them.

David
