Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix

From: Chris Metcalf
Date: Tue Sep 24 2013 - 15:27:38 EST


On 9/23/2013 4:57 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2013-09-23 at 13:59 -0400, Chris Metcalf wrote:
>> We just came up against this on tilegx with a customer bug report in a
>> PREEMPT environment. On tile, %tp is a GPR that points to the percpu area.
>> The following seems to be the right abstraction -- though I'd also argue
>> that letting barrier() clobber not just memory, but %tp, might be a better
>> solution, but it's not clear what the best way is to do per-architecture
>> overrides of per-compiler definitions like barrier(). See also the ARM v7
>> code, which has to do something similar, though their percpu pointer is
>> not a GPR, which changes the tradeoffs somewhat.
> Hrm, if I read correctly what you did is that you read "tp" into another
> register *and* also mark that action as clobbering the top int on the stack ?
>
> I don't quite get what the stack clobber brings you here and how it works
> around the fact that gcc might still cache that copy of tp into another
> register accross preempt_enable/disable...

Correct. The idea is that since gcc knows that the outputs of the asm are
dependent on memory, when it sees a memory clobber it knows it has to
instantiate the asm again to get a new value of the asm outputs. gcc will know
that any copies of tp in other registers are also dependent on memory and
will drop all of them across the barrier() as well.


> It's hard to tell with gcc ... the best I've had so far as an option was
> something that would mark my per-cpu register (r13) *itself* as clobbered...

Well, as I said above, that would be better, but it requires providing an
alternate definition of barrier() that is per-architecture, not just
per-compiler. If there's interest in pursuing a solution like that, it
would be technically somewhat better; in particular, with PREEMPT_NONE,
you could treat the "tp" register int as locally scoped in an inline, and
the compiler wouldn't have to reload it after function calls. Presumably
we'd need to pick an asm header that could provide an arch_barrier_clobbers
string that would be added to barrier() for gcc if it were defined.

--
Chris Metcalf, Tilera Corp.
http://www.tilera.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/