Re: [PATCH RFC 0/6] Implement per-processor data areas for i386.
From: Jeremy Fitzhardinge
Date: Sun Aug 27 2006 - 12:43:48 EST
Arjan van de Ven wrote:
this will be interesting; x86-64 has a nice instruction to help with
this; 32 bit does not... so far conventional wisdom has been that
without the instruction it's not going to be worth it.
Hm, swapgs may be quick, but it isn't very easy to use since it doesn't
stack, and so requires careful handling for recursive kernel entries,
which involves extra tests and conditional jumps. I tried doing
something similar with my earlier patches, but it got all too messy.
Stacking %gs like the other registers turns out pretty cleanly.
When you're benchmarking this please use multiple CPU generations from
different vendors; I suspect this is one of those things that vary
greatly between models
Hm, it seems to me that unless the existing %ds/%es register
save/restores are a significant part of the existing cost of going
through entry.S, adding %gs to the set shouldn't make too much
difference. And I'm not sure about the relative cost of using a %gs
override vs. the normal current_task_info() masking, but I'm assuming
they're at worst equal, with the %gs override having a code-size advantage.
But yes, it definitely needs measurement.
J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/