To this last point, it might be more reasonable to map in a page that
contained a new structure with a stable ABI, which mirrored some of
the task_struct information, and likely other useful information as
needs are identified in the future. In any case, it would be hard
to beat a single memory read for performance.
Cache-coloring and kernel bookkeeping effects could be minimized if this
were provided as an mmapped page from a device driver, used only by
applications that care. This does work somewhat contrary to the idea of
getting support into glibc, unless glibc used this capability only when
asked to through some sort of environment variable or other run-time
configuration.
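
As a rough userspace sketch of what an application would do with such a
page: the struct layout, its field names and the helper functions below
are all hypothetical, and a temporary file stands in for the device
driver, since no such driver exists in this discussion. The only point
is that, once the page is mapped, fetching a value is one aligned load.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stable-ABI layout; field names are illustrative only. */
struct task_export {
    uint32_t abi_version;  /* bumped only on incompatible changes */
    uint32_t size;         /* number of valid bytes in this structure */
    int32_t  pid;
    int32_t  last_cpu;
};

/* Stand-in for the driver (assumption): a one-page temporary file
 * that holds a filled-in struct task_export at offset 0. */
static int make_fake_export_fd(int32_t pid, int32_t cpu)
{
    char path[] = "/tmp/task_export_XXXXXX";
    long page = sysconf(_SC_PAGESIZE);
    struct task_export t = {
        .abi_version = 1,
        .size = (uint32_t)sizeof(struct task_export),
        .pid = pid,
        .last_cpu = cpu,
    };
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);  /* the file lives only as long as the descriptor */
    assert(page > 0 && sizeof t <= (size_t)page);
    if (ftruncate(fd, (off_t)page) != 0 ||
        pwrite(fd, &t, sizeof t, 0) != (ssize_t)sizeof t) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Map the exported page read-only; returns NULL on failure. */
static const volatile struct task_export *map_export(int fd)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* After setup, reading a field is a single memory read. */
static int32_t export_last_cpu(const volatile struct task_export *t)
{
    return t->last_cpu;
}
```

An application would call map_export() once at startup and then use
export_last_cpu() wherever it would otherwise make a system call. A
real driver would hand out the descriptor via open() on its device
node instead of make_fake_export_fd().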
Well, if every process had a page of its own, what would the context
switch overhead be?
For processes, zero; for threads, quite high on x86, because you
would need per-CPU page tables. Doing that would be extremely
nasty because you would potentially need to allocate a new
set of page tables every time the process is scheduled to a new
CPU it hasn't run on before.
My reference was more to his suggestion of keeping a second version of
task_struct for export. That would require that everything in
task_struct that is changed on switch_to and should be exported
also be changed in the other version.
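
To make the cost concrete, here is a small userspace simulation. The
two structs are hypothetical, heavily reduced stand-ins rather than the
kernel's definitions, and mirror_on_switch() is an invented name for
whatever hook the switch path would need. Every exported field that
changes at switch time needs its own store into the mirror.

```c
#include <stdint.h>

/* Hypothetical reduced stand-in for the internal structure. */
struct task_internal {
    int32_t  pid;
    int32_t  cpu;
    uint64_t switches;      /* times scheduled in (illustrative counter) */
    void    *kernel_stack;  /* internal only, never exported */
};

/* Hypothetical exported mirror with only the public fields. */
struct task_mirror {
    int32_t  pid;
    int32_t  cpu;
    uint64_t switches;
};

/* Simulated switch hook: update the internal state, then repeat each
 * exported change in the mirror. Any field forgotten here goes stale. */
static void mirror_on_switch(struct task_internal *next, int32_t cpu,
                             struct task_mirror *out)
{
    next->cpu = cpu;
    next->switches++;

    out->pid = next->pid;
    out->cpu = next->cpu;
    out->switches = next->switches;
}
```

The extra stores are cheap one by one, but they sit on the context
switch path and have to be kept in sync by hand whenever a new field
is exported.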