On Thursday 19 June 2008 16:59:50 Hidehiro Kawai wrote:
> When a process loads a kernel module, __stop_machine_run() is called, and
> it calls sched_setscheduler() to give the newly created kernel threads the
> highest priority. However, the process may lack CAP_SYS_NICE, which is
> required for sched_setscheduler() to increase the priority. For example,
> SystemTap loads its module with only CAP_SYS_MODULE. In this case,
> sched_setscheduler() returns -EPERM, and then BUG() is called.
Hi Hidehiro,
Nice catch. This can happen in the current code; it just doesn't
BUG().
> Failure of sched_setscheduler() wouldn't be a real problem, so this
> patch just ignores it.
Well, it can mean that the stop_machine blocks indefinitely. Better
than a BUG(), but we should aim higher.
> Or, should we give the CAP_SYS_NICE capability temporarily?
I don't think so. The temporary capability would be visible from another
thread, which in theory should never see something random. Worse, another
thread could change it.
How's this?
sched_setscheduler: add a flag to control access checks
Hidehiro Kawai noticed that sched_setscheduler() can fail in
stop_machine: it calls sched_setscheduler() from insmod, which can
have CAP_SYS_MODULE without CAP_SYS_NICE.
This simply introduces a flag to allow us to disable the capability
checks for internal callers (this is simpler than splitting the
sched_setscheduler() function, since it loops checking permissions).
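
To make the idea concrete, here is a rough userspace sketch of the pattern
described above: one core routine takes a flag that says whether to apply
the capability check, and two thin wrappers select it. Everything named
mock_* is hypothetical and exists only for illustration; it is not the
kernel's task_struct, not the real sched_setscheduler() signature, and not
the code from the patch itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task; NOT the kernel's task_struct. */
struct mock_task {
	bool has_cap_sys_nice;	/* models holding CAP_SYS_NICE */
	int rt_priority;	/* higher value means higher priority */
};

/*
 * Hypothetical core routine. 'check' is the flag: when true, raising the
 * target's priority requires the caller to hold the capability; when
 * false, the permission test is skipped (for trusted internal callers).
 */
static int mock_setscheduler_core(const struct mock_task *caller,
				  struct mock_task *target,
				  int prio, bool check)
{
	if (check && prio > target->rt_priority && !caller->has_cap_sys_nice)
		return -EPERM;

	target->rt_priority = prio;
	return 0;
}

/* Path for requests made on behalf of userspace: always checked. */
static int mock_setscheduler(const struct mock_task *caller,
			     struct mock_task *target, int prio)
{
	return mock_setscheduler_core(caller, target, prio, true);
}

/* Path for internal callers such as stop_machine: checks disabled. */
static int mock_setscheduler_nocheck(const struct mock_task *caller,
				     struct mock_task *target, int prio)
{
	return mock_setscheduler_core(caller, target, prio, false);
}
```

In this model, a caller with has_cap_sys_nice == false (the insmod case)
gets -EPERM from mock_setscheduler() when it tries to raise a thread's
priority, while mock_setscheduler_nocheck() succeeds. Nothing is granted to
the caller itself, so other threads never see a temporarily raised
capability.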