split up lockdep and syscall related functionality in generic entry code
From: Sven Schnelle
Date: Tue Dec 01 2020 - 03:36:49 EST
I'm currently working on converting s390 to use the generic entry
functionality. So far things are straightforward; there's only one
slight problem. syscall_enter_from_user_mode() sets up lockdep state
and other initial state, and does the syscall entry work at the same
time. This is a problem on s390 because restarting syscalls isn't
as easy there as on x86.
My understanding is that on x86 syscalls are restarted by just rewinding
the program counter and returning to user space, so the instruction causing
the syscall gets executed again.
On s390 this doesn't work, because the syscall number might be hard coded
into the 'svc' instruction, so when the syscall number has to be changed we
would repeat the wrong (old) syscall.
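To make the difference concrete, here is a minimal toy model in plain C. It is
not kernel code: struct toy_cpu, both helper functions and the numbers used
with them are hypothetical and only illustrate what happens when the syscall
number is part of the instruction itself.

```c
#include <stdint.h>

/* Toy model, not kernel code. All names here are hypothetical. */
struct toy_cpu {
	uint16_t insn_nr;	/* number encoded in the 'svc' instruction */
	uint16_t wanted_nr;	/* number the kernel wants to run on restart */
};

/*
 * Restart by rewinding the program counter: user space executes the
 * same instruction again, so the number encoded in it is what runs.
 */
static uint16_t restart_by_rewind(const struct toy_cpu *cpu)
{
	return cpu->insn_nr;
}

/*
 * Restart without going back to user space: the dispatcher is simply
 * called again with whatever number the kernel has chosen.
 */
static uint16_t restart_in_kernel(const struct toy_cpu *cpu)
{
	return cpu->wanted_nr;
}
```

With insn_nr and wanted_nr set to two different arbitrary values,
restart_by_rewind() returns the old number and restart_in_kernel() returns
the new one.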
So we would need functions that only do the work required when switching
from user space to kernel and back, and separate functions which do the system
call tracing and syscall work, and which might be called repeatedly.
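As a rough sketch of that split (a toy model, not the patch itself), the
transition work runs exactly once per entry while the syscall work sits in a
loop. All toy_* names and the counters are hypothetical stand-ins for the real
entry, tracing and exit work.

```c
#include <stdbool.h>

/* Hypothetical counters standing in for real kernel work. */
struct toy_stats {
	int entry_once;		/* user -> kernel transition work */
	int syscall_work;	/* tracing + dispatch, may repeat */
	int exit_once;		/* kernel -> user transition work */
};

struct toy_regs {
	int nr;			/* syscall number to dispatch */
	int restart_nr;		/* number to use when restarting */
	int restarts_left;	/* how many restarts the toy requests */
	bool restart;		/* set when the syscall must run again */
};

static void toy_enter_from_user(struct toy_stats *s) { s->entry_once++; }
static void toy_exit_to_user(struct toy_stats *s) { s->exit_once++; }

/* Repeatable part: may change the number and ask for another round. */
static void toy_do_syscall(struct toy_regs *regs, struct toy_stats *s)
{
	s->syscall_work++;
	regs->restart = false;
	if (regs->restarts_left > 0) {
		regs->restarts_left--;
		regs->nr = regs->restart_nr;
		regs->restart = true;
	}
}

/* One-time entry work, a loop around the syscall work, one-time exit work. */
static void toy_syscall_path(struct toy_regs *regs, struct toy_stats *s)
{
	toy_enter_from_user(s);
	do {
		toy_do_syscall(regs, s);
	} while (regs->restart);
	toy_exit_to_user(s);
}
```

With restarts_left set to 2, toy_syscall_path() bumps entry_once and exit_once
once each and syscall_work three times, and the last round runs with
restart_nr.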
With the attached patch, the s390 code now looks like this:
(I removed some s390-specific stuff here to make the function easier
to read)
__do_syscall is the function which gets called by low level entry.S code:
void noinstr __do_syscall(struct pt_regs *regs)
{
	enter_from_user_mode(regs);	/* sets lockdep state, and other initial stuff */

	/*
	 * functions that need to run with irqs disabled,
	 * but lockdep state and other stuff set up
	 */
	memcpy(&regs->gprs[8], S390_lowcore.save_area_sync, 8 * sizeof(unsigned long));
	memcpy(&regs->int_code, &S390_lowcore.svc_ilc, sizeof(regs->int_code));
	regs->psw = S390_lowcore.svc_old_psw;

	update_timer_sys();

	local_irq_enable();

	regs->orig_gpr2 = regs->gprs[2];

	do {
		regs->flags = _PIF_SYSCALL;
		do_syscall(regs);
	} while (test_pt_regs_flag(regs, PIF_SYSCALL_RESTART));

	exit_to_user_mode();
}
__do_syscall calls do_syscall which does all the syscall work, and this might
be called more than once if PIF_SYSCALL_RESTART is set:
void do_syscall(struct pt_regs *regs)
{
	unsigned long nr = regs->int_code & 0xffff;

	nr = syscall_enter_from_user_mode_work(regs, nr);
	regs->gprs[2] = -ENOSYS;
	if (likely(nr < NR_syscalls)) {
		regs->gprs[2] = current->thread.sys_call_table[nr](
				regs->orig_gpr2, regs->gprs[3],
				regs->gprs[4], regs->gprs[5],
				regs->gprs[6], regs->gprs[7]);
	}
	syscall_exit_to_user_mode1(regs);
}
What do you think about the attached patch? I'm also open to suggestions for
a proper name for syscall_exit_to_user_mode1() ;-)