Re: [patch V2 26/50] signal: Get rid of resched_timer logic

From: Andrei Vagin
Date: Thu Apr 25 2024 - 03:22:54 EST


On Tue, Apr 23, 2024 at 6:48 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> On Tue, Apr 23 2024 at 23:18, Thomas Gleixner wrote:
> > On Fri, Apr 19 2024 at 13:06, Oleg Nesterov wrote:
> >> Today SI_TIMER < 0. We could introduce SI_TIMER_KERNEL > 0 with the minimal
> >> changes, but this can't help because the commit 66dd34ad31e59 allows to send
> >> any siginfo to itself.
>
> Well that predates the __SI_CODE() removal. So I doubt it's required
> today, but what do I know about CRIU.

CRIU needs to restore siginfo-s of pending signals so that a task sees
the same siginfo in a signal handler as it would be without
checkpoint/restore. CRIU will not be affected, if rt_sigqueueinfo denies
any non-negative si_code that is never reported by the kernel.

> >> Otoh, I have no idea how CRIU restores the posix timers. If a process has
> >> a pending blocked SI_TIMER signal, then I guess it actually needs to enqueue
> >> this signal at restore time, but resched_timer will be never true?
> >
> > It can't restore the correct sys_si_private value because that is
> > nowhere exposed to user space.

We are open to ideas how it can be restored properly.

>
> It is exposed via PTRACE_PEEKSIGINFO, but it's useless.

When PTRACE_PEEKSIGINFO was added, it didn't expose it. Then it was
changed by cc731525f26a ("signal: Remove kernel internal si_code magic").

The idea of PTRACE_PEEKSIGINFO is to get a siginfo that a process would
see in a signal handler.

>
> > There is no special treatment for SI_TIMER, so the signal restore might
> > just end up queueing a random extra SI_TIMER signal if there was one
> > pending.
>
> It does. The sys_si_private value is not going to match the timer side
> value and obviously the missing prealloc flag prevents it from trying to
> rearm the timer.
>
> > I checked the CRIU source and it looks like this just "works" by
> > reconstructing and rearming the timer with the last expiry value. As
> > that is in the past it will fire immediately and queue the signal.
>
> It's not necessarily in the past, but it will fire eventually and in the
> case of a blocked signal there will be two SI_TIMER signals queued.
>
> So the patch is not completely wrong except that there is nothing which
> prevents setting sys_si_private via rt_sigqueueinfo(), but that's
> obviously a solvable problem. With that solved the condition:
>
> *resched_timer =
> (first->flags & SIGQUEUE_PREALLOC) &&
> (info->si_code == SI_TIMER) &&
> (info->si_sys_private);
>
> really can be reduced to:
>
> info->code == SI_TIMER && info->si_sys_private
>
> In fact it makes a lot of sense _not_ to allow user space to set
> info->si_sys_private because that's a kernel internal value and should
> never be exposed to user space in the first place.

We can zero out all of them in rt_sigqueueinfo.

Thanks,
Andrei