Re: hrtimer become inaccurate with RT patch

From: Thomas Gleixner
Date: Mon Jul 02 2018 - 16:13:51 EST

On Mon, 2 Jul 2018, John Stultz wrote:
> On Mon, Jul 2, 2018 at 2:34 AM, gengdongjiu <gengdongjiu@xxxxxxxxxx> wrote:
> > Hi Thomas/Anna/John,
> >
> > Recently I found that the hrtimer becomes inaccurate when an RT
> > process runs on the same CPU core and the kernel has the preempt_rt
> > patch applied.
> > The Linux kernel version is v4.1.46, and the preempt_rt patch is
> > patch-4.1.46-rt52.patch.
> > I know that in the preempt_rt environment the interrupt handlers no
> > longer run in interrupt context but in process context, so that the RT
> > process will not be interrupted. But if the hrtimer also runs in
> > process context, the timer is useless because it is inaccurate. So I
> > want to ask whether this is expected behavior, and whether it is
> > reasonable to move the timer IRQ handling to a thread?
> I've not looked at the PREEMPT_RT code in a long time, but years ago
> there was a tension in that there is not an easy way to track ownership
> of timers. Thus timers all fired at the same priority as the hrtimer
> irq thread. This thread could be moved up or down in priority, but the
> problem was all timers would fire with the same priority. So either
> the thread priority was so high that a low-priority process could
> generate a bunch of timers which would interrupt higher priority
> tasks, or the thread priority was lower, so a high priority task could
> block all timers.

Yeah, we had that long ago. It was complex and nasty.

> There was some handwavy talk of trying to keep per-process timer
> lists, so the hrtimer irq could still be in irq context, but the firing
> logic wouldn't do anything but mark the task as runnable, and the
> actual timer firing logic would run before we eventually run the task
> (in proper rt priority order), in a fashion similar to signals. But I'm
> not sure if any attempts were made in that direction. I also think it
> was an open question if there's any logic in the kernel that depends on
> strict in-order kernel timer processing, so it's possible there could
> be odd inversion issues where high priority timer logic is waiting on
> or expecting lower priority timers to fire, etc., so it's probably an
> area of research.

Well, the main issues are actually the signal based posix-timers. The
problem is not to keep track of them, the problem is which of the threads
to wake up for which timer. I've had experimental code which kinda worked,
but ran into issues with the signal masking and the horrors of sighand
lock. Definitely more research required for that. It might have become
simpler, but still sighand lock cannot be taken from hard interrupt context
on RT.
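
For anyone who wants to put a number on the inaccuracy reported at the top
of the thread, the following is a minimal user-space sketch, not anything
taken from the RT patch itself. It sleeps until absolute CLOCK_MONOTONIC
deadlines with clock_nanosleep() and records how late each wake-up was,
roughly in the manner of cyclictest. The function names, the interval and
the loop count are assumptions made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Convert a timespec to a single nanosecond count. */
int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

/*
 * Hypothetical helper: sleep until `loops` absolute deadlines spaced
 * `interval_ns` apart and return the worst observed wake-up latency in
 * nanoseconds, or -1 if a clock call fails or the sleep is interrupted.
 */
int64_t max_wakeup_latency_ns(int64_t interval_ns, int loops)
{
    struct timespec next, now;
    int64_t max_lat = 0;

    assert(interval_ns > 0 && loops >= 0);

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (int i = 0; i < loops; i++) {
        /* Advance the absolute deadline by one interval. */
        int64_t deadline = ts_to_ns(&next) + interval_ns;
        next.tv_sec = (time_t)(deadline / NSEC_PER_SEC);
        next.tv_nsec = (long)(deadline % NSEC_PER_SEC);

        /* clock_nanosleep() returns an error number, not -1, on failure. */
        if (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL) != 0)
            return -1;

        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;

        /* Latency is how far past the requested deadline we woke up. */
        int64_t lat = ts_to_ns(&now) - ts_to_ns(&next);
        if (lat > max_lat)
            max_lat = lat;
    }
    return max_lat;
}
```

Calling max_wakeup_latency_ns(1000000, 1000) from a SCHED_OTHER task while
a SCHED_FIFO task spins on the same CPU should make the effect visible.
The sketch only measures wake-up latency as seen from user space and says
nothing about where in the kernel the delay comes from.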