Re: [PATCH] ipmi_si_intf: use usleep_range() instead of busy looping

From: Corey Minyard
Date: Thu Aug 01 2019 - 13:40:09 EST


On Wed, Jul 10, 2019 at 07:22:21AM -0700, Tejun Heo wrote:
> Hello,
>
> On Tue, Jul 09, 2019 at 06:01:44PM -0500, Corey Minyard wrote:
> > > I'm really not sure "carefully tuned" is applicable on indefinite busy
> > > looping.
> >
> > Well, yeah, but other things were tried and this was the only thing
> > we could find that worked. That was before the kind of SMP stuff
> > we have now, though.
>
> I see.
>
> > > We can go for shorter timeouts for sure but I don't think this sort of
> > > busy looping is acceptable. Is your position that this must be a busy
> > > loop?
> >
> > Well, no. I want something that provides as high a throughput as
> > possible and doesn't cause scheduling issues. But that may not be
> > possible. Screwing up the scheduler is a lot worse than slow IPMI
> > firmware updates.
> >
> > How short can the timeouts be and avoid issues?
>
> We first tried msleep(1) and that was too slow even for sensor reading
> making it take longer than 50s. With the 100us-200us sleep, it got
> down to ~5s which was good enough for our use case and the cpu /
> scheduler impact was still mostly negligible. I can't tell for sure
> without testing but going significantly below 100us is likely to
> become visible pretty quickly.

I spent some time looking at this. Without the patch, I
measured around 3.5ms to send/receive a get device ID message
and uses 100% of the CPU on a core.

I measured your patch, it slowed it down to around 10.5ms
per message, which is not good. Though it did just use a
few percent of the core.

I wrote some code to auto-adjust the timer. It settled on
a delay around 35us, which gave 4.7ms per message, which is
probably acceptable, and used around 40% of the CPU. If
I use any timeout (even a 0-10us range) it won't go below
4ms per message.

The process is running at nice 19 priority, so it won't
have a significant effect on other processes from a priority
point of view. It may still be hitting the scheduler locks
pretty hard, though. But I played with things quite a bit
and the behavior or the management controller is too
erratic to set a really good timeout. Maybe other ones
are better, don't know.

One other option we have is that the driver has something
called "maintenance mode". If it detect that you have
reset the management controller or are issuing firmware
commands, it will modify timeout behavior. It can also
be activated manually. I could also make it switch to
just calling schedule instead of delaying when in that
mode.

The right thing it do is complain bitterly to vendors who
build hardware that has to be polled. But besides that,
I'm thinking the maintenance mode is the thing to do.
It will also change behavior if you reset the management
controller, but only for 30 seconds or so. Does that
work?

-corey

>
> We can also take a hybrid approach where we busy poll w/ 1us udelay
> upto, say, fifty times and then switch to sleeping poll.
>
> Are there some tests which can be used to verify the cases which may
> get impacted by these changes?
>
> Thanks.
>
> --
> tejun