Re: [PATCH] x86/mm/cpa: avoid wbinvd() for PREEMPT_RT_FULL

From: Sebastian Andrzej Siewior
Date: Wed Jan 25 2017 - 12:07:27 EST


On 2017-01-25 17:24:31 [+0100], Peter Zijlstra wrote:
> On Wed, Jan 25, 2017 at 05:12:17PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2017-01-21 15:19:15 [+0100], Peter Zijlstra wrote:
> > > On Fri, Jan 20, 2017 at 11:42:57PM +0100, John Ogness wrote:
> > > > Although wbinvd() is faster than flushing many individual pages, it
> > > > blocks the memory bus for "long" periods of time (>100us), thus
> > > > directly causing unusually large latencies for PREEMPT_RT_FULL. For
> > > > 1024 pages, flushing those pages individually can take up to 2200us,
> > > > but the task remains fully preemptible during that time.
> > > >
> > > > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > > > Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> > > > Signed-off-by: John Ogness <john.ogness@xxxxxxxxxxxxx>
> > > > ---
> > > > arch/x86/mm/pageattr.c | 5 +++++
> > > > 1 file changed, 5 insertions(+)
> > > >
> > > > diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
> > > > index e3353c9..a182477 100644
> > > > --- a/arch/x86/mm/pageattr.c
> > > > +++ b/arch/x86/mm/pageattr.c
> > > > @@ -214,7 +214,12 @@ static void cpa_flush_array(unsigned long *start, int numpages, int cache,
> > > > int in_flags, struct page **pages)
> > > > {
> > > > unsigned int i, level;
> > > > +#ifdef CONFIG_PREEMPT_RT_FULL
> > > > + /* wbinvd() causes ugly latencies, avoid it */
> > > > + unsigned long do_wbinvd = 0;
> > >
> > > Arguably we should do the same for CONFIG_PREEMPT and possibly even
> > > always, esp. when considering CAT.
> >
> > So you want to see this patch again with CONFIG_PREEMPT instead of
> > CONFIG_PREEMPT_RT_FULL and also targeting lkml?
>
> Patches should go to LKML irrespective of PREEMPT_RT_FULL or not.
>
> But yes, I think it makes sense for PREEMPT as well.
>
> > I don't get quite the link between wbindv and CAT (assuming it stands
> > for Intel's Cache Allocation support). But if you want unconditionally
> > want to drop that wbinvd because it is bad for another !RT usecase, fine
> > by me :)
>
> Yes, that CAT.
>
> I suspect; but I couldn't find the SDM making any statement what so ever
> on this; that when you do wbinvd you kill the _entire_ cache,
> irrespective of our CAT mask.
>
> This means that a random task doing wbinvd(), even if it is 'isolated'
> with a CAT mask, will affect the performance of all other tasks on that
> cache domain.
>
> IOW, your X11 running in some 'system' group will affect your RT
> workload on the isolated CPU by killing its cache.

Based on this I would say it makes sense to get rid of wbinvd. If wbinvd
would respect the CAT mask then it would be noted in the SDM.

Sebastian