Re: [PATCH 21/21] x86/intel_rdt/mbm: Handle counter overflow
From: Thomas Gleixner
Date: Tue Jul 11 2017 - 11:22:47 EST

On Mon, 10 Jul 2017, Luck, Tony wrote:
> On Fri, Jul 07, 2017 at 08:50:40AM +0200, Thomas Gleixner wrote:
> > Aside of that, are you really serious about serializing the world and
> > everything on a single global mutex?
>
> It would be nice to not do that, but there are challenges. At
> any instant someone else might run:
>
> # rmdir /sys/fs/resctrl/{some_control_group}
>
> and blow away the control group and all the monitor groups under
> it.
>
> Someone else might do:
>
> # echo 0 > /sys/devices/system/cpu/cpu{N}/online
>
> where "N" is the last online cpu in a domain, which will
> blow away an rdt_domain structure and ask kernfs to remove
> some monitor files from every monitor directory.
>
>
> If we change how we handle rdt_domains to
>
> 1) Not delete them when last CPU goes away (and re-use them
> if they come back)
> 2) Have a safe way to search rdt_resource.domains for a domain
> that we know is there even though another may be in the middle
> of being added
>
> Then we could probably make:
>
> $ cat /sys/fs/resctrl/ ... /llc_occupancy
>
> etc. not need to grab the mutex. We'd still need something
> to protect against a cross-processor interrupt getting in the
> middle of the access to IA32_QM_EVTSEL/IA32_QM_CTR and for
> MBM counters to serialize access to mbm_state ... but it would
> be a lot finer granularity.

Thanks for the explanation. Yes, that would be nice, but we can start off
with the global mutex and think about the scalability issue once we have
the functionality itself under control.

Thanks,
tglx
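
For illustration only, the finer-grained scheme described in the quoted
text might look roughly like the userspace sketch below. It is not the
kernel code: a pthread mutex per domain stands in for whatever would
protect the IA32_QM_EVTSEL/IA32_QM_CTR access pair and mbm_state, the
registers are emulated with plain struct fields, and every name
(domain_sketch, mbm_state_sketch, hw_advance, mbm_read, NUM_RMIDS,
CNTR_WIDTH) is hypothetical. The 24-bit counter width and the assumption
that the counter wraps at most once between two reads are also
assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define NUM_RMIDS  4    /* hypothetical; tiny for illustration */
#define CNTR_WIDTH 24   /* assumed hardware counter width in bits */
#define CNTR_MASK  ((UINT64_C(1) << CNTR_WIDTH) - 1)

/* Stand-in for the per-RMID software state ("mbm_state" in the mail). */
struct mbm_state_sketch {
	uint64_t prev;    /* last raw counter value seen */
	uint64_t chunks;  /* running total across wraparounds */
};

/* Stand-in for one domain; the lock is per domain, not global. */
struct domain_sketch {
	pthread_mutex_t lock;        /* protects all fields below */
	uint32_t evtsel;             /* emulates IA32_QM_EVTSEL */
	uint64_t hw_ctr[NUM_RMIDS];  /* emulates the hardware counters */
	struct mbm_state_sketch state[NUM_RMIDS];
};

void domain_init(struct domain_sketch *d)
{
	memset(d, 0, sizeof(*d));
	pthread_mutex_init(&d->lock, NULL);
}

/* Advance the emulated hardware counter, wrapping at CNTR_WIDTH bits. */
void hw_advance(struct domain_sketch *d, uint32_t rmid, uint64_t delta)
{
	assert(rmid < NUM_RMIDS);
	pthread_mutex_lock(&d->lock);
	d->hw_ctr[rmid] = (d->hw_ctr[rmid] + delta) & CNTR_MASK;
	pthread_mutex_unlock(&d->lock);
}

/*
 * Select-then-read under the domain lock, so nothing can slip in between
 * the two steps, and fold the wraparound-safe delta into the total.
 */
uint64_t mbm_read(struct domain_sketch *d, uint32_t rmid)
{
	uint64_t raw, total;

	assert(rmid < NUM_RMIDS);
	pthread_mutex_lock(&d->lock);
	d->evtsel = rmid;              /* "write IA32_QM_EVTSEL" */
	raw = d->hw_ctr[d->evtsel];    /* "read IA32_QM_CTR" */
	/* Masked subtraction is correct only for at most one wrap. */
	d->state[rmid].chunks += (raw - d->state[rmid].prev) & CNTR_MASK;
	d->state[rmid].prev = raw;
	total = d->state[rmid].chunks;
	pthread_mutex_unlock(&d->lock);
	return total;
}
```

Readers of different domains would then only contend on their own
domain's lock. The sketch deliberately leaves out the harder half of the
problem raised above, namely keeping the domain itself alive while a
reader is using it.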