[PATCH V6 0/6] Intel memory b/w monitoring support
From: Vikas Shivappa
Date: Thu Mar 10 2016 - 18:34:00 EST
This patch series has two preparatory patches for CQM followed by four
MBM patches. The patches are based on tip perf/core.
Thanks to Thomas and PeterZ for the feedback on V5; I have tried to
address it in this version.
Memory bandwidth monitoring (MBM) provides the OS/VMM a way to monitor
bandwidth from one level of cache to another. The current patches
support L3 external bandwidth monitoring.
It supports both 'local bandwidth' and 'total bandwidth' monitoring for
the socket. Local bandwidth measures the amount of data sent through
the memory controller on the socket and total b/w measures the total
system bandwidth.
The tasks are associated with a Resource Monitoring ID (RMID) just like
in CQM, and the OS uses an MSR write to indicate the RMID of the task
during scheduling.
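As a rough illustration of the RMID association, the sketch below composes
the value that would be written to the association MSR at context switch.
It is not taken from the patches: the MSR number, the 10-bit RMID field and
the placement of the class-of-service ID in the upper 32 bits are
assumptions based on the SDM description of IA32_PQR_ASSOC, and
pqr_assoc_value() is a hypothetical helper name.

```c
#include <stdint.h>

/* Assumed from the SDM description of IA32_PQR_ASSOC; not from the patches. */
#define PQR_ASSOC_MSR  0xC8FU
#define PQR_RMID_MASK  0x3FFULL  /* assumed 10-bit RMID field in the low bits */

/*
 * Hypothetical helper: build the 64-bit value for the association MSR.
 * The RMID goes in the low bits and the class-of-service ID (closid) in
 * bits 63:32. RMID bits beyond the assumed field width are dropped.
 */
static uint64_t pqr_assoc_value(uint32_t rmid, uint32_t closid)
{
	return ((uint64_t)closid << 32) | ((uint64_t)rmid & PQR_RMID_MASK);
}
```

In the kernel the actual write would happen in the scheduling path with a
wrmsr on PQR_ASSOC_MSR; the helper above only shows how the value is laid
out under the stated assumptions.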
Extending the cache quality of service monitoring (CQM), we add two more
events to the perf infrastructure:
intel_cqm_llc/local_bytes - bytes sent through local socket memory controller
intel_cqm_llc/total_bytes - total L3 external bytes sent
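Because both events report byte counts, a consumer can derive bandwidth
from two readings taken some time apart. A minimal sketch, assuming the
readings are cumulative byte counts and the caller measures the elapsed
time itself (bytes_per_second() is a hypothetical helper, not part of the
patches):

```c
#include <stdint.h>

/*
 * Hypothetical helper: average bandwidth in bytes per second between two
 * readings of a cumulative byte count such as total_bytes or local_bytes.
 * Returns 0.0 for a non-positive interval or a reading that went backwards.
 */
static double bytes_per_second(uint64_t bytes_start, uint64_t bytes_end,
			       double elapsed_seconds)
{
	if (elapsed_seconds <= 0.0 || bytes_end < bytes_start)
		return 0.0;
	return (double)(bytes_end - bytes_start) / elapsed_seconds;
}
```

For example, readings of 1000 and 3000 bytes taken 2 seconds apart give
an average of 1000 bytes per second.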
Changes in V6:
The following changes were made as per the feedback:
- Fixed the cleanup code for cqm and mbm and separated the cleanup for
them.
- Fixed a few changelogs.
- Removed the bw related events and related code, as the total bytes can
just be used to measure the b/w.
- Fixed some of the init code and changed the counter overflow handling
code to follow the perf conventions.
- Made changes to be consistent in the use of enum vs. #defines.
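The overflow handling mentioned above concerns a hardware counter that is
narrower than 64 bits and therefore wraps. A minimal sketch of a
wrap-safe delta, assuming a 24-bit counter and at most one wrap between
readings; this is an illustration of the general technique, not the code
in patch 6/6, and counter_delta() and MBM_CNTR_WIDTH_ASSUMED are
hypothetical names:

```c
#include <stdint.h>

#define MBM_CNTR_WIDTH_ASSUMED 24  /* assumed counter width in bits */

/*
 * Hypothetical helper: difference between two raw readings of a counter
 * that is 'width' bits wide. Unsigned subtraction followed by masking to
 * the counter width gives the right result even if the counter wrapped
 * once between 'prev' and 'cur'.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t cur, unsigned int width)
{
	uint64_t mask = (width >= 64) ? ~0ULL : ((1ULL << width) - 1);

	return (cur - prev) & mask;
}
```

The delta is only meaningful if the counter is read often enough that it
cannot wrap more than once between readings, which is why a periodic
timer is needed alongside this kind of arithmetic.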
Changes in V5:
As per Thomas' feedback, made the changes below:
- Fixed the memory leak and notifier leak in cqm init and also made it a
separate patch.
- Changed the mbm patch to use topology_max_packages to count the max
packages rather than online packages.
- Removed the unnecessary out: label and goto in 0003.
- Fixed the restarting of the timer when the event list is empty.
- Also fixed the incorrect usage of a mutex in timer context.
Changes in V4:
The V4 version of MBM is almost a complete rewrite of the prior
versions. It seemed the best way to address all of Thomas' earlier
comments.
[PATCH 1/6] x86/perf/intel/cqm: Fix cqm handling of grouping events
[PATCH 2/6] x86/perf/intel/cqm: Fix cqm memory leak and notifier leak
[PATCH 3/6] x86/mbm: Intel Memory B/W Monitoring enumeration and init
[PATCH 4/6] x86/mbm: Memory bandwidth monitoring event management
[PATCH 5/6] x86/mbm: RMID Recycling MBM changes
[PATCH 6/6] x86/mbm: Add support for MBM counter overflow handling