Re: [PATCH 3/3] platform/x86: Intel PMT Crashlog capability driver

From: Alexander Duyck
Date: Mon Sep 21 2020 - 13:34:06 EST


On Mon, Sep 21, 2020 at 9:07 AM Alexey Budankov
<alexey.budankov@xxxxxxxxxxxxxxx> wrote:
>
>
> On 21.09.2020 16:36, Alexander Duyck wrote:
> > On Sat, Sep 19, 2020 at 1:01 AM Alexey Budankov
> > <alexey.budankov@xxxxxxxxxxxxxxx> wrote:
> >>
> >> Hi,
> >>
> >> Thanks for the patches.
> >>
> >> On 11.09.2020 22:45, David E. Box wrote:
> >>> From: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
> >>>
> >>> Add support for the Intel Platform Monitoring Technology crashlog
> >>> interface. This interface provides a few sysfs values to allow for
> >>> controlling the crashlog telemetry interface as well as a character driver
> >>> to allow for mapping the crashlog memory region so that it can be accessed
> >>> after a crashlog has been recorded.
> >>>
> >>> This driver is meant to only support the server version of the crashlog
> >>> which is identified as crash_type 1 with a version of zero. Currently no
> >>> other types are supported.
> >>>
> >>> Signed-off-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
> >>> Signed-off-by: David E. Box <david.e.box@xxxxxxxxxxxxxxx>
> >>> ---
> >>> .../ABI/testing/sysfs-class-pmt_crashlog | 66 ++
> >>> drivers/platform/x86/Kconfig | 10 +
> >>> drivers/platform/x86/Makefile | 1 +
> >>> drivers/platform/x86/intel_pmt_crashlog.c | 588 ++++++++++++++++++
> >>> 4 files changed, 665 insertions(+)
> >>> create mode 100644 Documentation/ABI/testing/sysfs-class-pmt_crashlog
> >>> create mode 100644 drivers/platform/x86/intel_pmt_crashlog.c
> >>
> >> <SNIP>
> >>
> >>> +
> >>> +/*
> >>> + * devfs
> >>> + */
> >>> +static int pmt_crashlog_open(struct inode *inode, struct file *filp)
> >>> +{
> >>> + struct crashlog_entry *entry;
> >>> + struct pci_driver *pci_drv;
> >>> + struct pmt_crashlog_priv *priv;
> >>> +
> >>> + if (!capable(CAP_SYS_ADMIN))
> >>> + return -EPERM;
> >>
> >> Will not this above still block access to /dev/crashlogX for admin_group users
> >> in case root configured access e.g. similar to this:
> >>
> >> ls -alh /dev/
> >> crw-rw----. 1 root admin_group 1, 9 Sep 15 18:28 crashlogX
> >>
> >> If yes then that capable() check is probably superfluous and
> >> should be avoided in order not to block access to PMT data.
> >>
> >> Could you please clarify or comment?
> >>
> >> Thanks,
> >> Alexei
> >
> > Actually this should probably be updated to "if (!perfmon_capable())"
> > instead. The telemetry driver code originally had the CAP_SYS_ADMIN
> > check and it probably makes more sense to limit this user-wise to the
> > same users who have access to performon.
>
> Indeed, it is currently perfmon_capable() for performance part but it is unclear
> if it should be the same for crashlog since it's more like a debugging thing.
> It appears it all depends on usage models implemented in a user space tools e.g. Perf.
>
> However there is an important use case that is not covered
> neither by perfmon_capable() nor by capable(CAP_SYS_ADMIN).
>
> It is access and usage of PMT features in cluster or cloud environments by
> unprivileged users that don't have root credentials. The users however can run
> software tools (Perf, VTune etc.) once installed and configured by root.
>
> Even though Perf tool can be configured to use use CAP_PERFMON [1] the tool binary
> should still reside on a file system supporting xattr to convey capabilities
> into processes implementing monitoring.
>
> Unfortunately NFSv3 which is quite popular to be used for storing and sharing
> software tooling in large production systems doesn't support capabilities yet.
>
> Thus, capabilities approach still has limitation in HPC clusters and cloud environments
> and for PMT support this limitation has a chance to be lifted if
> suitable access control mechanism would be designed from the very beggining.
>
> Actually I tried to change group ownership of /dev and /sys directories and files, being root,
> and it appeared that for dev file it is possible:
> ls -alh /dev/
> crw-rw----. 1 root admin_group 1, 9 Sep 15 18:28 telem<X>
>
> So if e.g. perf tool having CAP_PERFMON and configured like:
>
> -rwxr-x---. 1 root admin_group 24M Mar 5 2020 perf.cap
>
> would mmap /dev/telem<X> to provide uncore performance insights
> to admin_group users only access control based on user/group/others ownership
> would suffice without capabilities requirement.
>
> Still haven't had chance to verify it for memory mapped PMT dev files and
> that is why I am asking you guys here.
>
> Alexei
>
> [1] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html#privileged-perf-users-groups

We will have to see. There is a high likelihood this code will go away
if we switch over to binary sysfs attributes for the data. I'm still
working on the rewrite and hope to have something we can review as an
RFC in the next few days.

Thanks.

- Alex