Re: [PATCH v5 3/3] perf x86: Exposing an Uncore unit to PMON for Intel Xeon server platform

From: Liang, Kan
Date: Wed Feb 12 2020 - 15:58:56 EST




On 2/12/2020 12:31 PM, Sudarikov, Roman wrote:
On 11.02.2020 23:14, Greg KH wrote:
On Tue, Feb 11, 2020 at 02:59:21PM -0500, Liang, Kan wrote:

On 2/11/2020 1:57 PM, Greg KH wrote:
On Tue, Feb 11, 2020 at 10:42:00AM -0800, Andi Kleen wrote:
On Tue, Feb 11, 2020 at 09:15:44AM -0800, Greg KH wrote:
On Tue, Feb 11, 2020 at 07:15:49PM +0300, roman.sudarikov@xxxxxxxxxxxxxxx wrote:
+static ssize_t skx_iio_mapping_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct pmu *pmu = dev_get_drvdata(dev);
+	struct intel_uncore_pmu *uncore_pmu =
+		container_of(pmu, struct intel_uncore_pmu, pmu);
+
+	struct dev_ext_attribute *ea =
+		container_of(attr, struct dev_ext_attribute, attr);
+	long die = (long)ea->var;
+
+	return sprintf(buf, "0000:%02x\n", skx_iio_stack(uncore_pmu, die));
If "0000:" is always the "prefix" of the output of this file, why have
it at all as you always know it is there?

I think Roman only tested with the BIOS configured as single-segment, so he
hard-coded the segment number here.

I'm not sure whether Roman can run some tests with a multi-segment BIOS. If not, I
think we should at least print a warning here.

What is ever going to cause that to change?
I think it's just to make it a complete PCI address.
Is that what this really is? If so, it's not a "complete" pci address,
is it? If it is, use the real pci address please.
I think we don't need a complete PCI address here. The attr is to disclose
the mapping information between die and PCI BUS. Segment:BUS should be good
enough.
"good enough" for today, but note that you can not change the format of
the data in the file in the future, you would have to create a new file.
So I suggest at least try to future-proof it as much as possible if you
_know_ this could change.

Just use the full pci address, there's no reason not to, otherwise it's
just confusing.

thanks,

greg k-h
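
The point above about the file format being fixed once it ships can be illustrated from the consumer side. The following is only a sketch of how a user-space tool might parse the "Segment:Bus" string strictly; parse_segment_bus() is a hypothetical helper, not part of the patch or of any existing tool. A parser written this way would reject a later change of format, which is why a format change would need a new file.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): parse "ssss:bb" in hex,
 * optionally followed by one newline. Returns 0 on success, -1 if the
 * string does not match that exact layout.
 */
static int parse_segment_bus(const char *s, unsigned int *segment,
			     unsigned int *bus)
{
	int consumed = 0;

	if (sscanf(s, "%4x:%2x%n", segment, bus, &consumed) != 2)
		return -1;

	/* Allow a single trailing newline, as sysfs show() output has. */
	if (s[consumed] == '\n')
		consumed++;

	/* Anything else after the bus field is treated as a format mismatch. */
	return s[consumed] == '\0' ? 0 : -1;
}
```

With this sketch, "0000:7f\n" parses to segment 0 and bus 0x7f, while a longer string such as "0000:7f:-1:-1" is rejected.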
Hi Greg,

Yes, the "Segment:Bus" pair is enough to distinguish between different Root ports.

I think Greg is suggesting that we use the full PCI address here.

Hi Greg,

Several devices may be connected to an IIO stack, so there is no full PCI address for the IIO stack itself.
I don't think we can list all of the devices in the same IIO stack with their full PCI addresses here either. It's not necessary, and it would only increase maintenance overhead.

I think we may have two options here.

Option 1: Roman's proposal. The format of the file is "Segment:Bus". For the foreseeable future, the format doesn't need to be changed.
E.g. $ cat /sys/devices/uncore_<type>_<pmu_idx>/die0
0000:7f

Option 2: Use the full PCI address format, but use -1 to indicate an invalid field.
E.g. $ cat /sys/devices/uncore_<type>_<pmu_idx>/die0
0000:7f:-1:-1

Should we use the format in option 2?
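
To make the two options concrete, here is a minimal user-space sketch of the two output formats. The function names are made up for illustration and use snprintf() rather than the kernel's sysfs show() path; only the format strings reflect the options above.

```c
#include <stdio.h>

/* Option 1: "Segment:Bus", for example "0000:7f". */
static int format_segment_bus(char *buf, size_t len,
			      unsigned int segment, unsigned int bus)
{
	return snprintf(buf, len, "%04x:%02x", segment, bus);
}

/*
 * Option 2: full-address layout, with -1 standing in for the device
 * and function fields, which have no valid value for an IIO stack.
 */
static int format_full_with_placeholders(char *buf, size_t len,
					 unsigned int segment,
					 unsigned int bus)
{
	return snprintf(buf, len, "%04x:%02x:-1:-1", segment, bus);
}
```

For segment 0 and bus 0x7f, the first helper produces "0000:7f" and the second produces "0000:7f:-1:-1".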

Thanks,
Kan


Please see the changes below which are to address all previous comments.

Thanks,
Roman

diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 96fca1ac22a4..f805fbdbbe81 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -3616,15 +3616,22 @@ skx_iio_mapping_visible(struct kobject *kobj, struct attribute *attr, int die)
 static ssize_t skx_iio_mapping_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
+	struct pci_bus *bus = NULL;
 	struct pmu *pmu = dev_get_drvdata(dev);
 	struct intel_uncore_pmu *uncore_pmu =
 		container_of(pmu, struct intel_uncore_pmu, pmu);
+	int pmu_idx = uncore_pmu->pmu_idx;

 	struct dev_ext_attribute *ea =
 		container_of(attr, struct dev_ext_attribute, attr);
 	long die = (long)ea->var;

-	return sprintf(buf, "0000:%02x\n", skx_iio_stack(uncore_pmu, die));
+	do {
+		bus = pci_find_next_bus(bus);
+	} while (pmu_idx--);
+
+	return sprintf(buf, "%04x:%02x\n", pci_domain_nr(bus),
+			skx_iio_stack(uncore_pmu, die));
 }

 static int skx_msr_cpu_bus_read(int cpu, u64 *topology)
@@ -3691,10 +3698,7 @@ static int skx_iio_get_topology(struct intel_uncore_type *type)
 	return 0;
 }

-static struct attribute *uncore_empry_attr;
-
 static struct attribute_group skx_iio_mapping_group = {
-	.attrs		= &uncore_empry_attr,
 	.is_visible	= skx_iio_mapping_visible,
 };

@@ -3729,7 +3733,8 @@ static int skx_iio_set_mapping(struct intel_uncore_type *type)
 		return -ENOMEM;
 	}
 	for (die = 0; die < uncore_max_dies(); die++) {
-		sprintf(buf, "node%ld", die);
+		sprintf(buf, "die%ld", die);
+		sysfs_attr_init(&eas[die].attr.attr);
 		eas[die].attr.attr.name = kstrdup(buf, GFP_KERNEL);
 		if (!eas[die].attr.attr.name) {
 			ret = -ENOMEM;
@@ -3752,6 +3757,7 @@ static int skx_iio_set_mapping(struct intel_uncore_type *type)
 	kfree(eas);
 	kfree(attrs);
 	kfree(type->topology);
+	type->attr_update = NULL;

 	return ret;
 }