RE: [RFC PATCH v2 0/5] Add hardware prefetch driver for A64FX and Intel processors

From: tarumizu.kohei@xxxxxxxxxxx
Date: Thu Nov 18 2021 - 01:22:21 EST


I'm sorry for the late reply.

> So put all those justifications at the beginning of your 0th message
> when you send a patchset so that it is clear to reviewers *why* you're
> doing this. The "why" is the most important - everything else comes
> after.

I understand. The next time we send a patchset, we will put these
justifications at the beginning of our 0th message.

> Well, how many prefetcher drivers will be there?
>
> On x86 there will be one per vendor, so 2-3 the most…

Currently, we plan to support only two drivers, one for Intel and one
for A64FX. Even if we support other vendors later, the number will
probably increase only a little.

>> We don't think this is a good way. If there is any other suitable
>> way, we would like to change it.

What we meant here is that our current approach is not good. Therefore,
we would like to reconsider the file structure along with the changes
to the interface specification.

> Also, as dhansen points out, we have already
>
> /sys/devices/system/cpu/cpu*/cache
>
> so all those knobs belong there on x86.

Intel MSR and A64FX have hardware prefetchers that affect the L1d cache
and the L2 cache. Would it match your intention to create a prefetcher
directory under each cache directory, as below?

/sys/devices/system/cpu/cpu*/cache/
index0/prefetcher/enable
index2/prefetcher/enable

The above example presumes that the L1d cache is at index0 (level: 1,
type: Data) and the L2 cache is at index2 (level: 2, type: Unified).
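
Because the index numbering is only a presumption, userspace would
probably need to look up the right directory rather than hard-code
index0 and index2. The following is a minimal sketch of that lookup,
assuming the existing "level" and "type" files under each indexN
directory; the function name find_cache_indexes is made up for
illustration.

```python
import os


def find_cache_indexes(cache_dir):
    """Map (level, type) to an index directory name.

    cache_dir is a directory such as /sys/devices/system/cpu/cpu0/cache.
    Entries that lack readable "level" or "type" files are skipped.
    """
    found = {}
    for entry in sorted(os.listdir(cache_dir)):
        if not entry.startswith("index"):
            continue
        base = os.path.join(cache_dir, entry)
        try:
            with open(os.path.join(base, "level")) as f:
                level = int(f.read().strip())
            with open(os.path.join(base, "type")) as f:
                cache_type = f.read().strip()
        except (OSError, ValueError):
            continue
        found[(level, cache_type)] = entry
    return found
```

On a system that matches the presumption above, looking up
(1, "Data") in the result for cpu0 would give "index0" and
(2, "Unified") would give "index2".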

> Also, I think that shoehorning all these different cache architectures
> and different prefetcher knobs which are available from each CPU, into a
> common sysfs hierarchy is going to cause a lot of ugly ifdeffery if not
> done right.
>
> Some caches will have control A while others won't - they will have
> control B so people will wonder why control A works on box B_a but not
> on box B_b...
>
> So we have to be very careful what we expose to userspace because it
> becomes an ABI which we have to support for an indefinite time.

To avoid shoehorning different prefetchers into a common sysfs
hierarchy, we would like to represent them in separate hierarchies.

Intel MSR has three types of prefetchers, and we represent "Hardware
Prefetcher" as "hwpf", "Adjacent Cache Line Prefetcher" as "aclpf",
and "IP Prefetcher" as "ippf". Each of these prefetchers has one
controllable parameter, "disable".

A64FX has one type of prefetcher, and we represent it as "hwpf". This
prefetcher has three parameters: "disable", "dist" and "strong".

The following table shows which caches are affected by the combination
of prefetcher and parameter.

| Cache affected | Combination ([prefetcher]/[parameter]) |
|----------------|---------------------------------------|
| Intel MSR L1d | hwpf/disable, ippf/disable |
| Intel MSR L2 | hwpf/disable, aclpf/disable |
| A64FX L1d | hwpf/disable, hwpf/dist, hwpf/strong |
| A64FX L2 | hwpf/disable, hwpf/dist, hwpf/strong |

Does it make sense to create sysfs directories as below?

* For Intel MSR
/.../index0/prefetcher/hwpf/enable
/.../index0/prefetcher/ippf/enable
/.../index2/prefetcher/hwpf/enable
/.../index2/prefetcher/aclpf/enable

* For A64FX
/.../index[0,2]/prefetcher/hwpf/enable
/.../index[0,2]/prefetcher/hwpf/dist
/.../index[0,2]/prefetcher/hwpf/strong
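
To make the proposal easier to review, the layout above can be written
out as full paths. The following is only a sketch: it assumes that
"/.../" stands for the /sys/devices/system/cpu/cpu*/cache directory
mentioned earlier and that L1d and L2 are at index0 and index2. None of
these files exist in a released kernel, and the names PROPOSED and
proposed_paths are made up for illustration.

```python
# Proposed (not yet existing) sysfs layout, per vendor:
# index directory -> prefetcher name -> attribute files.
PROPOSED = {
    "intel": {
        "index0": {"hwpf": ["enable"], "ippf": ["enable"]},
        "index2": {"hwpf": ["enable"], "aclpf": ["enable"]},
    },
    "a64fx": {
        "index0": {"hwpf": ["enable", "dist", "strong"]},
        "index2": {"hwpf": ["enable", "dist", "strong"]},
    },
}


def proposed_paths(vendor, cpu=0):
    """Return the full proposed attribute paths for one CPU."""
    root = f"/sys/devices/system/cpu/cpu{cpu}/cache"
    paths = []
    for index, prefetchers in PROPOSED[vendor].items():
        for prefetcher, attrs in prefetchers.items():
            for attr in attrs:
                paths.append(f"{root}/{index}/prefetcher/{prefetcher}/{attr}")
    return paths
```

With this layout, Intel would get four files per CPU and A64FX would
get six.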

> Also, if you're going to give the xmrig example, then we should involve
> the xmrig people and ask them whether the stuff you're exposing to
> userspace is good for their use case.

We would like to ask them once the interface specification has been
settled to some extent.