Re: [PATCH v4 3/6] libnvdimm, acpi, nfit: Add bus level dsm mask for pass thru.

From: Dan Williams
Date: Wed Jul 05 2017 - 12:36:20 EST


On Wed, Jul 5, 2017 at 9:24 AM, Jerry Hoemann <jerry.hoemann@xxxxxxx> wrote:
> On Tue, Jul 04, 2017 at 01:37:43PM -0700, Dan Williams wrote:
>> On Tue, Jul 4, 2017 at 1:08 PM, Jerry Hoemann <jerry.hoemann@xxxxxxx> wrote:
>> > On Sat, Jul 01, 2017 at 01:46:03PM -0700, Dan Williams wrote:
>> >> On Sat, Jul 1, 2017 at 1:38 PM, Jerry Hoemann <jerry.hoemann@xxxxxxx> wrote:
>> >> > On Sat, Jul 01, 2017 at 01:10:31PM -0700, Dan Williams wrote:
>> >> >> On Sat, Jul 1, 2017 at 1:08 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>> >> >> > On Sat, Jul 1, 2017 at 12:58 PM, Jerry Hoemann <jerry.hoemann@xxxxxxx> wrote:
>> >> >> >> On Fri, Jun 30, 2017 at 08:55:22PM -0700, Dan Williams wrote:
>> >> >> >>
>> >> >> >> ...
>
> ...
>
>> >> >> >>>
>> >> >> >>> This drops function number 0 which userspace has no need to call.
>> >> >> >>
>> >> >> >> Actually I like to call function 0. It's an excellent test when
>> >> >> >> modifying the code path as its a no side effects function whose output
>> >> >> >> is known in advance and instantly recognizable. I also use it when
>> >> >> >> testing new firmware.
>> >> >> >>
>> >> >> >> What is the downside to allowing it? What bad things happen?
>> >> >> >
>> >> >> > It allows implementations to bypass the standardization process and
>> >> >> > ship new root DSMs. It's always possible to patch the kernel locally
>> >> >> > for development, so I see no reason to ship this capability globally.
>> >> >
>> >> > I don't understand this comment, but I think your next comment
>> >> > essentially says to disregard this comment?
>> >>
>> >> Yes, sorry.
>> >>
>> >> >> Actually, just the discovery portion does not lead to this leak, but
>> >> >> it's redundant when we have the 'dsm_mask' sysfs attribute.
>> >> >
>> >> > No. The generation of the mask in sysfs is not done by
>> >> > executing the code in acpi_nfit_ctl. One of the reasons I call
>> >> > function 0 is to test changes I am making to the ioctl path itself.
>> >> > The sysfs has nothing to do with that path and cannot be used
>> >> > to serve this purpose.
>> >> >
>> >> > And since the content of sysfs has been edited it also can not be
>> >> > used as a basic test of firmware.
>> >> >
>> >> > What is the downside to allowing the calling of function 0?
>> >>
>> >> It needlessly expands the kernel ABI. I would suggest, if you want to
>> >
>> > No. It is not needless. It is not an ABI extension.
>> > Same goes for the override feature.
>>
>> If the need is testing then we have a tools/testing/nvdimm for that.
>
>
>
>> Of course it's an ABI extension, it allows userspace to discover DSM
>> function numbers the kernel didn't know about at compile time.
>
>
> A modification to a library or kernel that changes the results of a
> function (or system call) doesn't necessarily break (or extend) an ABI.
> An obvious example is that of a random number generator function.
> A library/kernel is completely free to change the implementation
> of the random number generator (and the values it returns)
> without breaking the ABI provided all other rules of ABI preservation
> are followed.
>
> Now let's look at the problem at hand. The pass thru mechanism has very
> little semantic overhead. Fill in the nd_cmd_pkg as described in ndctl.h,
> call the ioctl w/ argument with ND_CMD_CALL, and the kernel will marshal
> up the arguments, call the DSM and return the results. The values
> of nd_command could be any value and it is for the DSM to either accept
> or reject the input argument. I wrote this interface and this is how
> I defined it.
>
> The user application is not changing regardless of whether the kernel applies
> a mask to the passed in nd_command argument. The data structures are not
> changing at either source level or binary level. The calling convention is not
> changing. No object file changes are required. Nothing related to ABI
> preservation is impacted. The only question is whether the application
> of a mask to special case function 0 breaks/extends the ABI.
>
> It turns out that this point doesn't really matter as your position
> is invalid either way.
>
> The argument for this not being an ABI breakage/extension:
>
> A DSM may or may not implement a function index for any given value of N.
> So, a correctly written application must take into account that for
> any value of N, the DSM may return error or not. Preserving an ABI
> doesn't require the library/kernel preserve incorrect application
> behavior.
>
> Now, assume that the special casing of function zero does constitute
> a breakage/extension of the ABI:
>
> I'm not the one wishing to special case function 0, you are.
> So, to this point I say, Dan please don't make a needless extension to
> the ABI. It's an extension and you've provided no valid reason
> for making it.
>
> Your argument to disallow function zero is invalid.
>
> There is nothing harmful per se to allow function 0. All DSMs that return
> non zero are required to have it. By excluding it, you actually create the
> impression that the underlying DSM is violating the DSM specification.

This goes back to the original reasoning for pushing back on the
override for the leaf-level _DSM methods. Specifically the ability to
bypass the standardization process to ship vendor-specific behavior.
Now, the other side of the argument is that if the next spec adds new
_DSMs a simple override can enable them. I am more sympathetic to the
override for the leaf / DIMM level because those _DSMs truly are
DIMM-vendor specific, but the root device is not. Also, none of the
root-level DSMs added for 6.2 are in any way critical for proper
operation of the platform, and I do not see any bus-level
functionality on the horizon that we need to aggressively pre-enable.
It was a mistake to use _DSM for common root-level functionality, and
we shouldn't double down on that mistake by allowing unfettered
definition of new interfaces. NVDIMM is not so special that it needs
to bypass the standard ACPI-to-kernel development pipeline.
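
For readers following the thread, here is a minimal sketch of the behavior
being debated: a bus-level mask applied to the nd_command value of a
pass-through call, with function 0 dropped from that mask. This is not the
patch itself. The helper names (bus_dsm_mask, bus_func_allowed) and the
64-bit mask width are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive the mask used for pass-through from a
 * mask of implemented root functions, clearing bit 0 so that
 * function 0 is never forwarded.
 */
static uint64_t bus_dsm_mask(uint64_t fw_mask)
{
	return fw_mask & ~UINT64_C(1);
}

/*
 * Hypothetical helper: decide whether a pass-through request for
 * function number 'func' would be forwarded to the _DSM.
 */
static bool bus_func_allowed(uint64_t mask, uint64_t func)
{
	if (func >= 64)	/* outside the assumed 64-bit mask */
		return false;
	return (mask >> func) & 1;
}

/* Small self-check of the two helpers above. */
static void bus_mask_self_check(void)
{
	uint64_t mask = bus_dsm_mask(0x7);	/* functions 0, 1, 2 reported */

	assert(!bus_func_allowed(mask, 0));	/* function 0 is filtered */
	assert(bus_func_allowed(mask, 1));
	assert(bus_func_allowed(mask, 2));
	assert(!bus_func_allowed(mask, 3));	/* never reported */
}
```

Under this sketch the disagreement above reduces to whether bit 0 is cleared
in bus_dsm_mask(), and whether functions the kernel did not know about at
compile time may ever appear in the mask.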