Re: [PATCH 3/3] tpm: make locality request return value consistent

From: Alexander Steffen
Date: Tue Feb 20 2024 - 13:59:41 EST


On 19.02.2024 21:29, Daniel P. Smith wrote:
On 2/1/24 17:49, Jarkko Sakkinen wrote:
On Wed Jan 31, 2024 at 7:08 PM EET, Daniel P. Smith wrote:
The function tpm_tis_request_locality() is expected to return the locality
value that was requested, or a negative error code upon failure. If it is called
while locality_count of struct tis_data is non-zero, no actual locality request
will be sent. Because the ret variable is initially set to 0, the
locality_count will still get increased, and the function will return 0. For a
caller, this would indicate that locality 0 was successfully requested and not
the state changes just mentioned.

Additionally, the function __tpm_tis_request_locality() provides inconsistent
error codes. It will provide either a failed IO write or a -1 should it have
timed out waiting for locality request to succeed.

This commit changes __tpm_tis_request_locality() to return valid negative error
codes to reflect the reason it fails. It then adjusts the return value check in
tpm_tis_request_locality() to check for a non-negative return value before
incrementing locality_cout. In addition, the initial value of the ret value is
set to a negative error to ensure the check does not pass if
__tpm_tis_request_locality() is not called.

This is way way too abtract explanation and since I don't honestly
understand what I'm reading, the code changes look bunch of arbitrary
changes with no sound logic as a whole.

In more simpler terms, the interface is inconsistent with its return
values. To be specific, here are the sources for the possible values
tpm_tis_request_locality() will return:
1. 0 - 4: _tpm_tis_request_locality() was able to set the locality
2. 0: a locality already open, no locality request made
3. -1: if timeout happens in __tpm_tis_request_locality()
4. -EINVAL: unlikely, return by IO write for incorrect sized write

As can easily be seen, tpm_tis_request_locality() will return 0 for both
a successful(1) and non-successful request(2). And to be explicit for
(2), if tpm_tis_request_locality is called for a non-zero locality and
the locality counter is not zero, it will return 0. Thus, making the
value 0 reflect as success when locality 0 is successfully requested and
as failure when a locality is requested with a locality already open.

There is a potential problem here, but I think it is slightly different from what you describe: Currently, the kernel uses only locality 0, so case 1 and 2 are indistinguishable for the caller. Getting a return value of 0 simply means that the requested locality is now active. The callers don't care whether it had already been active before or not, so it is not a problem that the callers cannot distinguish case 1 and 2, and a return value of 0 always indicates "success".

It might only become a problem once you make the kernel use localities != 0. Then a caller can get either 0 as the return value (if the locality was already active before) or the requested locality, and both values mean "success". In practice, this shouldn't cause any problems as far as I can tell, because all existing callers either check only for failures (negative return values), e.g. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/char/tpm/tpm_tis_core.c?h=v6.8-rc5#n1214, or explicitly request locality 0 and check for a return value of 0, e.g. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/char/tpm/tpm_tis_core.c?h=v6.8-rc5#n750. There is no caller that would be confused by case 2 because it requests an arbitrary locality and always expects that locality to be returned in order to indiciate "success".

Still, such an inconsistency is not nice and should be fixed, but if I read your patch correctly, this is not what it does: In tpm_tis_request_locality(), you initialize ret with -EBUSY. For locality_count != 0, you never assign to ret again and therefore return -EBUSY, even though the locality is active and can be used. The correct fix would be to initialize ret with l, so that no error is returned in such cases.

As for failures, correct me if I am wrong, but if a function is
returning negative error codes, it should not be using a hard coded -1
as a generic error code. As I note, it is unlikely for the -EINVAL to be
delivered, but the code path is still available should something in the
future change the backing call logic.

After this change, the possible return values for
tpm_tis_request_locality() become:
1. 0 - 4: the locality that was successfully requested
2. -EBUSY: tpm busy, unable to request locality
3. -EINVAL: invalid parameter

With this more consistent interface, I updated the return value checks
at the call sites to check for negative error as the means to catch
failures.

v/r,
dps