Re: [PATCH] block: Fix validation of ioprio level

From: Damien Le Moal
Date: Mon Sep 09 2024 - 06:20:43 EST


On 9/9/24 17:59, Ajay Kaher wrote:
> On Wed, Aug 28, 2024 at 2:15 PM Damien Le Moal <dlemoal@xxxxxxxxxx> wrote:
>>
>> On 8/28/24 17:28, Ajay Kaher wrote:
>>> The commit eca2040972b4 introduced a backward compatibility issue in
>>> the function ioprio_check_cap.
>>>
>>> Before the change, if ioprio contains a level greater than 0x7, it was
>>> treated as -EINVAL:
>>>
>>> data = ioprio & 0x1FFF
>>> if data >= 0x7, return -EINVAL
>>>
>>> Since the change, if ioprio contains a level greater than 0x7 say 0x8
>>> it is calculated as 0x0:
>>>
>>> level = ioprio & 0x7
>>>
>>> To maintain backward compatibility the kernel should return -EINVAL in
>>> the above case as well.
>>>
>>> Fixes: eca2040972b4 ("scsi: block: ioprio: Clean up interface definition")
>>> Signed-off-by: Ajay Kaher <ajay.kaher@xxxxxxxxxxxx>
>>> ---
>>> block/ioprio.c | 11 ++++++++++-
>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/ioprio.c b/block/ioprio.c
>>> index 73301a2..f08e76b 100644
>>> --- a/block/ioprio.c
>>> +++ b/block/ioprio.c
>>> @@ -30,6 +30,15 @@
>>> #include <linux/security.h>
>>> #include <linux/pid_namespace.h>
>>>
>>> +static inline int ioprio_check_level(int ioprio, int max_level)
>>> +{
>>> + int data = IOPRIO_PRIO_DATA(ioprio);
>>> +
>>> + if (IOPRIO_BAD_VALUE(data, max_level))
>>> + return -EINVAL;
>>
>> No, this cannot possibly work correctly because the prio level part of the prio
>> data is only 3 bits, so 0 to 7. The remaining 10 bits of the prio data are used
>> for priority hints (IOPRIO_HINT_XXX).
>>
>> Your change will thus return an error for cases where the prio data has a level
>> AND also a hint (e.g. for command duration limits). This change would break
>> command duration limits. So NACK.
>>
>> The userspace header file has the ioprio_value() that a user should use to
>> construct an ioprio. Bad values are checked in that function and errors will be
>> returned if an invalid level is passed.
>>
>
> OK. Thanks for the detailed explanation.
>
> I agree, to use unused bits, functionality (return value in this case)
> will be changed. If applications are built using Kernel headers of
> v6.1 (doesn't include eca2040972b4) and later only upgrading Kernel to
> v6.6, because of the changes in return values applications may have
> some sort of regression.

Which would mean that the application was not first fixed to handle the error
return, and so it would still being doing the wrong thing on 6.6 as well as on
6.1. So I do not see any regression here...

> To make the software backward compatible I believe, unused bits should
> always be ignored. So that if in future someone uses it, it should not
> change the behaviour (return values) of existing software.

There are no unused bits anymore since kernel 6.5: hints = 10 bits, level = 3
bits, and class = 3 bits, for a total of 16-bits for the entire ioprio.

So if an application attempt to set an invalid level that uses more than 3-bits,
the extra bits will be ignored in the kernel and such bad level value can be
caught only if the application uses the ioprio_value() helper defined in
include/uapi/linux/ioprio.h.

And note that before 6.5, checking the prio level was also far from perfect: if
the user tried to set the prio level to a value using more than 13-bits, the
extra bits would be ignored too and the bad value would NOT necessarily cause an
error (e.g. using 1 << 13 = 8192 as the level would result in the kernel seeing
level = 0 and not throw an error). The introduction of ioprio_value() allows
catching these issues, if the application use it.

--
Damien Le Moal
Western Digital Research