Re: [PATCH v3 2/2] hexdump: add a new dump prefix DUMP_PREFIX_CUSTOM

From: Leizhen (ThunderTown)
Date: Wed Aug 16 2023 - 07:25:49 EST




On 2023/8/16 17:04, Petr Mladek wrote:
> On Wed 2023-08-16 11:20:32, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2023/8/15 22:30, Petr Mladek wrote:
>>> Added Kees and hardening mailing list into Cc.
>>>
>>> On Fri 2023-08-11 15:49:21, thunder.leizhen@xxxxxxxxxxxxxxx wrote:
>>>> From: Zhen Lei <thunder.leizhen@xxxxxxxxxx>
>>>>
>>>> Currently, function print_hex_dump() supports three dump prefixes:
>>>> DUMP_PREFIX_NONE, DUMP_PREFIX_ADDRESS and DUMP_PREFIX_OFFSET. But for some
>>>> usage scenarios, they don't work perfectly. For example, dump the content
>>>> of one task's stack. In order to quickly identify a stack frame,
>>>> DUMP_PREFIX_ADDRESS is preferred. But without boot option no_hash_pointers
>>>> , DUMP_PREFIX_ADDRESS just print the 32-bit hash value.
>>>>
>>>> dump memory at sp=ffff800080903aa0:
>>>> 00000000a00a1d32: 80903ac0 ffff8000 8feeae24 ffffc356
>>>> 000000007993ef27: 9811c000 ffff0d98 8ad2e500 ffff0d98
>>>> 00000000b1a0b2de: 80903b30 ffff8000 8ff3a618 ffffc356
>>>> ... ...
>>>> 00000000a7a9048b: 9810b3c0 ffff0d98 00000000 00000000
>>>> 0000000011cda415: 80903cb0 ffff8000 00000000 00000000
>>>> 000000002dbdf9cd: 981f8400 ffff0d98 00000001 00000000
>>>>
>>>> On the other hand, printing multiple 64-bit addresses is redundant when
>>>> the 'sp' value is already printed. Generally, we do not dump more than
>>>> 64 KiB memory. It is sufficient to print only the lower 16 bits of the
>>>> address.
>>>>
>>>> dump memory at sp=ffff800080883a90:
>>>> 3a90: 80883ac0 ffff8000 3d8e936c ffffbd5b
>>>> 3aa0: 5833f000 ffff3580 00000001 00000000
>>>> 3ab0: 40299840 ffff3580 590dfa00 ffff3580
>>>> 3ac0: 80883b30 ffff8000 3d938b28 ffffbd5b
>>>> 3ad0: 40877180 ffff3580 590dfa00 ffff3580
>>>> 3ae0: 4090f600 ffff3580 80883cb0 ffff8000
>>>> 3af0: 00000010 00000000 00000000 00000000
>>>> 3b00: 4090f700 ffff3580 00000001 00000000
>>>>
>>>> Let's add DUMP_PREFIX_CUSTOM, allows users to make some adjustments to
>>>> their needs.
>>>>
>>>> For example:
>>>> pr_info("dump memory at sp=%px:\n", sp);
>>>> print_hex_dump(KERN_INFO, "%s%16hx: %s\n",
>>>> DUMP_PREFIX_CUSTOM, 16, 1, sp, 16, false);
>>>> print_hex_dump(KERN_INFO, "%s%16x: %s\n",
>>>> DUMP_PREFIX_CUSTOM, 16, 1, sp, 16, false);
>>>> print_hex_dump(KERN_INFO, "%s%px: %s\n",
>>>> DUMP_PREFIX_CUSTOM, 16, 1, sp, 16, false);
>>>
>>> IMHO, this is pretty bad interface.
>>>
>>> + From the user POV:
>>>
>>> It is far from clear what values will be passed for the given
>>> printf format. It can be docummented but...
>>>
>>>
>>> + From the security POV:
>>>
>>> The compiler could not check if the printk() parameters
>>> match the format. I mean if the number of types of
>>> the parameters are correct.
>>
>> Yes, it has these problems. So, back to v2, how about add DUMP_PREFIX_ADDRESS_LOW16?
>> Or named DUMP_PREFIX_ADDR16 or others. Or change the format of DUMP_PREFIX_ADDRESS
>> from "%p" to "%px",Or add DUMP_PREFIX_RAWADDR. Or keep the status quo.
>
> I would personally keep the status quo.
>
> First, raw address should not be printed unless no_hash_pointers is
> used. Otherwise, it would be a security hole. Yes, printing the last
> 16b is less risky. But it is still part of the puzzle. And for
> debugging, people want/need to set "no_hash_pointers" anyway.
>
> Second, IMHO, printing only the last 16b of the address is not much
> useful for debugging. It can't be compared easily against other
> addresses.
>
> Third, 00000000a00a1d32 can be interpreted more easily than 1d32.
> I mean, it is much more obvious that it is an "hashed" address.
> Of course, this is less important for people who analyze hex
> dumps on daily basis.
>
> That said, the above would be valid when DUMP_PREFIX_ADDRESS_LOW16
> is used in plain kernel sources. I understand that it might be
> useful for temporary added debug messages. I am not sure what
> was the use case which motivated this patch.

https://www.spinics.net/lists/rcu/msg13545.html
Printing the address will save us some time. Of course, if we don't print the address,
we can manually calculate it again according to the calculation method in the code.

>
>> Also, do you have any comments on patch 1/2?
>
> I do not have strong opinion.
>
> The auto adjusting of the width makes it easier to read for humans.
> But it might complicate some post-processing using a script.

The script should split the data based on the colon and then use a function like
strtoul() to convert it. It seems that we do not have to worry too much about
the script.

>
> Best Regards,
> Petr
> .
>

--
Regards,
Zhen Lei