Re: [PATCH v2 0/2] hexdump: Allow skipping identical lines
From: Andy Shevchenko
Date: Mon Jan 13 2025 - 07:40:22 EST
On Fri, Jan 10, 2025 at 07:42:03PM +0100, Miquel Raynal wrote:
> While working on NAND issues, I used print_hex_dump() a lot to compare
> data. But I am mostly working on embedded systems where the kernel
> messages go through a serial console. Sometimes network support is an
> option, sometimes not. Anyway, I often print buffers both in kernel
> space and user space to compare them, and they may be full of 0's or
> 1's, which means lines are repeated a lot in the output and this is slow
> *and* hard to compare.
>
> I initially hacked into lib/hexdump.c for my own purpose and just
> discarded all the other users, but it felt like this might be a useful
> feature for others and decided to make it a public patch.
>
> * First patch changes the "ascii" parameter into a "flags" variable now
> accepting the value: DUMP_FLAG_ASCII.
> * Second patch adds a new flag to skip the identical lines, because this
> must be an opt-in parameter, I guess.
>
> The patch series has successfully gone through a round of
> kernel-test-robot.
>
> The Cc-list, as provided by get_maintainer.pl, was returning 330
> e-mail addresses which felt too much, so I ran the script only on the
> second patch (the printk/includes/debug/Doc changes). It gave this
> Cc-list which sounds more reasonable. Hopefully this is a smart move,
> otherwise let me know what you think would be best.
...
> 000007e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
> And with the new flag added the code looks like this:
>
> print_hex_dump_debug("", DUMP_PREFIX_OFFSET, 32, 1, spinand->databuf, mtd->writesize,
> - 0);
> + DUMP_FLAG_SKIP_IDENTICAL_LINES);
>
> And the output is easier to parse and also faster to show on a serial
> console:
>
> 00000000: 55 42 49 23 01 00 00 00 00 00 00 00 00 00 00 01 00 00 08 00 00 00 10 00 2b 10 f1 92 00 00 00 00
> 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 db 93 e9 fc
> 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> *
I see a disadvantage in the output here: there is no visibility of how many
identical lines (bytes) were actually dumped. And IIRC, hexdump(1) prints the
last line no matter what.
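
To illustrate, here is a minimal userspace sketch of the behaviour I have in
mind. It is not the code from the patch and not the kernel's print_hex_dump();
the helper name dump_skip_identical() and its parameters are made up for this
example. It collapses each run of repeated rows into a single "*" and, like
hexdump(1), always ends with a line carrying the final offset, so the reader
can work out how many bytes were skipped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace helper (illustration only, not a kernel API).
 * Dumps buf as hex rows of rowsize bytes, replaces each run of rows that
 * are identical to the previous row with one "*", and always prints a
 * trailing line with the final offset. Returns the number of lines written.
 */
static size_t dump_skip_identical(FILE *out, const unsigned char *buf,
				  size_t len, size_t rowsize)
{
	size_t lines = 0;
	int skipping = 0;

	assert(rowsize > 0);

	for (size_t off = 0; off < len; off += rowsize) {
		size_t n = len - off < rowsize ? len - off : rowsize;

		/* Only a full row equal to the previous row is collapsed. */
		if (off > 0 && n == rowsize &&
		    memcmp(buf + off, buf + off - rowsize, rowsize) == 0) {
			if (!skipping) {
				fputs("*\n", out);
				lines++;
				skipping = 1;
			}
			continue;
		}
		skipping = 0;

		fprintf(out, "%08zx:", off);
		for (size_t i = 0; i < n; i++)
			fprintf(out, " %02x", buf[off + i]);
		fputc('\n', out);
		lines++;
	}

	/* Trailing offset line: tells the reader where the dump ends. */
	fprintf(out, "%08zx\n", len);
	return lines + 1;
}
```

With a row size of 32 and a buffer whose tail is all zeroes, this prints the
first all-zero row, a single "*", and then the total length as the last line,
which makes the amount of skipped data visible.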
--
With Best Regards,
Andy Shevchenko