Re: [PATCH v1] scripts: leaking_addresses.pl: add support for 32-bit kernel addresses

From: Tobin C. Harding
Date: Tue Nov 21 2017 - 18:59:31 EST


You don't typically need [xxx v1] for version 1, the v1 is implicit.

Please use the git brief description prefix that is already in use i.e

leaking_addresses: add support for 32-bit kernel addresses

On Tue, Nov 21, 2017 at 01:28:14PM +0530, kaiwan.billimoria@xxxxxxxxx wrote:
> The current leaking_addresses.pl script only supports showing "leaked"
> 64-bit kernel virtual addresses. This patch adds support for showing
> "leaked" 32-bit kernel virtual addresses.

Thanks for the patch. I like your ideas. Here are a few nitpicks to help
us get your work merged.

> The way it currently works- once it detects we're running on an i'x'86 platform
> (where x=3|4|5|6), it takes this arch into account for checking: the essential
> rationale:
> virt-addr >= PAGE_OFFSET => it's a kernel virtual address.
>
> Note-
> 1. It's a work in progress; some pending TODOs:

We don't have 'work in progress' in the kernel, everything is a work in
progress. We just have RFC or PATCH.

> - support for ARM-32

Sure, we can do this later.

> - programatically query and set the PAGE_OFFSET based on arch (it's currently
> hard-coded)

Let's do this straight away, it will be much nicer.

> 2. Minor edit:
> the '--raw', '--suppress-dmesg', '--squash-by-path' and
> '--squash-by-filename' option switches are only meaningful
> when the '----input-raw=' option is used. So, indent the 'Help' screen lines
> to reflect the fact.

This is a different change to the architecture stuff so should be in a
separate patch. You could do both as a series if you like. Off the top
of my head I have never seen options output like this, but if you have,
I'm willing to accept your view. You are correct that the options
mentioned are only use in conjuncture with '--input-raw' so some way of
showing this would be nice.

> Feedback welcome..

Here it comes ;)

> Signed-off-by: Kaiwan N Billimoria <kaiwan.billimoria@xxxxxxxxx>
> ---
> diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl
> index bc5788000018..e139de445ad1 100755
> --- a/scripts/leaking_addresses.pl
> +++ b/scripts/leaking_addresses.pl
> @@ -12,7 +12,10 @@
> #
> # You may like to set kptr_restrict=2 before running script
> # (see Documentation/sysctl/kernel.txt).
> -
> +#
> +# 32-bit kernel address support : Kaiwan N Billimoria
> +# <kaiwan.billimoria@xxxxxxxxx>

Cool. Can you put this up the top near the other copyright information
please.

Perhaps

# (c) 2017 Kaiwan N Billimoria <kaiwan.billimoria@xxxxxxxxx> (ix86 support)

> use warnings;
> use strict;
> use POSIX;
> @@ -35,7 +38,7 @@ my $TIMEOUT = 10;
> # Script can only grep for kernel addresses on the following architectures. If
> # your architecture is not listed here and has a grep'able kernel address please
> # consider submitting a patch.
> -my @SUPPORTED_ARCHITECTURES = ('x86_64', 'ppc64');
> +my @SUPPORTED_ARCHITECTURES = ('x86_64', 'ppc64', 'i[3456]86');
>
> # Command line options.
> my $help = 0;
> @@ -48,6 +51,9 @@ my $suppress_dmesg = 0; # Don't show dmesg in output.
> my $squash_by_path = 0; # Summary report grouped by absolute path.
> my $squash_by_filename = 0; # Summary report grouped by filename.
>
> +my $bit_size = 64; # Check 64-bit kernel addresses by default

This global is unnecessary. You already have is_ix86_32() so you can just
use that.

> +my $PAGE_OFFSET_32BIT = 0xc0000000;

As commented above, can we do this programmatically like previously discussed?

> # Do not parse these files (absolute path).
> my @skip_parse_files_abs = ('/proc/kmsg',
> '/proc/kcore',
> @@ -97,10 +103,10 @@ Options:
>
> -o, --output-raw=<file> Save results for future processing.
> -i, --input-raw=<file> Read results from file instead of scanning.
> - --raw Show raw results (default).
> - --suppress-dmesg Do not show dmesg results.
> - --squash-by-path Show one result per unique path.
> - --squash-by-filename Show one result per unique filename.
> + --raw Show raw results (default).
> + --suppress-dmesg Do not show dmesg results.
> + --squash-by-path Show one result per unique path.
> + --squash-by-filename Show one result per unique filename.
> -d, --debug Display debugging output.
> -h, --help, --version Display this help and exit.

Commented on above.

> @@ -177,7 +183,7 @@ sub dprint
>
> sub is_supported_architecture
> {
> - return (is_x86_64() or is_ppc64());
> + return (is_x86_64() or is_ppc64() or is_ix86_32());
> }
>
> sub is_x86_64
> @@ -185,6 +191,7 @@ sub is_x86_64
> my $archname = $Config{archname};
>
> if ($archname =~ m/x86_64/) {
> + $bit_size=64;

Setting a global opaquely within an is_a_ function (or subroutine)
violates the principle of least surprise. No one would expect an is_a
function to do this. Either way, as commented above, we don't need this
global anyways.

> return 1;
> }
> return 0;
> @@ -195,6 +202,19 @@ sub is_ppc64
> my $archname = $Config{archname};
>
> if ($archname =~ m/powerpc/ and $archname =~ m/64/) {
> + $bit_size=64;

As above.

> + return 1;
> + }
> + return 0;
> +}
> +
> +# 32-bit x86: is_i'x'86_32() ; where x is [3 or 4 or 5 or 6]

Perhaps

# true for Intel x86 32 bit architectures, x386 and above.

> +sub is_ix86_32
> +{
> + my $archname = $Config{archname};
> +
> + if ($archname =~ m/i[3456]86-linux/) {
> + $bit_size=32;

Please use kernel coding style

$bit_size = 32;

> return 1;
> }
> return 0;
> @@ -215,6 +235,15 @@ sub is_false_positive
> $match =~ '\bf{10}601000\b') {
> return 1;
> }
> + } elsif ($bit_size == 32) {

} elsif (is_ix86_32())

Bonus points, you uncovered a bug in the current script `if (is_x86_64)`
was missing the parenthesis!

# leaking_addresses.pl: Scan 64 bit kernel for potential leaking addresses.
@@ -209,7 +211,7 @@ sub is_false_positive
return 1;
}

- if (is_x86_64) {
+ if (is_x86_64()) {


Thanks. I'm guessing this is what led you to using the global :) Have
patched dev branch (branch name: leaks).

> + my $addr32 = eval hex($match);
> + if ($addr32 < $PAGE_OFFSET_32BIT ) {
> + return 1;
> + }

Nice.

> + if ($match =~ '\b(0x)?(f|F){8}\b') {
> + return 1;
> + }

So, may_leak_address() and is_false_positive() are tightly coupled and
not really that nice. Once we add 32 bit support it gets worse. Going
forwards, we can either add your 32 bit work then refactor both
functions or you can refactor them as you add the 32 bit stuff. I'm open
to either. Some things to note

- The mask stuff (all 1's) should have an all 0's regex also.
- The mask stuff should probably be closer to the mask stuff for 64
bit. It's not immediately apparent a clean way to do this though.
- It's not immediately apparent if an address less that PAGE_OFFSET is a
false positive or should be caught in leaks_address().
- Do we need 32 bit equivalents for

if ($line =~ '\bKEY=[[:xdigit:]]{14} [[:xdigit:]]{16} [[:xdigit:]]{16}\b' or
$line =~ '\b[[:xdigit:]]{14} [[:xdigit:]]{16} [[:xdigit:]]{16}\b') {

Even though this is a false positive it needs to be where it is (note
the 'tightly coupled' statement above).

> }
>
> return 0;
> @@ -243,6 +272,8 @@ sub may_leak_address
> $address_re = '\b(0x)?ffff[[:xdigit:]]{12}\b';
> } elsif (is_ppc64()) {
> $address_re = '\b(0x)?[89abcdef]00[[:xdigit:]]{13}\b';
> + } elsif (is_ix86_32()) {
> + $address_re = '\b(0x)?[[:xdigit:]]{8}\b';
> }
>
> while (/($address_re)/g) {

Your patch did not apply, the problem looks to be in the code section
above. You can see that there is no removed line. For next spin please
check your patch applies on top of the 'leaks' branch (which now
includes the fix for the bug you found).

I have one more comment that should have been at the top but I did not
want to confuse things. Typically, the git brief description should be
limited to 50 characters. If you do decide to split this patch into two
and use the prefix suggested you may like to change the git brief
description but don't feel you have to. If you do decide to do this, your
next patch set will be a version 1 again. I may be wrong but I never
increment a patch version if the subject line changes (excluding
contents of [] ).

I'm relatively new here so please feel free to ignore anything I've
said. I have commented in the effort to help you learn as I am doing so
also.

I'll look forward to v2. Please ask if anything I've said is unclear.


thanks,
Tobin.