Re: First kernel patch (optimization)

From: Jaime Arrocha
Date: Wed Sep 16 2015 - 22:05:28 EST



On 09/16/2015 07:56 AM, David Laight wrote:
From: Austin S Hemmelgarn
Sent: 16 September 2015 12:46
On 2015-09-15 20:09, Steve Calfee wrote:
On Tue, Sep 15, 2015 at 12:53 PM, Eric Curtin <ericcurtin17@xxxxxxxxx> wrote:
Signed-off-by: Eric Curtin <ericcurtin17@xxxxxxxxx>

diff --git a/tools/usb/usbip/src/usbip_detach.c b/tools/usb/usbip/src/usbip_detach.c
index 05c6d15..9db9d21 100644
--- a/tools/usb/usbip/src/usbip_detach.c
+++ b/tools/usb/usbip/src/usbip_detach.c
@@ -47,7 +47,9 @@ static int detach_port(char *port)
 	uint8_t portnum;
 	char path[PATH_MAX+1];
 
-	for (unsigned int i = 0; i < strlen(port); i++)
+	unsigned int port_len = strlen(port);
+
+	for (unsigned int i = 0; i < port_len; i++)
 		if (!isdigit(port[i])) {
 			err("invalid port %s", port);
 			return -1;

--
Hi Eric,

This is fine, but what kind of wimpy compiler optimizer will not move
the constant initializer out of the loop? I bet if you compare binary
sizes/code it will be exactly the same, and you added some characters
of code. Reorganizing code for readability is fine, but for compiler
(in)efficiency seems like a bad idea.

While I agree with your argument, I would like to point out that it is a
well established fact that GCC's optimizers are kind of brain-dead at
times and need their hands held.

I'd be willing to bet that the code will be marginally larger (because
of adding another variable), but might run slightly faster too (because
in my experience, GCC doesn't always catch things like this), and should
compile a little faster (because the optimizers don't have to do as much
work).

The compiler probably can't optimise the strlen().
If isdigit() is a real function (the locale-specific one probably is)
then the compiler cannot assume that port[n] isn't changed by the call
to isdigit().

A simpler change would be:
for (unsigned int i = 0; port[i] != 0; i++)

Much better would be to use strtoul() instead of atoi().

David

I took some time to verify this. GCC does make this optimization at -O2, at least with gcc 4.7.2.
One interesting observation: at -O0 and -O2 it emits a call to strlen(), while at -O1 it computes
the length of the string inline using:

repnz scas %es:(%rdi),%al
not %rcx
sub $0x2,%rcx

Why does it do that? Is the code above faster? If yes, why not do it in O2 too?
Is this still a topic for this forum?
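
For anyone who wants to follow the arithmetic of that sequence, here is a rough C model of it. It is only a sketch: it ASSUMES %al is 0 and %rcx is -1 on entry, which is the usual setup for an inline strlen but is not shown in the disassembly above, and scas_sequence_model() is a made-up name. Under those assumptions the three instructions leave strlen(s) - 1 in %rcx; my guess is that the extra -1 was folded in from the surrounding loop code, but I have not confirmed that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of "repnz scas; not %rcx; sub $0x2,%rcx".
 * ASSUMPTION: %al == 0 and %rcx == -1 before the sequence runs. */
uint64_t scas_sequence_model(const char *s)
{
	uint64_t rcx = UINT64_MAX;	/* assumed initial counter, i.e. -1 */

	assert(s != NULL);

	/* repnz scasb: compare each byte with %al (0), decrementing %rcx
	 * once per byte examined, and stop after the byte that matches. */
	do {
		rcx--;
	} while (*s++ != '\0');

	rcx = ~rcx;	/* not %rcx: now strlen(s) + 1 */
	rcx -= 2;	/* sub $0x2,%rcx: now strlen(s) - 1 */
	return rcx;
}
```

For a three-character string the scan examines four bytes, so %rcx goes from -1 to -5, the NOT turns that into 4, and the subtraction leaves 2. This says nothing about whether the inline form is faster than calling strlen().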


Compiler: gcc version 4.7.2 (Debian 4.7.2-5)
Test code:

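#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void conv_input(char *port)
{
	int portnum;

	/* strlen() is deliberately left in the loop condition: the test is
	 * whether GCC hoists it out of the loop. */
	for (int i = 0; i < strlen(port); i++)
		if (!isdigit(port[i])) {
			printf("invalid port %s", port);
			exit(1);
		}

	portnum = atoi(port);
	printf("Port number: %d\n", portnum);
}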

Optimization done?

         -O0   -O1   -O2
x86      No    No    Yes
amd64    No    No    Yes
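
As for the strtoul() suggestion earlier in the thread, something along these lines would drop both the digit loop and atoi(). This is only a sketch, not the usbip code: parse_port() and its max parameter are names I made up, and I am assuming the limit would be UINT8_MAX because portnum is a uint8_t in the patch context.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of usbip: parse a decimal port number.
 * Returns 0 and stores the value in *out on success, -1 on any error. */
int parse_port(const char *s, unsigned long max, unsigned long *out)
{
	char *end;
	unsigned long val;

	assert(s != NULL && out != NULL);

	/* strtoul() skips leading whitespace and accepts a sign, so reject
	 * anything that does not start with a digit (including ""). */
	if (s[0] < '0' || s[0] > '9')
		return -1;

	errno = 0;
	val = strtoul(s, &end, 10);

	/* Reject overflow, trailing non-digit characters, and values that
	 * do not fit the caller's range. */
	if (errno == ERANGE || *end != '\0' || val > max)
		return -1;

	*out = val;
	return 0;
}
```

A caller would do something like parse_port(port, UINT8_MAX, &val) and print the "invalid port" error on a non-zero return. The string is walked once, and out-of-range input is reported instead of being silently truncated as with atoi().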
