Re: [PATCH 5/6] ide: remove ide_execute_pkt_cmd()

From: Borislav Petkov
Date: Sun Feb 15 2009 - 12:39:21 EST


Hi,

> I still don't undestand why you assume that such variable will be
> alloceted on stack -- gcc has 3 registers available for local variables
> (which doesn't have to save across function calls). However, the
> register variables have to take stack space indeed as they need to be
> saved on funciton entry... though I'm not sure that gcc will necessary
> put such variable in one of those 3 registers if it figures out that
> there are no function calls going to happen during its life time.
>
>> and the code is more readable. A win-win situation, I'd say :).
>>
>
> You haven't presented the code which gets generated when the local
> variable is used, so it's impossible to compare.

Here's another example from ide-disk.c where you have stack variables cashing
those flags checks:

<ide-disk.c>
static ide_startstop_t __ide_do_rw_disk(ide_drive_t *drive, struct request *rq,
sector_t block)
{
ide_hwif_t *hwif = drive->hwif;
u16 nsectors = (u16)rq->nr_sectors;
u8 lba48 = !!(drive->dev_flags & IDE_DFLAG_LBA48);
u8 dma = !!(drive->dev_flags & IDE_DFLAG_USING_DMA);
ide_task_t task;
struct ide_taskfile *tf = &task.tf;
ide_startstop_t rc;

if ((hwif->host_flags & IDE_HFLAG_NO_LBA48_DMA) && lba48 && dma) {
if (block + rq->nr_sectors > 1ULL << 28)
dma = 0;
else
lba48 = 0;
}
</ide-disk.c>

Corresponding asm (this time i386 but I don't think it matters since we
need at least one arch to prove my point).

<ide-disk.s>
.LVL48:
.loc 1 94 0
movl %eax, %edx # D.32119, tmp93
.loc 1 95 0
shrl %eax # D.32119
andb $1, %al #,
movb %al, -58(%ebp) #, dma
.LVL49:
.loc 1 100 0
movl -52(%ebp), %eax # hwif,
.loc 1 94 0
shrl $21, %edx #, tmp93
andb $1, %dl #,
movb %dl, -57(%ebp) #, lba48
.LVL50:
.loc 1 100 0
testb $4, 90(%eax) #, <variable>.host_flags
je .L37 #,
testb %dl, %dl #
je .L37 #,
cmpb $0, -58(%ebp) # dma
movb $1, -57(%ebp) #, lba48
.LVL51:
</ide-disk.s>

Now look at the last lines at labels .LVL48 and .LVL49 - they both save
those 1-byte u8's called dma and lba48 on the stack at -57(%ebp) and
-58(%ebp), respectively. And guess what, later on label LVL50 they get
accessed in the check. And several times more later, which in most sane
architectures still means cache accesses but when you have registers its
even faster :).

--
Regards/Gruss,
Boris.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/