Re: Lock up when faking MMIO read[bwl] on some machines [WAS: Faking MMIO ops? Fooling a driver]

From: Rafał Miłecki
Date: Sat Jun 18 2011 - 09:11:41 EST


On 18 June 2011 14:03, Pekka Paalanen <pq@xxxxxx> wrote:
> On Sat, 18 Jun 2011 13:26:14 +0200
> Rafał Miłecki <zajec5@xxxxxxxxx> wrote:
>
>> On 18 June 2011 12:57, Rafał Miłecki
>> <zajec5@xxxxxxxxx> wrote:
>> > Unmodified MMIO tracing works great on this machine, I've
>> > grabbed dumps 10-20 times without a lock up or anything.
>> >
>> > I'm using different drivers on both machines, because Macbook
>> > Pro 8,1 has a unique BCM4331 card that I cannot buy and that is
>> > not available with a PCI(e) slot. It uses some vendor-specific,
>> > PCIe-compatible slot. Simply commenting out "set_ins_reg_val"
>> > works fine on this Macbook, PHY reads are tracked correctly.
>> >
>> > As for differences in struct pt_regs... yeah, I think that
>> > happens. I'm using an x86 kernel, while on the Macbook we use
>> > x86_64 as it's required to use the 64-bit driver in ndiswrapper.
>> >
>> > I can try to find out which register we try to overwrite on
>> > the Macbook.
>>
>> This is what happens on my machine (working):
>> [  122.550991] mmiotrace: ZAJEC: read PHY 0x20
>> [  122.550994] mmiotrace: ZAJEC: overwriting 0x20 with 0xFFFF
>> [  122.550997] [ZAJEC] setting AX with 0xFFFF
>> (...)
>> [  122.551071] mmiotrace: ZAJEC: read PHY 0x22
>> [  122.551074] mmiotrace: ZAJEC: overwriting 0x22 with 0xFFFF
>> [  122.551077] [ZAJEC] setting AX with 0xFFFF
>> (...)
>> [  122.551198] mmiotrace: ZAJEC: read PHY 0x27
>> [  122.551201] mmiotrace: ZAJEC: overwriting 0x27 with 0xFFFF
>> [  122.551204] [ZAJEC] setting AX with 0xFFFF
>>
>>
>> This is what happens on the Macbook:
>> [  166.886438] mmiotrace: ZAJEC: read PHY 0x810
>> [  166.886649] mmiotrace: ZAJEC: overwriting 0x810 with 0xFFFF
>> [  166.886860] [ZAJEC] setting AX with 0xFFFF
>> LOCK UP
>>
>>
>> So on both machines we modify the AX register in the same place. My
>> function set_ins_reg_val is a copy of get_ins_reg_val, which works
>> fine... So I have no idea what we may be doing wrong on this x86_64
>> Macbook...
>
> Ok, so it is a 32 vs. 64 bit arch difference, or difference in
> driver binary. AX on 64-bit is actually RAX... well, depending
> on data width.
>
> I actually missed your patch attachment before, sorry.
>
> I have minor notes, but I cannot see them being a reason for a
> lockup:

Thanks for the answer!


> - instead of set_reg_w32(), you should be able to simply
> *get_reg_w32() = (unsigned long)value; or equivalent since
> it returns a pointer.

Agreed.
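
A minimal user-space sketch of that suggestion, for illustration only. The names struct fake_regs, fake_get_reg() and fake_set_reg() are hypothetical stand-ins, not the kernel's struct pt_regs or the real get_reg_w32(); the only thing taken from the discussion is that the getter returns a pointer to the saved register slot, so a setter can simply write through it:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the saved register frame (not struct pt_regs). */
struct fake_regs {
	unsigned long ax;
	unsigned long cx;
};

/* Hypothetical analogue of a getter that returns a pointer to the slot. */
static unsigned long *fake_get_reg(int no, struct fake_regs *regs)
{
	assert(regs != NULL);
	switch (no) {
	case 0:
		return &regs->ax;
	case 1:
		return &regs->cx;
	default:
		return NULL;	/* unknown register number */
	}
}

/* The setter needs no separate lookup table: it writes through the pointer. */
static int fake_set_reg(int no, unsigned long value, struct fake_regs *regs)
{
	unsigned long *slot = fake_get_reg(no, regs);

	if (!slot)
		return -1;
	*slot = value;
	return 0;
}
```

Note that this writes the whole unsigned long slot, so it does not yet take the access width into account.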


> - you are not checking the data access width, but you assume
> it is 32 bits. Maybe you should verify that? get_reg_w8() is
> very different. I think you should reproduce the switch on
> get_ins_reg_width() statement from get_ins_reg_val() in
> your set_ins_reg_val(), and use unsigned long instead of u32
> to account for 64-bitness.
>
> Yes, get_reg_w32() is a little badly named.
>
> Maybe the driver is doing a 16-bit wide access, and happens to
> store something else in the other 16/48 bits of RAX?

Nice comment, thanks! I've decided to print info about the planned
overwrite instead of really doing it, plus a few more debugging
messages suggested by David:
[ 293.682242] mmiotrace: ZAJEC: overwriting 0x810 with 0xFFFF
[ 293.687929] mmiotrace: [ZAJEC] ins at ffffc90010503b60: 0f b7 81 fe 03 00
[ 293.693651] mmiotrace: [ZAJEC] p = (unsigned char *)ins_addr; p == 0xf
[ 293.699379] mmiotrace: [ZAJEC] p += skip_prefix(p, &prf); p == 0xf
[ 293.705116] mmiotrace: [ZAJEC] p += get_opcode(p, &opcode); p == 0x81
[ 293.710815] mmiotrace: [ZAJEC] opcode is 0xb70f
[ 293.716411] mmiotrace: [ZAJEC] prf info: shorted:0; enlarged:0, rexr:0, rex:0
[ 293.722062] mmiotrace: [ZAJEC] after checing opcode we decided to use reg 0x0
[ 293.727698] mmiotrace: [ZAJEC] width is 4
[ 293.733219] [ZAJEC] (not) setting AX with 0xFFFF
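
The logged instruction bytes can also be decoded by hand. In the x86 encoding, the byte after the two-byte opcode 0f b7 (logged above as 0xb70f) is the ModRM byte, here 0x81. The sketch below splits it into its three fields; struct modrm and decode_modrm() are hypothetical helper names, not part of mmiotrace:

```c
#include <stdint.h>

/* The three fields of an x86 ModRM byte. */
struct modrm {
	unsigned int mod;	/* bits 7-6: addressing mode */
	unsigned int reg;	/* bits 5-3: register operand */
	unsigned int rm;	/* bits 2-0: register/memory operand */
};

/* Hypothetical helper: split a ModRM byte into its fields. */
static struct modrm decode_modrm(uint8_t byte)
{
	struct modrm m;

	m.mod = byte >> 6;
	m.reg = (byte >> 3) & 7;
	m.rm = byte & 7;
	return m;
}
```

For 0x81 this gives mod = 2 (memory operand with a 32-bit displacement), reg = 0 and rm = 1. The reg value matches the "reg 0x0" line in the log. In the Intel opcode map, 0F B7 is MOVZX with a 16-bit source operand, so it may be worth checking whether the memory access itself is 16 bits wide even though the destination register is 32 bits.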

So the read's width is 4, that's 32 bits. If this is a 64-bit arch, we
could be overwriting another 32 bits in the AX register, right? Can this
be our issue?
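
To make the width question concrete, here is a small sketch of how a faked value could be merged into a saved 64-bit register so that it mimics what the hardware does for each operand width. On x86_64, 8-bit and 16-bit register writes leave the upper bits untouched, while a 32-bit write zero-extends into the full 64-bit register. merge_reg() is a hypothetical helper, not existing mmiotrace code:

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the new 64-bit register content after
 * writing "value" with the given operand width (in bytes) over "old".
 */
static uint64_t merge_reg(uint64_t old, uint64_t value, unsigned int width)
{
	switch (width) {
	case 1:	/* low byte only, upper 56 bits preserved */
		return (old & ~0xFFULL) | (value & 0xFFULL);
	case 2:	/* low 16 bits only, upper 48 bits preserved */
		return (old & ~0xFFFFULL) | (value & 0xFFFFULL);
	case 4:	/* 32-bit write zero-extends to 64 bits */
		return value & 0xFFFFFFFFULL;
	case 8:	/* full 64-bit write */
		return value;
	default:	/* unknown width: leave the register alone */
		return old;
	}
}
```

With old = 0x1122334455667788 and value = 0xFFFF, a width of 2 yields 0x112233445566FFFF, while a width of 4 yields 0x000000000000FFFF. A setter that always stores a full unsigned long behaves like the width 8 case regardless of the actual operand width.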


> I assume the lockup is silent, since you have not shown
> anything. Have you tried a serial console, if you have one?

I think David has some USB debugging (whatever that means...). Using
console mode just printed the commands executed right before overwriting
the register.

--
Rafał