Re: [RESENT PATCH] x86/mem: fix the offset overflow when read/write mem

From: zhong jiang
Date: Wed May 03 2017 - 02:44:20 EST


On 2017/5/3 4:54, David Rientjes wrote:
> On Thu, 27 Apr 2017, zhongjiang wrote:
>
>> From: zhong jiang <zhongjiang@xxxxxxxxxx>
>>
>> Recently, I found the following issue, it will result in the panic.
>>
>> [ 168.739152] mmap1: Corrupted page table at address 7f3e6275a002
>> [ 168.745039] PGD 61f4a1067
>> [ 168.745040] PUD 61ab19067
>> [ 168.747730] PMD 61fb8b067
>> [ 168.750418] PTE 8000100000000225
>> [ 168.753109]
>> [ 168.757795] Bad pagetable: 000d [#1] SMP
>> [ 168.761696] Modules linked in: intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd iTCO_wdt glue_helper cryptd sg iTCO_vendor_support i7core_edac edac_core shpchp lpc_ich i2c_i801 pcspkr mfd_core acpi_cpufreq ip_tables xfs libcrc32c sd_mod igb ata_generic ptp pata_acpi pps_core mptsas ata_piix scsi_transport_sas i2c_algo_bit libata mptscsih i2c_core serio_raw crc32c_intel bnx2 mptbase dca dm_mirror dm_region_hash dm_log dm_mod
>> [ 168.805983] CPU: 15 PID: 10369 Comm: mmap1 Not tainted 4.11.0-rc2-327.28.3.53.x86_64+ #345
>> [ 168.814202] Hardware name: Huawei Technologies Co., Ltd. Tecal RH2285 /BC11BTSA , BIOS CTSAV036 04/27/2011
>> [ 168.825704] task: ffff8806207d5200 task.stack: ffffc9000c340000
>> [ 168.831592] RIP: 0033:0x7f3e622c5360
>> [ 168.835147] RSP: 002b:00007ffe2bb7a098 EFLAGS: 00010203
>> [ 168.840344] RAX: 00007ffe2bb7a0c0 RBX: 0000000000000000 RCX: 00007f3e6275a000
>> [ 168.847439] RDX: 00007f3e622c5360 RSI: 00007f3e6275a000 RDI: 00007ffe2bb7a0c0
>> [ 168.854535] RBP: 00007ffe2bb7a4e0 R08: 00007f3e621c3d58 R09: 000000000000002d
>> [ 168.861632] R10: 00007ffe2bb79e20 R11: 00007f3e622fbcb0 R12: 00000000004005d0
>> [ 168.868728] R13: 00007ffe2bb7a5c0 R14: 0000000000000000 R15: 0000000000000000
>> [ 168.875825] FS: 00007f3e62752740(0000) GS:ffff880627bc0000(0000) knlGS:0000000000000000
>> [ 168.883870] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 168.889583] CR2: 00007f3e6275a002 CR3: 0000000622845000 CR4: 00000000000006e0
>> [ 168.896680] RIP: 0x7f3e622c5360 RSP: 00007ffe2bb7a098
>> [ 168.901713] ---[ end trace ef98fa9f2a01cbc6 ]---
>> [ 168.90630 arch/x86/kernel/smp.c:127 native_smp_send_reschedule+0x3f/0x50
>> [ 168.935410] Modules linked in: intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd iTCO_wdt glue_helper cryptd sg iTCO_vendor_support i7core_edac edac_core shpchp lpc_ich i2c_i801 pcspkr mfd_core acpi_cpufreq ip_tables xfs libcrc32c sd_mod igb ata_generic ptp pata_acpi pps_core mptsas ata_piix scsi_transport_sas i2c_algo_bit libata mptscsih i2c_core serio_raw crc32c_intel bnx2 mptbase dca dm_mirror dm_region_hash dm_log dm_mod
>> [ 168.979686] CPU: 15 PID: 10369 Comm: mmap1 Tainted: G D 4.11.0-rc2-327.28.3.53.x86_64+ #345
>> [ 168.989114] Hardware name: Huawei Technologies Co., Ltd. Tecal RH2285 /BC11BTSA , BIOS CTSAV036 04/27/2011
>> [ 169.000616] Call Trace:
>> [ 169.003049] <IRQ>
>> [ 169.005050] dump_stack+0x63/0x84
>> [ 169.008348] __warn+0xd1/0xf0
>> [ 169.011297] warn_slowpath_null+0x1d/0x20
>> [ 169.015282] native_smp_send_reschedule+0x3f/0x50
>> [ 169.019961] resched_curr+0xa1/0xc0
>> [ 169.023428] check_preempt_curr+0x70/0x90
>> [ 169.027415] ttwu_do_wakeup+0x1e/0x160
>> [ 169.031142] ttwu_do_activate+0x77/0x80
>> [ 169.034956] try_to_wake_up+0x1c3/0x430
>> [ 169.038771] default_wake_function+0x12/0x20
>> [ 169.043019] __wake_up_common+0x55/0x90
>> [ 169.046833] __wake_up_locked+0x13/0x20
>> [ 169.050649] ep_poll_callback+0xbb/0x240
>> [ 169.054550] __wake_up_common+0x55/0x90
>> [ 169.058363] __wake_up+0x39/0x50
>> [ 169.061574] wake_up_klogd_work_func+0x40/0x60
>> [ 169.065994] irq_work_run_list+0x4d/0x70
>> [ 169.069895] irq_work_tick+0x40/0x50
>> [ 169.073452] update_process_times+0x42/0x60
>> [ 169.077612] tick_periodic+0x2b/0x80
>> [ 169.081166] tick_handle_periodic+0x25/0x70
>> [ 169.085326] local_apic_timer_interrupt+0x35/0x60
>> [ 169.090004] smp_apic_timer_interrupt+0x38/0x50
>> [ 169.094507] apic_timer_interrupt+0x93/0xa0
>> [ 169.098667] RIP: 0010:panic+0x1f5/0x239
>> [ 169.102480] RSP: 0000:ffffc9000c343dd8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
>> [ 169.110010] RAX: 0000000000000034 RBX: 0000000000000000 RCX: 0000000000000006
>> [ 169.117106] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff880627bcdfe0
>> [ 169.124201] RBP: ffffc9000c343e48 R08: 00000000fffffffe R09: 0000000000000395
>> [ 169.131298] R10: 0000000000000005 R11: 0000000000000394 R12: ffffffff81a0c475
>> [ 169.138395] R13: 0000000000000000 R14: 0000000000000000 R15: 000000000000000d
>> [ 169.145491] </IRQ>
>> [ 169.147578] ? panic+0x1f1/0x239
>> [ 169.150789] oops_end+0xb8/0xd0
>> [ 169.153910] pgtable_bad+0x8a/0x95
>> [ 169.157294] __do_page_fault+0x3aa/0x4a0
>> [ 169.161194] do_page_fault+0x30/0x80
>> [ 169.164750] ? do_syscall_64+0x175/0x180
>> [ 169.168649] page_fault+0x28/0x30
>>
>> the following case can reproduce the issue.
>>
>> int mem_fd = 0;
>> char rw_buf[1024];
>> unsigned char * map_base_s;
>> unsigned long show_addr = 0x100000000000;
>> unsigned long show_len = 0x10;
>>
>> if(argc !=2 )
>> {
>> printf( "%s show_addr\n", argv[0] );
>> return 0;
>> }
>> else
>> {
>> char *stop;
>> show_addr = strtoul( argv[1], &stop, 0 );
>> printf("show_addr= 0x%lu\n", show_addr );
>> }
>>
>> mem_fd = open(DEV_NAME, O_RDONLY);
>> if (mem_fd == -1)
>> {
>> printf("open %s failed.", DEV_NAME);
>> return 0;
>> }
>>
>> map_base_s = mmap(NULL, show_len, PROT_READ, MAP_SHARED, mem_fd, show_addr);
>> if ((long)map_base_s == -1)
>> {
>> printf("input address map to user space fail!\n");
>> return 0;
>> }
>> else
>> {
>> printf("mmap successfull!\n");
>> }
>>
>> memcpy( rw_buf, map_base_s, show_len );
>>
>> The pgoff is enough large, it exceed the size of the real memory.
>> and the mmap can return the success.
>>
>> I fix it by checking the conditions. it can make it suitable for
>> the mapped and use.
>>
>> Signed-off-by: zhong jiang <zhongjiang@xxxxxxxxxx>
>> ---
>> drivers/char/mem.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/char/mem.c b/drivers/char/mem.c
>> index 7e4a9d1..3a765e02 100644
>> --- a/drivers/char/mem.c
>> +++ b/drivers/char/mem.c
>> @@ -55,7 +55,7 @@ static inline int valid_phys_addr_range(phys_addr_t addr, size_t count)
>>
>> static inline int valid_mmap_phys_addr_range(unsigned long pfn, size_t size)
>> {
>> - return 1;
>> + return (pfn << PAGE_SHIFT) + size <= __pa(high_memory);
>> }
>> #endif
>>
> I suppose you are correct that there should be some sanity checking on the
> size used for the mmap(). Are you using the default size in your test
> program or are you passing it a different size to induce the panic?
Just use the default size.
> There's also the issue of architectures that override this function, such
> as on sh that simply returns 1, so I think that would need to be addressed
> as well.
yes
> Let's add a few people to the cc.
>
> .
I get less feedback so far, but It indeed will induce the panic.

Thanks
zhongjiang
>