Re: [2/2] fs, elf: drop MAP_FIXED usage from elf_map
From: Michal Hocko
Date: Mon Dec 18 2017 - 04:14:22 EST
On Fri 15-12-17 16:49:28, Andrei Vagin wrote:
> Hi Michal,
>
> We run CRIU tests for linux-next and the 4.15.0-rc3-next-20171215 kernel
> doesn't boot:
>
> [ 3.492549] Freeing unused kernel memory: 1640K
> [ 3.494547] Write protecting the kernel read-only data: 18432k
> [ 3.498781] Freeing unused kernel memory: 2016K
> [ 3.503330] Freeing unused kernel memory: 512K
> [ 3.505232] rodata_test: all tests were successful
> [ 3.515355] 1 (init): Uhuuh, elf segement at 00000000928fda3e requested but the memory is mapped already
Hmm, this is interesting. What does the test actually do? Could you add some
instrumentation to see what is actually mapped there? Something like
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 0e50230ce53d..1b68ddc34043 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -372,10 +372,28 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
} else
map_addr = vm_mmap(filep, addr, size, prot, type, off);
- if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
+ if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr)) {
+ struct vm_area_struct *vma;
+
pr_info("%d (%s): Uhuuh, elf segment at %p requested but the memory is mapped already\n",
task_pid_nr(current), current->comm,
(void *)addr);
+ vma = find_vma(current->mm, addr);
+ if (vma && vma->vm_start <= addr) {
+ pr_info("requested [%lx, %lx] mapped [%lx, %lx] %lx ", addr, addr + total_size,
+ vma->vm_start, vma->vm_end, vma->vm_flags);
+ if (!vma->vm_file) {
+ pr_cont("anon\n");
+ } else {
+ char path[512];
+ char *p = file_path(vma->vm_file, path, sizeof(path));
+ if (IS_ERR(p))
+ p = "?";
+ pr_cont("\"%s\"\n", kbasename(p));
+ }
+ dump_stack();
+ }
+ }
return(map_addr);
}
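
The lookup this instrumentation relies on can be modelled in userspace.
This is only a sketch: it assumes find_vma() returns the first VMA whose
vm_end is above the given address, and struct range, first_range_above()
and clash() are made-up names, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open address range [start, end), standing in for a VMA. */
struct range {
    unsigned long start;
    unsigned long end;
};

/*
 * Model of find_vma(): return the first range (array sorted by address)
 * whose end lies above addr, or NULL if there is none.
 */
const struct range *first_range_above(const struct range *r, size_t n,
                                      unsigned long addr)
{
    for (size_t i = 0; i < n; i++)
        if (r[i].end > addr)
            return &r[i];
    return NULL;
}

/*
 * Return the existing range that a request for [addr, addr + size)
 * would clash with, or NULL if the request fits into a hole.
 */
const struct range *clash(const struct range *r, size_t n,
                          unsigned long addr, unsigned long size)
{
    const struct range *hit;

    assert(size > 0);
    hit = first_range_above(r, n, addr);
    /* The candidate ends above addr; it overlaps only if it also
     * starts below the end of the request. */
    if (hit && hit->start < addr + size)
        return hit;
    return NULL;
}
```

With ranges [0x1000, 0x3000) and [0x8000, 0x9000), a request for 0x1000
bytes at 0x2000 clashes with the first range, while the same request at
0x3000 fits into the hole between the two.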
> [ 3.519533] Starting init: /sbin/init exists but couldn't execute it (error -95)
> [ 3.528993] Starting init: /bin/sh exists but couldn't execute it (error -14)
> [ 3.532127] Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
> [ 3.538328] CPU: 0 PID: 1 Comm: init Not tainted 4.15.0-rc3-next-20171215-00001-g6d6aea478fce #11
> [ 3.542201] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
> [ 3.546081] Call Trace:
> [ 3.547221] dump_stack+0x5c/0x79
> [ 3.548768] ? rest_init+0x30/0xb0
> [ 3.550320] panic+0xe4/0x232
> [ 3.551669] ? rest_init+0xb0/0xb0
> [ 3.553110] kernel_init+0xeb/0x100
> [ 3.554701] ret_from_fork+0x1f/0x30
> [ 3.558964] Kernel Offset: 0x2000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 3.564160] ---[ end Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
>
> If I revert this patch, it boots normally.
>
> Thanks,
> Andrei
>
> On Wed, Dec 13, 2017 at 10:25:50AM +0100, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@xxxxxxxx>
> >
> > Both load_elf_interp and load_elf_binary rely on elf_map to map segments
> > at a controlled address and they use MAP_FIXED to enforce that. This is,
> > however, a dangerous thing prone to silent data corruption which can
> > even be exploitable. Let's take CVE-2017-1000253 as an example. At the time
> > (before eab09532d400 ("binfmt_elf: use ELF_ET_DYN_BASE only for PIE"))
> > ELF_ET_DYN_BASE was at TASK_SIZE / 3 * 2 which is not that far away from
> > the stack top on 32b (legacy) memory layout (only 1GB away). Therefore
> > we could end up mapping over the existing stack with some luck.
> >
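
The "only 1GB away" figure can be checked with a few lines of arithmetic.
This is a sketch which assumes TASK_SIZE is 0xC0000000 (the usual 3G/1G
split on 32b x86) and ignores the size of the stack itself;
old_et_dyn_base() and gap_to_stack_top() are made-up helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Old placement of ET_DYN images: TASK_SIZE / 3 * 2. */
uint64_t old_et_dyn_base(uint64_t task_size)
{
    return task_size / 3 * 2;
}

/* Distance from that base to the top of the user address space,
 * which is where the stack starts growing down from. */
uint64_t gap_to_stack_top(uint64_t task_size)
{
    assert(task_size >= 3);
    return task_size - old_et_dyn_base(task_size);
}
```

For task_size = 0xC0000000 the base comes out as 0x80000000 and the gap
as 0x40000000, i.e. 1GB.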
> > The issue has been fixed since then (a87938b2e246 ("fs/binfmt_elf.c:
> > fix bug in loading of PIE binaries")), ELF_ET_DYN_BASE has been moved much
> > further from the stack (eab09532d400 and later by c715b72c1ba4 ("mm:
> > revert x86_64 and arm64 ELF_ET_DYN_BASE base changes")) and excessive
> > stack consumption early during execve has been fully stopped by da029c11e6b1
> > ("exec: Limit arg stack to at most 75% of _STK_LIM"). So we should be
> > safe and any attack should be impractical. On the other hand, this is
> > just too subtle an assumption, so it can break quite easily and be hard
> > to spot.
> >
> > I believe that the MAP_FIXED usage in load_elf_binary (et al.) is still
> > fundamentally dangerous. Moreover, it shouldn't even be needed. We are
> > at the early process stage, so there shouldn't be any unrelated mappings
> > (except for the stack and the loader), and mmap for a given address
> > should succeed even without MAP_FIXED. Something is terribly wrong if
> > this is not the case and we should rather fail than silently corrupt the
> > underlying mapping.
> >
> > Address this issue by changing MAP_FIXED to the newly added
> > MAP_FIXED_SAFE. This will mean that mmap will fail if there is an
> > existing mapping clashing with the requested one without clobbering it.
> >
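
The intended semantics can be illustrated from userspace. This is a
minimal sketch, not part of the patch: it assumes the flag is exposed
under the name MAP_FIXED_NOREPLACE (the name it was merged under in 4.17)
and falls back to the x86 value when the libc headers do not define it.
map_exact_noreplace() and demo_clash_is_refused() are made-up helpers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000 /* assumption: x86/asm-generic value */
#endif

/*
 * Map len anonymous bytes exactly at addr without clobbering anything.
 * Returns 0 on success or a negative errno (-EEXIST on a clash).
 */
int map_exact_noreplace(void *addr, size_t len)
{
    void *p;

    assert(len > 0);
    p = mmap(addr, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
    if (p == MAP_FAILED)
        return -errno;
    if (p != addr) {
        /* A kernel that predates the flag treats addr as a hint only. */
        munmap(p, len);
        return -EEXIST;
    }
    return 0;
}

/*
 * Returns 1 if a request for an already occupied address is refused
 * and the existing mapping keeps its contents, 0 otherwise.
 */
int demo_clash_is_refused(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    char *first = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int rc, ok;

    if (first == MAP_FAILED)
        return 0;
    first[0] = 'x';
    rc = map_exact_noreplace(first, len);
    ok = (rc == -EEXIST) && first[0] == 'x';
    munmap(first, len);
    return ok;
}
```

With plain MAP_FIXED the second mmap would succeed and silently replace
the first mapping, which is exactly the clobbering this patch is meant
to turn into an error.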
> > Cc: Abdul Haleem <abdhalee@xxxxxxxxxxxxxxxxxx>
> > Cc: Joel Stanley <joel@xxxxxxxxx>
> > Acked-by: Kees Cook <keescook@xxxxxxxxxxxx>
> > Reviewed-by: Khalid Aziz <khalid.aziz@xxxxxxxxxx>
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> > ---
> > arch/metag/kernel/process.c | 6 +++++-
> > fs/binfmt_elf.c | 12 ++++++++----
> > 2 files changed, 13 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/metag/kernel/process.c b/arch/metag/kernel/process.c
> > index 0909834c83a7..867c8d0a5fb4 100644
> > --- a/arch/metag/kernel/process.c
> > +++ b/arch/metag/kernel/process.c
> > @@ -399,7 +399,7 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
> > tcm_tag = tcm_lookup_tag(addr);
> >
> > if (tcm_tag != TCM_INVALID_TAG)
> > - type &= ~MAP_FIXED;
> > + type &= ~(MAP_FIXED | MAP_FIXED_SAFE);
> >
> > /*
> > * total_size is the size of the ELF (interpreter) image.
> > @@ -417,6 +417,10 @@ unsigned long __metag_elf_map(struct file *filep, unsigned long addr,
> > } else
> > map_addr = vm_mmap(filep, addr, size, prot, type, off);
> >
> > + if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
> > + pr_info("%d (%s): Uhuuh, elf segement at %p requested but the memory is mapped already\n",
> > + task_pid_nr(current), tsk->comm, (void*)addr);
> > +
> > if (!BAD_ADDR(map_addr) && tcm_tag != TCM_INVALID_TAG) {
> > struct tcm_allocation *tcm;
> > unsigned long tcm_addr;
> > diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> > index 73b01e474fdc..5916d45f64a7 100644
> > --- a/fs/binfmt_elf.c
> > +++ b/fs/binfmt_elf.c
> > @@ -372,6 +372,10 @@ static unsigned long elf_map(struct file *filep, unsigned long addr,
> > } else
> > map_addr = vm_mmap(filep, addr, size, prot, type, off);
> >
> > + if ((type & MAP_FIXED_SAFE) && BAD_ADDR(map_addr))
> > + pr_info("%d (%s): Uhuuh, elf segement at %p requested but the memory is mapped already\n",
> > + task_pid_nr(current), current->comm, (void*)addr);
> > +
> > return(map_addr);
> > }
> >
> > @@ -569,7 +573,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
> > elf_prot |= PROT_EXEC;
> > vaddr = eppnt->p_vaddr;
> > if (interp_elf_ex->e_type == ET_EXEC || load_addr_set)
> > - elf_type |= MAP_FIXED;
> > + elf_type |= MAP_FIXED_SAFE;
> > else if (no_base && interp_elf_ex->e_type == ET_DYN)
> > load_addr = -vaddr;
> >
> > @@ -930,7 +934,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
> > * the ET_DYN load_addr calculations, proceed normally.
> > */
> > if (loc->elf_ex.e_type == ET_EXEC || load_addr_set) {
> > - elf_flags |= MAP_FIXED;
> > + elf_flags |= MAP_FIXED_SAFE;
> > } else if (loc->elf_ex.e_type == ET_DYN) {
> > /*
> > * This logic is run once for the first LOAD Program
> > @@ -966,7 +970,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
> > load_bias = ELF_ET_DYN_BASE;
> > if (current->flags & PF_RANDOMIZE)
> > load_bias += arch_mmap_rnd();
> > - elf_flags |= MAP_FIXED;
> > + elf_flags |= MAP_FIXED_SAFE;
> > } else
> > load_bias = 0;
> >
> > @@ -1223,7 +1227,7 @@ static int load_elf_library(struct file *file)
> > (eppnt->p_filesz +
> > ELF_PAGEOFFSET(eppnt->p_vaddr)),
> > PROT_READ | PROT_WRITE | PROT_EXEC,
> > - MAP_FIXED | MAP_PRIVATE | MAP_DENYWRITE,
> > + MAP_FIXED_SAFE | MAP_PRIVATE | MAP_DENYWRITE,
> > (eppnt->p_offset -
> > ELF_PAGEOFFSET(eppnt->p_vaddr)));
> > if (error != ELF_PAGESTART(eppnt->p_vaddr))
> [ 0.000000] Linux version 4.15.0-rc3-next-20171215-00001-g6d6aea478fce (avagin@laptop) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)) #11 SMP Fri Dec 15 16:39:11 PST 2017
> [ 0.000000] Command line: root=/dev/vda2 ro debug console=ttyS0,115200 LANG=en_US.UTF-8 slub_debug=FZP raid=noautodetect selinux=0
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x008: 'MPX bounds registers'
> [ 0.000000] x86/fpu: Supporting XSAVE feature 0x010: 'MPX CSR'
> [ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
> [ 0.000000] x86/fpu: xstate_offset[3]: 832, xstate_sizes[3]: 64
> [ 0.000000] x86/fpu: xstate_offset[4]: 896, xstate_sizes[4]: 64
> [ 0.000000] x86/fpu: Enabled xstate features 0x1f, context size is 960 bytes, using 'compacted' format.
> [ 0.000000] e820: BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007ffd8fff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000007ffd9000-0x000000007fffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> [ 0.000000] NX (Execute Disable) protection: active
> [ 0.000000] random: fast init done
> [ 0.000000] SMBIOS 2.8 present.
> [ 0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
> [ 0.000000] Hypervisor detected: KVM
> [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
> [ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
> [ 0.000000] e820: last_pfn = 0x7ffd9 max_arch_pfn = 0x400000000
> [ 0.000000] MTRR default type: write-back
> [ 0.000000] MTRR fixed ranges enabled:
> [ 0.000000] 00000-9FFFF write-back
> [ 0.000000] A0000-BFFFF uncachable
> [ 0.000000] C0000-FFFFF write-protect
> [ 0.000000] MTRR variable ranges enabled:
> [ 0.000000] 0 base 0080000000 mask FF80000000 uncachable
> [ 0.000000] 1 disabled
> [ 0.000000] 2 disabled
> [ 0.000000] 3 disabled
> [ 0.000000] 4 disabled
> [ 0.000000] 5 disabled
> [ 0.000000] 6 disabled
> [ 0.000000] 7 disabled
> [ 0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
> [ 0.000000] found SMP MP-table at [mem 0x000f6bd0-0x000f6bdf] mapped at [ (ptrval)]
> [ 0.000000] Base memory trampoline at [ (ptrval)] 99000 size 24576
> [ 0.000000] Using GB pages for direct mapping
> [ 0.000000] BRK [0x2c984000, 0x2c984fff] PGTABLE
> [ 0.000000] BRK [0x2c985000, 0x2c985fff] PGTABLE
> [ 0.000000] BRK [0x2c986000, 0x2c986fff] PGTABLE
> [ 0.000000] BRK [0x2c987000, 0x2c987fff] PGTABLE
> [ 0.000000] BRK [0x2c988000, 0x2c988fff] PGTABLE
> [ 0.000000] BRK [0x2c989000, 0x2c989fff] PGTABLE
> [ 0.000000] ACPI: Early table checksum verification disabled
> [ 0.000000] ACPI: RSDP 0x00000000000F69C0 000014 (v00 BOCHS )
> [ 0.000000] ACPI: RSDT 0x000000007FFE12FF 00002C (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001)
> [ 0.000000] ACPI: FACP 0x000000007FFE120B 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001)
> [ 0.000000] ACPI: DSDT 0x000000007FFE0040 0011CB (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001)
> [ 0.000000] ACPI: FACS 0x000000007FFE0000 000040
> [ 0.000000] ACPI: APIC 0x000000007FFE127F 000080 (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001)
> [ 0.000000] ACPI: Local APIC address 0xfee00000
> [ 0.000000] No NUMA configuration found
> [ 0.000000] Faking a node at [mem 0x0000000000000000-0x000000007ffd8fff]
> [ 0.000000] NODE_DATA(0) allocated [mem 0x7ffc2000-0x7ffd8fff]
> [ 0.000000] kvm-clock: cpu 0, msr 0:7ffc0001, primary cpu clock
> [ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> [ 0.000000] kvm-clock: using sched offset of 1076013277 cycles
> [ 0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [ 0.000000] Zone ranges:
> [ 0.000000] DMA [mem 0x0000000000001000-0x0000000000ffffff]
> [ 0.000000] DMA32 [mem 0x0000000001000000-0x000000007ffd8fff]
> [ 0.000000] Normal empty
> [ 0.000000] Device empty
> [ 0.000000] Movable zone start for each node
> [ 0.000000] Early memory node ranges
> [ 0.000000] node 0: [mem 0x0000000000001000-0x000000000009efff]
> [ 0.000000] node 0: [mem 0x0000000000100000-0x000000007ffd8fff]
> [ 0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x000000007ffd8fff]
> [ 0.000000] On node 0 totalpages: 524151
> [ 0.000000] DMA zone: 64 pages used for memmap
> [ 0.000000] DMA zone: 21 pages reserved
> [ 0.000000] DMA zone: 3998 pages, LIFO batch:0
> [ 0.000000] DMA32 zone: 8128 pages used for memmap
> [ 0.000000] DMA32 zone: 520153 pages, LIFO batch:31
> [ 0.000000] Reserved but unavailable: 98 pages
> [ 0.000000] ACPI: PM-Timer IO Port: 0x608
> [ 0.000000] ACPI: Local APIC address 0xfee00000
> [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
> [ 0.000000] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
> [ 0.000000] ACPI: IRQ0 used by override.
> [ 0.000000] ACPI: IRQ5 used by override.
> [ 0.000000] ACPI: IRQ9 used by override.
> [ 0.000000] ACPI: IRQ10 used by override.
> [ 0.000000] ACPI: IRQ11 used by override.
> [ 0.000000] Using ACPI (MADT) for SMP configuration information
> [ 0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
> [ 0.000000] PM: Registered nosave memory: [mem 0x00000000-0x00000fff]
> [ 0.000000] PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
> [ 0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000effff]
> [ 0.000000] PM: Registered nosave memory: [mem 0x000f0000-0x000fffff]
> [ 0.000000] e820: [mem 0x80000000-0xfeffbfff] available for PCI devices
> [ 0.000000] Booting paravirtualized kernel on KVM
> [ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
> [ 0.000000] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:2 nr_node_ids:1
> [ 0.000000] percpu: Embedded 44 pages/cpu @ (ptrval) s142296 r8192 d29736 u1048576
> [ 0.000000] pcpu-alloc: s142296 r8192 d29736 u1048576 alloc=1*2097152
> [ 0.000000] pcpu-alloc: [0] 0 1
> [ 0.000000] KVM setup async PF for cpu 0
> [ 0.000000] kvm-stealtime: cpu 0, msr 7fc122c0
> [ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 515938
> [ 0.000000] Policy zone: DMA32
> [ 0.000000] Kernel command line: root=/dev/vda2 ro debug console=ttyS0,115200 LANG=en_US.UTF-8 slub_debug=FZP raid=noautodetect selinux=0
> [ 0.000000] Memory: 2037056K/2096604K available (12300K kernel code, 1554K rwdata, 3584K rodata, 1640K init, 912K bss, 59548K reserved, 0K cma-reserved)
> [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> [ 0.000000] ftrace: allocating 36554 entries in 143 pages
> [ 0.001000] Hierarchical RCU implementation.
> [ 0.001000] RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=2.
> [ 0.001000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
> [ 0.001000] NR_IRQS: 4352, nr_irqs: 440, preallocated irqs: 16
> [ 0.001000] Offload RCU callbacks from CPUs: (none).
> [ 0.001000] Console: colour dummy device 80x25
> [ 0.001000] console [ttyS0] enabled
> [ 0.001000] ACPI: Core revision 20171110
> [ 0.001000] ACPI: 1 ACPI AML tables successfully acquired and loaded
> [ 0.001009] APIC: Switch to symmetric I/O mode setup
> [ 0.001571] x2apic enabled
> [ 0.002003] Switched APIC routing to physical x2apic.
> [ 0.003538] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> [ 0.004000] tsc: Detected 2496.000 MHz processor
> [ 0.004014] Calibrating delay loop (skipped) preset value.. 4992.00 BogoMIPS (lpj=2496000)
> [ 0.005014] pid_max: default: 32768 minimum: 301
> [ 0.006057] Security Framework initialized
> [ 0.006548] Yama: becoming mindful.
> [ 0.007019] SELinux: Disabled at boot.
> [ 0.008206] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
> [ 0.009164] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
> [ 0.009816] Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
> [ 0.010009] Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes)
> [ 0.011322] mce: CPU supports 10 MCE banks
> [ 0.011740] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
> [ 0.012002] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
> [ 0.012610] Freeing SMP alternatives memory: 36K
> [ 0.013467] TSC deadline timer enabled
> [ 0.013820] smpboot: CPU0: Intel Core Processor (Skylake) (family: 0x6, model: 0x5e, stepping: 0x3)
> [ 0.014000] Performance Events: unsupported p6 CPU model 94 no PMU driver, software events only.
> [ 0.014041] Hierarchical SRCU implementation.
> [ 0.015133] NMI watchdog: Perf event create on CPU 0 failed with -2
> [ 0.015725] NMI watchdog: Perf NMI watchdog permanently disabled
> [ 0.016077] smp: Bringing up secondary CPUs ...
> [ 0.016654] x86: Booting SMP configuration:
> [ 0.017005] .... node #0, CPUs: #1
> [ 0.001000] kvm-clock: cpu 1, msr 0:7ffc0041, secondary cpu clock
> [ 0.019051] KVM setup async PF for cpu 1
> [ 0.019599] kvm-stealtime: cpu 1, msr 7fd122c0
> [ 0.020009] smp: Brought up 1 node, 2 CPUs
> [ 0.020531] smpboot: Max logical packages: 2
> [ 0.021009] smpboot: Total of 2 processors activated (9984.00 BogoMIPS)
> [ 0.023160] devtmpfs: initialized
> [ 0.023513] x86/mm: Memory block size: 128MB
> [ 0.024811] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
> [ 0.025015] futex hash table entries: 512 (order: 3, 32768 bytes)
> [ 0.026185] RTC time: 0:42:06, date: 12/16/17
> [ 0.026790] NET: Registered protocol family 16
> [ 0.027204] audit: initializing netlink subsys (disabled)
> [ 0.027914] audit: type=2000 audit(1513384927.133:1): state=initialized audit_enabled=0 res=1
> [ 0.028185] cpuidle: using governor menu
> [ 0.029118] ACPI: bus type PCI registered
> [ 0.029872] PCI: Using configuration type 1 for base access
> [ 0.034355] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
> [ 0.035011] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
> [ 0.036066] cryptd: max_cpu_qlen set to 1000
> [ 0.036579] ACPI: Added _OSI(Module Device)
> [ 0.037007] ACPI: Added _OSI(Processor Device)
> [ 0.037426] ACPI: Added _OSI(3.0 _SCP Extensions)
> [ 0.037857] ACPI: Added _OSI(Processor Aggregator Device)
> [ 0.041356] ACPI: Interpreter enabled
> [ 0.041764] ACPI: (supports S0 S5)
> [ 0.042005] ACPI: Using IOAPIC for interrupt routing
> [ 0.042655] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
> [ 0.043625] ACPI: Enabled 2 GPEs in block 00 to 0F
> [ 0.059248] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
> [ 0.059953] acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI]
> [ 0.060045] acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
> [ 0.061180] PCI host bridge to bus 0000:00
> [ 0.061874] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window]
> [ 0.062013] pci_bus 0000:00: root bus resource [io 0x0d00-0xffff window]
> [ 0.063016] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
> [ 0.064015] pci_bus 0000:00: root bus resource [mem 0x80000000-0xfebfffff window]
> [ 0.065014] pci_bus 0000:00: root bus resource [bus 00-ff]
> [ 0.065753] pci 0000:00:00.0: [8086:1237] type 00 class 0x060000
> [ 0.066487] pci 0000:00:01.0: [8086:7000] type 00 class 0x060100
> [ 0.067537] pci 0000:00:01.1: [8086:7010] type 00 class 0x010180
> [ 0.071700] pci 0000:00:01.1: reg 0x20: [io 0xc100-0xc10f]
> [ 0.074032] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io 0x01f0-0x01f7]
> [ 0.074908] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io 0x03f6]
> [ 0.075011] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io 0x0170-0x0177]
> [ 0.076010] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io 0x0376]
> [ 0.077121] pci 0000:00:01.3: [8086:7113] type 00 class 0x068000
> [ 0.078148] pci 0000:00:01.3: quirk: [io 0x0600-0x063f] claimed by PIIX4 ACPI
> [ 0.079014] pci 0000:00:01.3: quirk: [io 0x0700-0x070f] claimed by PIIX4 SMB
> [ 0.080224] pci 0000:00:03.0: [1af4:1000] type 00 class 0x020000
> [ 0.082007] pci 0000:00:03.0: reg 0x10: [io 0xc040-0xc05f]
> [ 0.083814] pci 0000:00:03.0: reg 0x14: [mem 0xfebc0000-0xfebc0fff]
> [ 0.089891] pci 0000:00:03.0: reg 0x30: [mem 0xfeb80000-0xfebbffff pref]
> [ 0.090545] pci 0000:00:05.0: [1af4:1003] type 00 class 0x078000
> [ 0.092708] pci 0000:00:05.0: reg 0x10: [io 0xc060-0xc07f]
> [ 0.094009] pci 0000:00:05.0: reg 0x14: [mem 0xfebc1000-0xfebc1fff]
> [ 0.102484] pci 0000:00:06.0: [8086:2934] type 00 class 0x0c0300
> [ 0.108028] pci 0000:00:06.0: reg 0x20: [io 0xc080-0xc09f]
> [ 0.110738] pci 0000:00:06.1: [8086:2935] type 00 class 0x0c0300
> [ 0.114388] pci 0000:00:06.1: reg 0x20: [io 0xc0a0-0xc0bf]
> [ 0.117339] pci 0000:00:06.2: [8086:2936] type 00 class 0x0c0300
> [ 0.122770] pci 0000:00:06.2: reg 0x20: [io 0xc0c0-0xc0df]
> [ 0.124738] pci 0000:00:06.7: [8086:293a] type 00 class 0x0c0320
> [ 0.125825] pci 0000:00:06.7: reg 0x10: [mem 0xfebc2000-0xfebc2fff]
> [ 0.130347] pci 0000:00:07.0: [1af4:1001] type 00 class 0x010000
> [ 0.133007] pci 0000:00:07.0: reg 0x10: [io 0xc000-0xc03f]
> [ 0.134793] pci 0000:00:07.0: reg 0x14: [mem 0xfebc3000-0xfebc3fff]
> [ 0.141808] pci 0000:00:08.0: [1af4:1002] type 00 class 0x00ff00
> [ 0.142914] pci 0000:00:08.0: reg 0x10: [io 0xc0e0-0xc0ff]
> [ 0.148977] ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11)
> [ 0.149455] ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
> [ 0.150390] ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
> [ 0.151382] ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11)
> [ 0.152380] ACPI: PCI Interrupt Link [LNKS] (IRQs *9)
> [ 0.154508] vgaarb: loaded
> [ 0.155271] SCSI subsystem initialized
> [ 0.155887] EDAC MC: Ver: 3.0.0
> [ 0.156255] PCI: Using ACPI for IRQ routing
> [ 0.156566] PCI: pci_cache_line_size set to 64 bytes
> [ 0.157161] e820: reserve RAM buffer [mem 0x0009fc00-0x0009ffff]
> [ 0.157914] e820: reserve RAM buffer [mem 0x7ffd9000-0x7fffffff]
> [ 0.158253] NetLabel: Initializing
> [ 0.158765] NetLabel: domain hash size = 128
> [ 0.159005] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
> [ 0.159775] NetLabel: unlabeled traffic allowed by default
> [ 0.160073] clocksource: Switched to clocksource kvm-clock
> [ 0.186764] VFS: Disk quotas dquot_6.6.0
> [ 0.187277] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
> [ 0.188251] FS-Cache: Loaded
> [ 0.188725] pnp: PnP ACPI init
> [ 0.189271] pnp 00:00: Plug and Play ACPI device, IDs PNP0b00 (active)
> [ 0.190231] pnp 00:01: Plug and Play ACPI device, IDs PNP0303 (active)
> [ 0.191229] pnp 00:02: Plug and Play ACPI device, IDs PNP0f13 (active)
> [ 0.192096] pnp 00:03: [dma 2]
> [ 0.192514] pnp 00:03: Plug and Play ACPI device, IDs PNP0700 (active)
> [ 0.193631] pnp 00:04: Plug and Play ACPI device, IDs PNP0501 (active)
> [ 0.195416] pnp: PnP ACPI: found 5 devices
> [ 0.206055] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
> [ 0.207127] pci_bus 0000:00: resource 4 [io 0x0000-0x0cf7 window]
> [ 0.207832] pci_bus 0000:00: resource 5 [io 0x0d00-0xffff window]
> [ 0.208594] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
> [ 0.209469] pci_bus 0000:00: resource 7 [mem 0x80000000-0xfebfffff window]
> [ 0.210493] NET: Registered protocol family 2
> [ 0.211244] tcp_listen_portaddr_hash hash table entries: 1024 (order: 2, 16384 bytes)
> [ 0.212283] TCP established hash table entries: 16384 (order: 5, 131072 bytes)
> [ 0.213285] TCP bind hash table entries: 16384 (order: 6, 262144 bytes)
> [ 0.214306] TCP: Hash tables configured (established 16384 bind 16384)
> [ 0.215065] UDP hash table entries: 1024 (order: 3, 32768 bytes)
> [ 0.215797] UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes)
> [ 0.217934] NET: Registered protocol family 1
> [ 0.219126] RPC: Registered named UNIX socket transport module.
> [ 0.219676] RPC: Registered udp transport module.
> [ 0.220130] RPC: Registered tcp transport module.
> [ 0.220552] RPC: Registered tcp NFSv4.1 backchannel transport module.
> [ 0.221153] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
> [ 0.221701] pci 0000:00:01.0: PIIX3: Enabling Passive Release
> [ 0.222319] pci 0000:00:01.0: Activating ISA DMA hang workarounds
> [ 0.444214] ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
> [ 0.880141] ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11
> [ 1.311493] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
> [ 1.748829] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
> [ 1.962124] PCI: CLS 0 bytes, default 64
> [ 1.964749] Initialise system trusted keyrings
> [ 1.965289] workingset: timestamp_bits=37 max_order=19 bucket_order=0
> [ 1.969600] zbud: loaded
> [ 1.971287] SGI XFS with security attributes, no debug enabled
> [ 2.106071] NET: Registered protocol family 38
> [ 2.106556] Key type asymmetric registered
> [ 2.106931] Asymmetric key parser 'x509' registered
> [ 2.107514] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 248)
> [ 2.108327] io scheduler noop registered
> [ 2.108813] io scheduler deadline registered
> [ 2.109608] io scheduler cfq registered (default)
> [ 2.110258] io scheduler mq-deadline registered
> [ 2.110796] io scheduler kyber registered
> [ 2.111688] intel_idle: Please enable MWAIT in BIOS SETUP
> [ 2.112310] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
> [ 2.113037] ACPI: Power Button [PWRF]
> [ 2.331642] virtio-pci 0000:00:03.0: virtio_pci: leaving for legacy driver
> [ 2.554093] virtio-pci 0000:00:05.0: virtio_pci: leaving for legacy driver
> [ 2.775938] virtio-pci 0000:00:07.0: virtio_pci: leaving for legacy driver
> [ 2.975053] tsc: Refined TSC clocksource calibration: 2495.981 MHz
> [ 2.975641] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x23fa6529869, max_idle_ns: 440795218057 ns
> [ 3.029409] virtio-pci 0000:00:08.0: virtio_pci: leaving for legacy driver
> [ 3.032925] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
> [ 3.056849] 00:04: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
> [ 3.064748] Non-volatile memory driver v1.3
> [ 3.065925] ppdev: user-space parallel port driver
> [ 3.071816] loop: module loaded
> [ 3.075337] vda: vda1 vda2 vda3
> [ 3.076659] Rounding down aligned max_sectors from 4294967295 to 4294967288
> [ 3.077996] Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
> [ 3.079790] libphy: Fixed MDIO Bus: probed
> [ 3.080257] tun: Universal TUN/TAP device driver, 1.6
> [ 3.082222] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
> [ 3.083675] serio: i8042 KBD port at 0x60,0x64 irq 1
> [ 3.084160] serio: i8042 AUX port at 0x60,0x64 irq 12
> [ 3.084816] mousedev: PS/2 mouse device common for all mice
> [ 3.086603] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input1
> [ 3.089192] rtc_cmos 00:00: RTC can wake from S4
> [ 3.090116] rtc_cmos 00:00: rtc core: registered rtc_cmos as rtc0
> [ 3.092829] rtc_cmos 00:00: alarms up to one day, y3k, 114 bytes nvram
> [ 3.093937] IR NEC protocol handler initialized
> [ 3.094408] IR RC5(x/sz) protocol handler initialized
> [ 3.094901] IR RC6 protocol handler initialized
> [ 3.095510] IR JVC protocol handler initialized
> [ 3.095952] IR Sony protocol handler initialized
> [ 3.096399] IR SANYO protocol handler initialized
> [ 3.096862] IR Sharp protocol handler initialized
> [ 3.097342] IR MCE Keyboard/mouse protocol handler initialized
> [ 3.097919] IR XMP protocol handler initialized
> [ 3.098530] device-mapper: uevent: version 1.0.3
> [ 3.099209] device-mapper: ioctl: 4.37.0-ioctl (2017-09-20) initialised: dm-devel@xxxxxxxxxx
> [ 3.100398] device-mapper: multipath round-robin: version 1.2.0 loaded
> [ 3.101883] drop_monitor: Initializing network drop monitor service
> [ 3.102553] Netfilter messages via NETLINK v0.30.
> [ 3.103090] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
> [ 3.103738] ctnetlink v0.93: registering with nfnetlink.
> [ 3.104494] ip_tables: (C) 2000-2006 Netfilter Core Team
> [ 3.105734] Initializing XFRM netlink socket
> [ 3.106885] NET: Registered protocol family 10
> [ 3.109341] Segment Routing with IPv6
> [ 3.109976] mip6: Mobile IPv6
> [ 3.111987] ip6_tables: (C) 2000-2006 Netfilter Core Team
> [ 3.114230] NET: Registered protocol family 17
> [ 3.115047] Bridge firewalling registered
> [ 3.115824] Ebtables v2.0 registered
> [ 3.117996] 8021q: 802.1Q VLAN Support v1.8
> [ 3.119429] AVX2 version of gcm_enc/dec engaged.
> [ 3.119886] AES CTR mode by8 optimization enabled
> [ 3.128818] sched_clock: Marking stable (3128714579, 0)->(3404180881, -275466302)
> [ 3.129945] registered taskstats version 1
> [ 3.130427] Loading compiled-in X.509 certificates
> [ 3.163216] Loaded X.509 cert 'Build time autogenerated kernel key: 38e0adea1af8bd8a23b02436d4acf2f8c7408d23'
> [ 3.166359] zswap: loaded using pool lzo/zbud
> [ 3.167943] Key type big_key registered
> [ 3.168778] Magic number: 13:918:708
> [ 3.169255] rtc_cmos 00:00: setting system clock to 2017-12-16 00:42:09 UTC (1513384929)
> [ 3.170604] md: Skipping autodetection of RAID arrays. (raid=autodetect will force)
> [ 3.171932] EXT4-fs (vda2): couldn't mount as ext3 due to feature incompatibilities
> [ 3.173871] EXT4-fs (vda2): couldn't mount as ext2 due to feature incompatibilities
> [ 3.175306] EXT4-fs (vda2): INFO: recovery required on readonly filesystem
> [ 3.176212] EXT4-fs (vda2): write access will be enabled during recovery
> [ 3.397187] EXT4-fs (vda2): orphan cleanup on readonly fs
> [ 3.399412] EXT4-fs (vda2): 5 orphan inodes deleted
> [ 3.402759] EXT4-fs (vda2): recovery complete
> [ 3.466647] EXT4-fs (vda2): mounted filesystem with ordered data mode. Opts: (null)
> [ 3.469401] VFS: Mounted root (ext4 filesystem) readonly on device 253:2.
> [ 3.473719] devtmpfs: mounted
> [ 3.492549] Freeing unused kernel memory: 1640K
> [ 3.494547] Write protecting the kernel read-only data: 18432k
> [ 3.498781] Freeing unused kernel memory: 2016K
> [ 3.503330] Freeing unused kernel memory: 512K
> [ 3.505232] rodata_test: all tests were successful
> [ 3.515355] 1 (init): Uhuuh, elf segement at 00000000928fda3e requested but the memory is mapped already
> [ 3.519533] Starting init: /sbin/init exists but couldn't execute it (error -95)
> [ 3.528993] Starting init: /bin/sh exists but couldn't execute it (error -14)
> [ 3.532127] Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
> [ 3.538328] CPU: 0 PID: 1 Comm: init Not tainted 4.15.0-rc3-next-20171215-00001-g6d6aea478fce #11
> [ 3.542201] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
> [ 3.546081] Call Trace:
> [ 3.547221] dump_stack+0x5c/0x79
> [ 3.548768] ? rest_init+0x30/0xb0
> [ 3.550320] panic+0xe4/0x232
> [ 3.551669] ? rest_init+0xb0/0xb0
> [ 3.553110] kernel_init+0xeb/0x100
> [ 3.554701] ret_from_fork+0x1f/0x30
> [ 3.558964] Kernel Offset: 0x2000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 3.564160] ---[ end Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
--
Michal Hocko
SUSE Labs