Re: [PATCH -next] arm64/mm: fix a bogus GFP flag in pgd_alloc()

From: Qian Cai
Date: Thu Jun 13 2019 - 11:22:01 EST


On Thu, 2019-06-13 at 15:11 +0300, Mike Rapoport wrote:
> The log Qian Cai posted at [1] and partially cited below confirms that the
> failure happens when *user* PGDs are allocated and the addition of
> __GFP_ACCOUNT to gfp flags used by pgd_alloc() only uncovered another
> issue.
>
> I'm still failing to reproduce it with qemu and I'm not really familiar
> with slub/memcg code to say anything smart about it. Will keep looking.
>
> Note, that as failures start way after efi_virtmap_init() that allocates a
> PGD for efi_mm, there are no real fixes required for the original series,
> except that the check for mm == &init_mm I copied for some reason from
> powerpc is bogus and can be removed.

Yes, there are more places that are not happy with __GFP_ACCOUNT than just
efi_mm. For example,

[  132.786842][ T1501] kobject_add_internal failed for pgd_cache(49:systemd-udevd.service) (error: -2 parent: cgroup)
[  132.795589][ T1889] CPU: 9 PID: 1889 Comm: systemd-udevd Tainted: G        W         5.2.0-rc4-next-20190613+ #8
[  132.807356][ T1889] Hardware name: HPE Apollo 70             /C01_APACHE_MB         , BIOS L50_5.13_1.0.9 03/01/2019
[  132.817872][ T1889] Call trace:
[  132.821017][ T1889]  dump_backtrace+0x0/0x268
[  132.825372][ T1889]  show_stack+0x20/0x2c
[  132.829380][ T1889]  dump_stack+0xb4/0x108
[  132.833475][ T1889]  pgd_alloc+0x34/0x5c
[  132.837396][ T1889]  mm_init+0x27c/0x32c
[  132.841315][ T1889]  dup_mm+0x84/0x7b4
[  132.845061][ T1889]  copy_process+0xf20/0x24cc
[  132.849500][ T1889]  _do_fork+0xa4/0x66c
[  132.853420][ T1889]  __arm64_sys_clone+0x114/0x1b4
[  132.858208][ T1889]  el0_svc_handler+0x198/0x260
[  132.862821][ T1889]  el0_svc+0x8/0xc
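
To make the distinction concrete, here is a rough userspace sketch (not
kernel code) of the flag selection under discussion: user PGDs get the
accounting flag, kernel-owned ones such as efi_mm do not. The struct, the
helper name and the flag values are all made up for illustration; only the
idea of splitting user and kernel PGD allocation comes from this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GFP bits; NOT the kernel's real values. */
#define MOCK_GFP_KERNEL   0x1u
#define MOCK_GFP_ZERO     0x2u
#define MOCK_GFP_ACCOUNT  0x4u

/* Hypothetical, minimal stand-in for struct mm_struct. */
struct mock_mm {
	const char *name;
	bool is_user;	/* false for kernel-owned mms such as efi_mm */
};

/*
 * Choose allocation flags for a PGD. Only PGDs that belong to a user
 * address space are charged to a memory cgroup; kernel-owned PGDs are
 * allocated without the accounting bit.
 */
static unsigned int mock_pgd_gfp(const struct mock_mm *mm)
{
	unsigned int gfp = MOCK_GFP_KERNEL | MOCK_GFP_ZERO;

	if (mm->is_user)
		gfp |= MOCK_GFP_ACCOUNT;

	return gfp;
}
```

Calling mock_pgd_gfp() on an mm with is_user set returns the base flags
plus the accounting bit; for a kernel-owned mm the accounting bit stays
clear. Note that the trace above is from the user path (dup_mm), so a
split like this would not by itself address that failure.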

>
> I surely can add pgd_alloc_kernel() to be used by the EFI code to make sure
> we won't run into issues with memcg in the future.
>
> [   82.125966] Freeing unused kernel memory: 28672K
> [   87.940365] Checked W+X mappings: passed, no W+X pages found
> [   87.946769] Run /init as init process
> [   88.040040] systemd[1]: System time before build time, advancing clock.
> [   88.054593] systemd[1]: Failed to insert module 'autofs4': No such file or directory
> [   88.374129] modprobe (1726) used greatest stack depth: 28464 bytes left
> [   88.470108] systemd[1]: systemd 239 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=legacy)
> [   88.498398] systemd[1]: Detected architecture arm64.
> [   88.506517] systemd[1]: Running in initial RAM disk.
> [   89.621995] mkdir (1730) used greatest stack depth: 27872 bytes left
> [   90.222658] random: systemd: uninitialized urandom read (16 bytes read)
> [   90.230072] systemd[1]: Reached target Swap.
> [   90.240205] random: systemd: uninitialized urandom read (16 bytes read)
> [   90.251088] systemd[1]: Reached target Timers.
> [   90.261303] random: systemd: uninitialized urandom read (16 bytes read)
> [   90.271209] systemd[1]: Listening on udev Control Socket.
> [   90.283238] systemd[1]: Reached target Local File Systems.
> [   90.296232] systemd[1]: Reached target Slices.
> [   90.307239] systemd[1]: Listening on udev Kernel Socket.
> [   90.608597] kobject_add_internal failed for pgd_cache(13:init.scope) (error: -2 parent: cgroup)
> [   90.678007] kobject_add_internal failed for pgd_cache(13:init.scope) (error: -2 parent: cgroup)
> [   90.713260] kobject_add_internal failed for pgd_cache(21:systemd-tmpfiles-setup.service) (error: -2 parent: cgroup)
> [   90.820012] systemd-tmpfile (1759) used greatest stack depth: 27184 bytes left
> [   90.861942] kobject_add_internal failed for pgd_cache(13:init.scope) (error: -2 parent: cgroup)
>
> > Thanks,
> > Mark.
> >
>
> [1] https://cailca.github.io/files/dmesg.txt
>