Re: Regression: 4.5-rc1 (bisect: hugetlb: make mm and fs code explicitly non-modular vs CONFIG_TIMER_STATS)
From: Paul Gortmaker
Date: Thu Jan 28 2016 - 18:02:36 EST
[Re: Regression: 4.5-rc1 (bisect: hugetlb: make mm and fs code explicitly non-modular vs CONFIG_TIMER_STATS)] On 28/01/2016 (Thu 14:18) Mike Kravetz wrote:
> On 01/28/2016 07:05 AM, Mike Kravetz wrote:
> > On 01/28/2016 06:37 AM, Paul Gortmaker wrote:
> >> [Re: Regression: 4.5-rc1 (bisect: hugetlb: make mm and fs code explicitly non-modular vs CONFIG_TIMER_STATS)] On 28/01/2016 (Thu 10:48) Christian Borntraeger wrote:
> >>
> >>> On 01/28/2016 10:40 AM, Hillf Danton wrote:
> >>>>>
> >>>>> Paul,
> >>>>>
> >>>>> the commit 3e89e1c5ea842 ("hugetlb: make mm and fs code explicitly non-modular")
> >>>>> triggers belows warning/oops, if CONFIG_TIMER_STATS is set.
> >>>>>
> >>>>> Looking at the patch the only "real" change is the init_call,
> >>>>> and indeed
> >>>>> --- a/mm/hugetlb.c
> >>>>> +++ b/mm/hugetlb.c
> >>>>> @@ -2653,7 +2653,7 @@ static int __init hugetlb_init(void)
> >>>>> mutex_init(&hugetlb_fault_mutex_table[i]);
> >>>>> return 0;
> >>>>> }
> >>>>> -subsys_initcall(hugetlb_init);
> >>>>> +device_initcall(hugetlb_init);
> >>>>>
> >>>>> /* Should be called on processing a hugepagesz=... option */
> >>>>> void __init hugetlb_add_hstate(unsigned int order)
> >>>>>
> >>>>> makes the problem go away.
> >>>>
> >>>> Helps more if a patch is delivered.
> >>>
> >>> The problem is that the original change was intentional. So I do not not
> >>> what the right fix is.
> >>
> >> Thanks for the report ; let me see if I can work out what TIMER_STATS
> >> is doing to cause this sometime today.
> >>
> >
> > Hmmm? CONFIG_TIMER_STATS is set in my config and I am not seeing the
> > issue. Not sure, but it looks like Christian is building/running on
> > s390. This 'might' be a contributing factor.
>
> I do not see how CONFIG_TIMER_STATS contributes to this issue. However,
I looked at all the TIMER_STATS ifdef blocks and was also thinking the
same thing. If it did toggle the problem then it was a red herring.
My test config had this set and I retested x86-64 today with it set.
> on s390 numa nodes are initialized at device_initcall in the appropriately
> named routine numa_init_late(). hugetlb_init must be done after numa
> initialization. So, I suggest we just move the hugetlb initialization
> back to device_initcall. What do you think Paul? Patch below.
We could, but that ignores the fact that the original priorities worked
by chance and not by design, as my commit log indicates. Instead, I'd
like to know why S390 does core NUMA operations as late as
device_initcall. Setting up NUMA nodes should be arch_initcall or
subsys_initcall, or earlier --- it should not be device_initcall as if
it was some leaf node UART driver or ethernet driver. There is no
endpoint "device" in NUMA in this context.
If possible, I'd rather us look at the s390 ordering vs. doing a partial
revert of what is IMHO a sensible change.
Can I do qemu s390 defconfig boots to see this locally, or is that
simply unobtanium? I'd like to resolve this since I caused it, but I'd
also like to avoid making a knee-jerk revert to do so.
Thanks,
P.
--
>
> Is there a way to add checks for this type of thing in the code? I'm
> guessing not.
>
> --
> Mike Kravetz
>
> hugetlb: move hugetlb_init and init_hugetlbfs_fs to device_initcall level
>
> hugetlb_init() must be called after numa initialization for all
> architectures. Currently, numa initialization happens as late as
> device_initcall. Therefore, move hugetlb_init to device_initcall
> level. init_hugetlbfs_fs() depends on hugetlb_init(), so move it
> to device_initcall as well.
>
> Fixes: 3e89e1c5ea ("hugetlb: make mm and fs code explicitly non-modular")
> Reported-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> ---
> fs/hugetlbfs/inode.c | 3 ++-
> mm/hugetlb.c | 6 +++++-
> 2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 540ddc9..d27d3f6 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -1363,4 +1363,5 @@ static int __init init_hugetlbfs_fs(void)
> out2:
> return error;
> }
> -fs_initcall(init_hugetlbfs_fs)
> +/* Must happen after hugetlb_init() */
> +device_initcall(init_hugetlbfs_fs)
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 12908dc..a4c0015 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2653,7 +2653,11 @@ static int __init hugetlb_init(void)
> mutex_init(&hugetlb_fault_mutex_table[i]);
> return 0;
> }
> -subsys_initcall(hugetlb_init);
> +/*
> + * hugetlb_init must be called after numa initialization for all
> architectures.
> + * Currently, this is as late as device_initcall().
> + */
> +device_initcall(hugetlb_init);
>
> /* Should be called on processing a hugepagesz=... option */
> void __init hugetlb_add_hstate(unsigned int order)
> --
> 2.4.3
>