Re: [PATCH 4/4] hugetlb: add hugepages_node= command-line option
From: Luiz Capitulino
Date: Mon Feb 17 2014 - 08:57:44 EST
On Sat, 15 Feb 2014 02:06:38 -0800 (PST)
David Rientjes <rientjes@xxxxxxxxxx> wrote:
> On Fri, 14 Feb 2014, Luiz Capitulino wrote:
>
> > > Again, I think this syntax is horrendous and doesn't couple well with the
> > > other hugepage-related kernel command line options. We already have
> > > hugepages= and hugepagesz= which you can interleave on the command line to
> > > get 100 2M hugepages and 10 1GB hugepages, for example.
> > >
> > > This patchset is simply introducing another variable to the matter: the
> > > node that the hugepages should be allocated on. So just introduce a
> > > hugepagesnode= parameter to couple with the others so you can do
> > >
> > > hugepagesz=<size> hugepagesnode=<nid> hugepages=<#>
> >
> > That was my first try but it turned out really bad. First, for every node
> > you specify you need three options.
>
> Just like you need two options today to specify a number of hugepages of a
> particular non-default size. You only need to use hugepagesz= or
> hugepagenode= if you want a non-default size or a specify a particular
> node.
>
> > So, if you want to setup memory for
> > three nodes you'll need to specify nine options.
>
> And you currently need six if you want to specify three different hugepage
> sizes (?). But who really specifies three different hugepage sizes on the
> command line that are needed to be reserved at boot?

hugepages= and hugepages_node= are similar, but have different semantics.

hugepagesz= and hugepages= create a pool of huge pages of the specified size.
This means that the number of times you specify those options is limited by
the number of different huge page sizes an arch supports. For x86_64, for
example, this limit is two, so one would not specify those options more than
two times. And this doesn't count default_hugepagesz=, which allows you to
drop one hugepagesz= option.

hugepages_node= allows you to allocate huge pages per node, so the number of
times you can specify this option is limited by the number of nodes. Also,
hugepages_node= creates the pools, if necessary (at least one will be). For
this reason I think it makes a lot of sense to have different options.

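
To make the per-node semantics concrete, here is a minimal user-space sketch
of how a hugepages_node=<nid>:<nr_pages>:<size>,... string could be split into
per-node requests. This is not the code from the patch: struct hnode_req,
parse_size() and parse_hugepages_node() are hypothetical names made up for
illustration, and the K/M/G suffix handling is an assumption about what
<size> accepts.

```c
#include <ctype.h>
#include <stdlib.h>

/* One parsed <nid>:<nr_pages>:<size> entry (hypothetical, not from the patch). */
struct hnode_req {
	int nid;                 /* NUMA node id */
	unsigned long nr_pages;  /* number of huge pages requested on that node */
	unsigned long long size; /* huge page size in bytes */
};

/* Parse a size such as "2M" or "1G" into bytes; returns 0 on error. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v;

	*end = (char *)s;
	if (!isdigit((unsigned char)*s))
		return 0;
	v = strtoull(s, end, 10);
	switch (**end) {
	case 'G': case 'g': v <<= 30; (*end)++; break;
	case 'M': case 'm': v <<= 20; (*end)++; break;
	case 'K': case 'k': v <<= 10; (*end)++; break;
	}
	return v;
}

/*
 * Parse "nid:nr_pages:size[,nid:nr_pages:size...]" into out[].
 * Returns the number of entries parsed, or -1 on a malformed string
 * or when more than max entries are given.
 */
static int parse_hugepages_node(const char *arg, struct hnode_req *out, int max)
{
	const char *p = arg;
	char *end;
	int n = 0;

	while (*p) {
		long nid;
		unsigned long nr;
		unsigned long long size;

		if (n == max)
			return -1;

		/* First field: node id */
		if (!isdigit((unsigned char)*p))
			return -1;
		nid = strtol(p, &end, 10);
		if (*end != ':')
			return -1;
		p = end + 1;

		/* Second field: number of pages */
		if (!isdigit((unsigned char)*p))
			return -1;
		nr = strtoul(p, &end, 10);
		if (*end != ':')
			return -1;
		p = end + 1;

		/* Third field: page size, which must be non-zero */
		size = parse_size(p, &end);
		if (size == 0)
			return -1;
		p = end;

		out[n].nid = (int)nid;
		out[n].nr_pages = nr;
		out[n].size = size;
		n++;

		/* Entries are comma-separated; a trailing comma is an error. */
		if (*p == ',') {
			p++;
			if (!*p)
				return -1;
		} else if (*p) {
			return -1;
		}
	}
	return n;
}
```

With this field order, "0:512:2M,1:4:1G" yields two requests (512 2M pages on
node 0 and four 1G pages on node 1), and each entry carries its own size, so
no separate hugepagesz= is needed to interpret it.
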
> If that's really the usecase, it seems like you want the old
> CONFIG_PAGE_SHIFT patch.
>
> > And it gets worse, because
> > hugepagesz= and hugepages= have strict ordering (which is a mistake, IMHO) so
> > you have to specify them in the right order otherwise things don't work as
> > expected and you have no idea why (have been there myself).
> >
>
> How is that difficult? hugepages= is the "noun", hugepagesz= is the
> "adjective". hugepages=100 hugepagesz=1G hugepages=4 makes perfect sense
> to me, and I actually don't allocate hugepages on the command line, nor
> have I looked at Documentation/kernel-parameters.txt to check if I'm
> constructing it correctly. It just makes sense and once you learn it it's
> just natural.
>
> > IMO, hugepages_node=<nid>:<nr_pages>:<size>,... is good enough. It's concise,
> > and don't depend on any other option to function. Also, there are lots of other
> > kernel command-line options that require you to specify multiple fields, so
> > it's not like hugepages_node= is totally different in that regard.
> >
>
> I doubt Andrew is going to want a completely different format for hugepage
> allocations that want to specify a node and have to deal with people who
> say hugepages_node=2:1:1G and constantly have to lookup if it's 2
> hugepages on node 1 or 1 hugepage on node 2.
Andrew?