Re: [PATCH v9 03/10] mm: thp: Introduce multi-size THP sysfs interface

From: Ryan Roberts
Date: Tue Dec 12 2023 - 10:32:37 EST


On 12/12/2023 14:54, David Hildenbrand wrote:
> On 07.12.23 17:12, Ryan Roberts wrote:
>> In preparation for adding support for anonymous multi-size THP,
>> introduce new sysfs structure that will be used to control the new
>> behaviours. A new directory is added under transparent_hugepage for each
>> supported THP size, and contains an `enabled` file, which can be set to
>> "inherit" (to inherit the global setting), "always", "madvise" or
>> "never". For now, the kernel still only supports PMD-sized anonymous
>> THP, so only 1 directory is populated.
>>
>> The first half of the change converts transhuge_vma_suitable() and
>> hugepage_vma_check() so that they take a bitfield of orders for which
>> the user wants to determine support, and the functions filter out all
>> the orders that can't be supported, given the current sysfs
>> configuration and the VMA dimensions. The resulting functions are
>> renamed to thp_vma_suitable_orders() and thp_vma_allowable_orders()
>> respectively. Convenience functions that take a single, unencoded order
>> and return a boolean are also defined as thp_vma_suitable_order() and
>> thp_vma_allowable_order().
>>
>> The second half of the change implements the new sysfs interface. It has
>> been done so that each supported THP size has a `struct thpsize`, which
>> describes the relevant metadata and is itself a kobject. This is pretty
>> minimal for now, but should make it easy to add new per-thpsize files to
>> the interface if needed in future (e.g. per-size defrag). Rather than
>> keep the `enabled` state directly in the struct thpsize, I've elected to
>> directly encode it into huge_anon_orders_[always|madvise|inherit]
>> bitfields since this reduces the amount of work required in
>> thp_vma_allowable_orders() which is called for every page fault.
>>
>> See Documentation/admin-guide/mm/transhuge.rst, as modified by this
>> commit, for details of how the new sysfs interface works.
>>
>> Reviewed-by: Barry Song <v-songbaohua@xxxxxxxx>
>> Tested-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx>
>> Tested-by: John Hubbard <jhubbard@xxxxxxxxxx>
>> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
>> ---
>
> [...]
>
>> +
>> +static ssize_t thpsize_enabled_store(struct kobject *kobj,
>> +                     struct kobj_attribute *attr,
>> +                     const char *buf, size_t count)
>> +{
>> +    int order = to_thpsize(kobj)->order;
>> +    ssize_t ret = count;
>> +
>> +    if (sysfs_streq(buf, "always")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        set_bit(order, &huge_anon_orders_always);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "inherit")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        set_bit(order, &huge_anon_orders_inherit);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "madvise")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        set_bit(order, &huge_anon_orders_madvise);
>> +        spin_unlock(&huge_anon_orders_lock);
>> +    } else if (sysfs_streq(buf, "never")) {
>> +        spin_lock(&huge_anon_orders_lock);
>> +        clear_bit(order, &huge_anon_orders_always);
>> +        clear_bit(order, &huge_anon_orders_inherit);
>> +        clear_bit(order, &huge_anon_orders_madvise);
>> +        spin_unlock(&huge_anon_orders_lock);
>
> Why not perform lock/unlock only once in surrounding code? :)

I was nervous that sysfs_streq() may be unhappy in atomic context... Unfounded?

>
>
> Much better
>
> Acked-by: David Hildenbrand <david@xxxxxxxxxx>
>