Re: [RFC PATCH v3 0/2] Fix storing in XArray check_split tests
From: Zi Yan
Date: Wed Apr 01 2026 - 10:10:48 EST
On 1 Apr 2026, at 3:32, David Hildenbrand (Arm) wrote:
> On 3/16/26 17:49, Zi Yan wrote:
>> On 16 Mar 2026, at 12:23, David Hildenbrand (Arm) wrote:
>>
>>> On 2/23/26 08:34, Ackerley Tng wrote:
>>>> Hi,
>>>>
>>>> I hit an assertion while making some modifications to
>>>> lib/test_xarray.c [1] and I believe this is the fix.
>>>>
>>>> In check_split, the tests split the XArray node and then store values
>>>> after the split to verify that splitting worked. While storing and
>>>> retrieval works as expected, the node's metadata, specifically
>>>> node->nr_values, is not updated correctly.
>>>>
>>>> This led to the assertion being hit in [1], since the storing process
>>>> did not increment node->nr_values sufficiently, while the erasing
>>>> process assumed the fully-incremented node->nr_values state.
>>>>
>>>> Would like to check my understanding on these:
>>>>
>>>> 1. In the multi-index xarray world, is node->nr_values definitely the
>>>> total number of values *and siblings* in the node?
>>>>
>>>> 2. IIUC xas_store() has significantly different behavior when entry is
>>>> NULL vs non-NULL: when entry is NULL, xas_store() does not make
>>>> assumptions on the number of siblings and erases all the way till
>>>> the next non-sibling entry. This sounds fair to me, but it's also
>>>> kind of surprising that it is differently handled when entry is
>>>> non-NULL, where xas_store() respects xas->xa_sibs.
>>>>
>>>> 3. If xas_store() is dependent on its caller to set up xas correctly
>>>> (also sounds fair), then there are places where xas_store() is
>>>> used, like replace_page_cache_folio() or
>>>> migrate_huge_page_move_mapping(), where xas is set up assuming 0
>>>> order pages. Are those buggy?
>>>
>>> Zi, do you have any familiarity with that code and could help?
>>
>> Not much. But I used lib/test_xarray.c to do a test:
>>
>> 1. initialize an xarray with order 6 and set entry to 0,
>>
>> 2. add a new xas like XA_STATE(xas0, xa, 0);
>> 3. xas_store(&xas0, xa_mk_value(32));
>>
>> 4. add a new xas like XA_STATE(xas0, xa, 16);
>> 5. xas_store(&xas0, xa_mk_value(48));
>>
>> The outcome is that xas_store() does not change xarray structure,
>> namely the orders are preserved. No issue is present.
>>
>> After 2 and 3, the xarray is still order 6, but its 0-63 entries (all order-6)
>> are changed from 0 to 32.
>> After 4 and 5, the xarray is still order 6, but its 0-63 entries
>> are changed from 32 to 48.
>>
>> I will need to dig into the code more to explain how xas_store() works.
>
> Zi,
>
> we discussed this topic with Willy in the THP cabal call. I did not get
> all the details, do you remember our conclusion?

The conclusion is that if a user wants to erase (or xas_store(NULL)) an
index that is in the middle of a multi-index entry, they need to split
that multi-index entry first and then do the erase (or xas_store(NULL)).
There are two reasons. First, xa_erase() (or xas_store(NULL)) is
documented to erase all indices of a multi-index entry[1]. Second,
requiring xa_erase() (or xas_store(NULL)) to split a multi-index entry
and erase only the specified index is too much, because the split may
need to allocate memory.

[1] https://elixir.bootlin.com/linux/v6.19.10/source/lib/xarray.c#L1640
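
To make the difference concrete, here is a small userspace toy model. It is
only a sketch: it is not kernel code, it does not use the real XArray API,
and every toy_* name is made up for illustration. It models one order-6
range as a flat array of 64 slots, uses 0 to mean "empty" (unlike the real
XArray, where xa_mk_value(0) is a valid entry), and assumes the split goes
all the way down to order 0.

```c
#include <assert.h>
#include <string.h>

#define TOY_SLOTS 64	/* one order-6 range, as in the earlier test */

/* Toy model: a flat slot array, NOT the kernel's struct xa_node. */
struct toy_xa {
	unsigned long val[TOY_SLOTS];	/* 0 means "empty" in this model only */
	unsigned int order[TOY_SLOTS];	/* order of the entry covering the slot */
};

static inline void toy_init(struct toy_xa *xa)
{
	memset(xa, 0, sizeof(*xa));
}

/* First index of the entry that covers @index. */
static inline unsigned int toy_head(const struct toy_xa *xa, unsigned int index)
{
	return index & ~((1u << xa->order[index]) - 1);
}

/* Store @val as a single entry of 2^order slots covering @index. */
static inline void toy_store_order(struct toy_xa *xa, unsigned int index,
				   unsigned int order, unsigned long val)
{
	unsigned int start = index & ~((1u << order) - 1);
	unsigned int i;

	for (i = start; i < start + (1u << order); i++) {
		xa->val[i] = val;
		xa->order[i] = order;
	}
}

/* Erase the whole entry covering @index, not just @index itself. */
static inline void toy_erase(struct toy_xa *xa, unsigned int index)
{
	unsigned int order = xa->order[index];
	unsigned int start = toy_head(xa, index);
	unsigned int i;

	for (i = start; i < start + (1u << order); i++) {
		xa->val[i] = 0;
		xa->order[i] = 0;
	}
}

/* Split the entry covering @index into order-0 entries, keeping the value. */
static inline void toy_split(struct toy_xa *xa, unsigned int index)
{
	unsigned int order = xa->order[index];
	unsigned int start = toy_head(xa, index);
	unsigned int i;

	for (i = start; i < start + (1u << order); i++)
		xa->order[i] = 0;
}

static inline unsigned long toy_load(const struct toy_xa *xa, unsigned int index)
{
	return xa->val[index];
}
```

In this model, storing an order-6 entry at index 0 and then calling
toy_erase() on index 16 empties all of 0-63. Calling toy_split() on index 16
first and then toy_erase() on index 16 empties only that one slot and leaves
the neighbouring indices untouched. The model deliberately ignores the part
that makes the real split expensive, namely allocating new nodes.
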
>
> (I can try getting access to the recording)

Best Regards,
Yan, Zi