Re: XArray documentation

From: Matthew Wilcox
Date: Fri Nov 24 2017 - 12:17:19 EST


On Thu, Nov 23, 2017 at 09:30:21PM -0700, Andreas Dilger wrote:
> > Pointers to be stored in the XArray must have the bottom two bits clear
> > (ie must point to something which is 4-byte aligned). This includes all
> > objects allocated by calling :c:func:`kmalloc` and :c:func:`alloc_page`,
> > but you cannot store pointers to arbitrary offsets within an object.
> > The XArray does not support storing :c:func:`IS_ERR` pointers; some
> > conflict with data values and others conflict with entries the XArray
> > uses for its own purposes. If you need to store special values which
> > cannot be confused with real kernel pointers, the values 4, 8, ... 4092
> > are available.
>
> Thought - if storing error values into the XArray in addition to regular
> pointers is important for some use case, it would be easy to make
> "ERR_PTR_XA()", "PTR_ERR_XA()", and "IS_ERR_XA()" macros that just shift
> the error values up and down by two bits to avoid the conflict. That
> would still allow error values up (down) to -1023 to be stored without
> any chance of a pointer conflict, which should be enough.

We could do something along those lines. It's not something anybody's
trying to do with the radix tree today, but I wanted to document it
just in case anybody got a clever idea in the future. Another source
of ambiguity that I didn't mention is that xa_store() can return
ERR_PTR(-ENOMEM), so if we encode the xarray-error-pointers in some
way, they have to be distinguishable from errors returned by the xa_*
functions. There's no ambiguity when using the xas_ functions as they
signal errors in the xa_state.

> > Each non-NULL entry in the array has three bits associated with it called
> > tags. Each tag may be flipped on or off independently of the others.
> > You can search for entries with a given tag set.
>
> How can it be 3 tag bits, if the pointers only need to be 4-byte aligned?

The same way the radix tree does it -- there are three bitfields in each
node. So all your PAGECACHE_DIRTY tags are stored in the same bitfield.

(I'm taking these two questions as request for more information on the
implementation, which I don't want to put in this document. I want
this to be the users manual. I think the appropriate place to put this
information is in source code comments.)

> > Finally, you can remove all entries from an XArray by calling
> > :c:func:`xa_destroy`. If the XArray entries are pointers, you may wish
> > to free the entries first. You can do this by iterating over all non-NULL
> > entries in
>
> ... the XArray using xa_for_each() ?

I swear I wrote that ... editing error. New paragraph:

Finally, you can remove all entries from an XArray by calling
:c:func:`xa_destroy`. If the XArray entries are pointers, you may wish
to free the entries first. You can do this by iterating over all non-NULL
entries in the XArray using the :c:func:`xa_for_each` iterator.

> > The only error currently generated by the xarray code itself is
> > ENOMEM, but it supports arbitrary errors in case you want to call
> > :c:func:`xas_set_err` yourself. If the xa_state is holding an ENOMEM
> > error, :c:func:`xas_nomem` will attempt to allocate a single xa_node using
>
> .. calling :c:func:`xas_nomem` ... ?
>
> > the specified gfp flags and store it in the xa_state for the next attempt.
> > The idea is that you take the xa_lock, attempt the operation and drop
> > the lock. Then you allocate memory if there was a memory allocation
>
> ... then you try to allocate ...

Fair point -- if xas_nomem() fails to allocate memory, it'll return 'false'
to indicate not to bother retrying the operation.

> > failure and retry the operation. You must call :c:func:`xas_destroy`
> > if you call :c:func:`xas_nomem` in case it's not necessary to use the
> > memory that was allocated.
>
> This last sentence is not totally clear. How about:
>
> If you called :c:func:`xas_nomem` to allocate memory, but didn't need
> to use the memory for some reason, you need to call :c:func:`xas_destroy`
> to free the allocated memory.
>
>
> The question is where the "allocated memory" is stored, if it isn't in
> the XArray? Is it in the XA_STATE?

That was earlier in the document:

:c:func:`xas_set_err` yourself. If the xa_state is holding an ENOMEM
error, :c:func:`xas_nomem` will attempt to allocate a single xa_node using
the specified gfp flags and store it in the xa_state for the next attempt.

> How do you know if the allocated
> memory was needed, or is it always safe to call xas_destroy?

It's always safe to call xas_destroy.

> Is the
> allocated memory always consumed if xa_store/xa_cmpxchg are called?

Usually. If another process added the same node that we were trying to
store to, we won't allocate memory.

> What if there was another process that also added the same entry while
> the xa_lock was dropped?

We'll overwrite it.

I've rewritten this whole section. Thanks for the feedback!

The xa_state is also used to store errors. You can call
:c:func:`xas_error` to retrieve the error. All operations check whether
the xa_state is in an error state before proceeding, so there's no need
for you to check for an error after each call; you can make multiple
calls in succession and only check at a convenient point. The only error
currently generated by the xarray code itself is ENOMEM, but it supports
arbitrary errors in case you want to call :c:func:`xas_set_err` yourself.

If the xa_state is holding an ENOMEM error, calling :c:func:`xas_nomem`
will attempt to allocate more memory using the specified gfp flags and
cache it in the xa_state for the next attempt. The idea is that you take
the xa_lock, attempt the operation and drop the lock. The operation
attempts to allocate memory while holding the lock, but it is more
likely to fail. Once you have dropped the lock, :c:func:`xas_nomem`
can try harder to allocate more memory. It will return ``true`` if it
is worth retrying the operation (ie that there was a memory error *and*
more memory was allocated. You must call :c:func:`xas_destroy` if you
call :c:func:`xas_nomem` in case memory was allocated and then turned
out not to be needed.

> > When using the advanced API, it's possible to see internal entries
> > in the XArray. You should never see an :c:func:`xa_is_node` entry,
> > but you may see other internal entries, including sibling entries,
> > skip entries and retry entries. The :c:func:`xas_retry` function is a
> > useful helper function for handling internal entries, particularly in
> > the middle of iterations.
>
> How do you know if a returned value is an "internal entry"? Is there
> some "xas_is_internal()" macro/function that tells you this?

I know I chose the right name for that function, since you guessed it :-)

I've rewritten this whole section. I'll post the updated document in a bit.