Re: [PATCH RFC v2 14/18] dax/region: Support DAX device creation on dynamic DAX regions

From: Ira Weiny
Date: Tue Sep 12 2023 - 18:08:20 EST


Jonathan Cameron wrote:
> On Tue, 5 Sep 2023 21:35:03 -0700
> Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
>
> > Jonathan Cameron wrote:
> > > On Mon, 28 Aug 2023 22:21:05 -0700
> > > Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
> > >
> > > > Dynamic Capacity (DC) DAX regions have a list of extents which define
> > > > the memory of the region which is available.
> > > >
> > > > Now that DAX region extents are fully realized, support DAX device
> > > > creation on dynamic regions by adjusting the allocation algorithms
> > > > to account for the extents. Remember also that references must be held on
> > > > the extents until the DAX devices are done with the memory.
> > > >
> > > > Redefine the region available size to include only extent space. Reuse
> > > > the size allocation algorithm by defining sub-resources for each extent
> > > > and limiting range allocation to those extents which have space. Do not
> > > > support direct mapping of DAX devices on dynamic devices.
> > > >
> > > > Enhance DAX device range objects to hold references on the extents until
> > > > the DAX device is destroyed.
> > > >
> > > > NOTE: At this time all extents within a region are created equally.
> > > > However, labels are associated with extents which can be used with
> > > > future DAX device labels to group which extents are used.
> > >
> > > This sounds like a bad place to start to me as we are enabling something
> > > that is probably 'wrong' in the long term as opposed to just not enabling it
> > > until we have appropriate support.
> >
> > I disagree. I don't think the kernel should be trying to process tags at
> > the lower level.
> >
> > > I'd argue better to just reject any extents with different labels for now.
> >
> > Again I disagree. This is less restrictive. The idea is that labels can
> > be changed such that user space can ultimately decide which extents
> > should be used for which devices. I have some work on that already.
> > (Basically it becomes quite easy to assign a label to a dax device and
> > have the extent search use only dax extents which match that label.)
>
> That sounds good - but if someone expects that and uses it with an old
> kernel I'm not sure if it is better to say 'we don't support it yet' or
> do something different from a newer kernel.

This does provide the 'we don't support that yet' in that dax device
creation can't be associated with a label yet. So surfacing the extents
with the tag as a default label and letting those labels change is more
informational at this point and not functional. Simple use cases can use
the label (from the tag) to detect that some extent with the wrong tag got
in the region but can't correct it without going through the FM.

That said, it is easy enough to remove the label sysfs for now and defer it
until the dax device has a label and this support.
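
As a rough illustration of the label-matched extent search mentioned above,
here is a standalone userspace sketch. It is not code from this series:
struct extent, region_avail() and alloc_from_extents() are hypothetical
names, and the first-fit policy is an assumption made only to keep the
example short. A NULL label matches every extent, which models the current
"all extents are equal" behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a region extent; not the kernel's structure. */
struct extent {
	const char *label;	/* defaults to the device tag in this sketch */
	uint64_t size;		/* total bytes in the extent */
	uint64_t used;		/* bytes already handed to DAX devices */
};

/* Free space remaining in one extent. */
static uint64_t extent_avail(const struct extent *ext)
{
	assert(ext->used <= ext->size);
	return ext->size - ext->used;
}

/*
 * Region available size counts extent space only.  A NULL label matches
 * every extent; a non-NULL label restricts the sum to matching extents.
 */
static uint64_t region_avail(const struct extent *exts, size_t n,
			     const char *label)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		if (!label || strcmp(exts[i].label, label) == 0)
			total += extent_avail(&exts[i]);
	return total;
}

/*
 * Carve up to 'want' bytes out of matching extents, first fit.
 * Returns the number of bytes actually allocated.
 */
static uint64_t alloc_from_extents(struct extent *exts, size_t n,
				   const char *label, uint64_t want)
{
	uint64_t got = 0;

	for (size_t i = 0; i < n && got < want; i++) {
		uint64_t take;

		if (label && strcmp(exts[i].label, label) != 0)
			continue;	/* wrong label: skip this extent */
		take = extent_avail(&exts[i]);
		if (take > want - got)
			take = want - got;
		exts[i].used += take;
		got += take;
	}
	return got;
}
```

With extents labelled "a" (100 bytes), "b" (50 bytes) and "a" (30 bytes, 10
used), a 110-byte request for label "a" would consume the first extent fully
and 10 bytes of the third, leaving the "b" extent untouched.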

>
>
> > > > @@ -1400,8 +1507,10 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data)
> > > > device_initialize(dev);
> > > > dev_set_name(dev, "dax%d.%d", dax_region->id, dev_dax->id);
> > > >
> > > > + dev_WARN_ONCE(parent, is_dynamic(dax_region) && data->size,
> > > > + "Dynamic DAX devices are created initially with 0 size");
> > >
> > > dev_info() may be more appropriate?
> >
> > Unless I'm mistaken this can happen from userspace but only if something
> > in the code changes later. Because the dax layer is trying to support
> > non-dynamic regions (which dynamic may be a bad name), I was worried that
> > the creation with a size might slip through...
>
> Fair enough - if there is a strong chance userspace will control it at
> some point then ONCE seems fine.
>
> >
> > > Is this common enough that we need the
> > > _ONCE?
> >
> > once is because it could end up spamming a log later if something got
> > coded up wrong.
>
> I'm not sure I care about bugs spamming the log. Only things that
> are userspace controlled or likely hardware failures etc.
>

Understood. Let me trace them again but I think these can be triggered by
user space. If not I'll remove the ONCE.
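
For anyone following along, the trade-off can be shown with a small
userspace sketch. This is not the kernel macro: warn_once(),
messages_emitted and create_dev() are hypothetical names used only for
illustration, and the flag is passed in explicitly whereas the kernel
keeps one per call site. The point is just that a once-guarded warning is
emitted a single time no matter how often the condition is hit, which
matters mainly when user space can trigger it repeatedly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts messages actually printed, so the effect is observable. */
static unsigned int messages_emitted;

/*
 * Print 'msg' the first time 'cond' is true and stay silent afterwards.
 * Returns 'cond' so callers can still branch on it.
 */
static bool warn_once(bool cond, bool *warned, const char *msg)
{
	assert(warned != NULL);
	if (cond && !*warned) {
		*warned = true;
		messages_emitted++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

/* One flag for the single call site below. */
static bool warned_nonzero_size;

/* Hypothetical creation path: dynamic devices should start at size 0. */
static void create_dev(bool dynamic, unsigned long size)
{
	warn_once(dynamic && size, &warned_nonzero_size,
		  "Dynamic DAX devices are created initially with 0 size");
}
```

Calling create_dev(true, 4096) in a loop prints the warning once; without
the flag every iteration would add a line to the log.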

Thanks again,
Ira