Re: [PATCH v2] media: uvcvideo: Do not alloc dev->status
From: Ricardo Ribalda
Date: Tue Dec 20 2022 - 18:00:27 EST
Hi Jonathan
On Thu, 15 Dec 2022 at 12:45, Jonathan Cameron
<Jonathan.Cameron@xxxxxxxxxx> wrote:
>
> On Thu, 15 Dec 2022 11:11:40 +0200
> Laurent Pinchart <laurent.pinchart@xxxxxxxxxxxxxxxx> wrote:
>
> > Hi Ricardo,
> >
> > On Thu, Dec 15, 2022 at 11:08:05AM +0200, Laurent Pinchart wrote:
> > > On Thu, Dec 15, 2022 at 08:59:14AM +0100, Ricardo Ribalda wrote:
> > > > On Thu, 15 Dec 2022 at 02:15, Sergey Senozhatsky wrote:
> > > > >
> > > > > On (22/12/14 14:37), Ricardo Ribalda wrote:
> > > > > [..]
> > > > > > +struct uvc_status_streaming {
> > > > > > + u8 button;
> > > > > > +} __packed;
> > > > > > +
> > > > > > +struct uvc_status_control {
> > > > > > + u8 bSelector;
> > > > > > + u8 bAttribute;
> > > > > > + u8 bValue[11];
> > > > > > +} __packed;
> > > > > > +
> > > > > > +struct uvc_status {
> > > > > > + u8 bStatusType;
> > > > > > + u8 bOriginator;
> > > > > > + u8 bEvent;
> > > > > > + union {
> > > > > > + struct uvc_status_control control;
> > > > > > + struct uvc_status_streaming streaming;
> > > > > > + };
> > > > > > +} __packed;
> > > > > > +
> > > > > > struct uvc_device {
> > > > > > struct usb_device *udev;
> > > > > > struct usb_interface *intf;
> > > > > > @@ -559,7 +579,7 @@ struct uvc_device {
> > > > > > /* Status Interrupt Endpoint */
> > > > > > struct usb_host_endpoint *int_ep;
> > > > > > struct urb *int_urb;
> > > > > > - u8 *status;
> > > > > > +
> > > > > > struct input_dev *input;
> > > > > > char input_phys[64];
> > > > > >
> > > > > > @@ -572,6 +592,12 @@ struct uvc_device {
> > > > > > } async_ctrl;
> > > > > >
> > > > > > struct uvc_entity *gpio_unit;
> > > > > > +
> > > > > > + /*
> > > > > > + * Ensure that status is aligned, making it safe to use with
> > > > > > + * non-coherent DMA.
> > > > > > + */
> > > > > > + struct uvc_status status __aligned(ARCH_KMALLOC_MINALIGN);
> > > > >
> > > > > ____cacheline_aligned ?
> > > > >
> > > > > I don't see anyone using ARCH_KMALLOC_MINALIGN except for slab.h
> > > >
> > > > Seems like cacheline is not good enough:
> > > >
> > > > https://github.com/torvalds/linux/commit/12c4efe3509b8018e76ea3ebda8227cb53bf5887
> > > > https://lore.kernel.org/all/20220405135758.774016-1-catalin.marinas@xxxxxxx/
> > > >
> > > > and ARCH_KMALLOC_MINALIGN is what we have today and is working...
> > > >
> > > > But yeah, the name for that define is not the nicest :)
> > > >
> > > > I added Jonathan Cameron, on cc, as he had to deal with something
> > > > similar for iio in case we are missing something
> > >
> > > I'd like to get feedback on this from DMA and USB experts. Expanding the
> > > CC list of the original patch would help (especially including the
> > > linux-usb mailing list).
> >
> > Also, do we need the allocation change ? It doesn't seem to simplify the
> > code that much, neither in terms of lines of code
> >
> > > 2 files changed, 48 insertions(+), 49 deletions(-)
> >
> > nor in terms of complexity. Maybe we could keep the union and offsetof
> > changes, and drop the allocation change ? In any case, those are two
> > different changes, so I'd split them in two patches at least.
> >
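To illustrate the union and offsetof part on its own, here is a minimal
userspace sketch. It assumes GCC or Clang; `u8` and `DEMO_PACKED` are local
stand-ins for the kernel's `u8` and `__packed`, and `demo_control_len_ok()` is
a hypothetical helper, not a function from the driver. It only shows how the
packed layout lets a length check be written with offsetof instead of a magic
number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's u8 and __packed (assumptions). */
typedef uint8_t u8;
#define DEMO_PACKED __attribute__((packed))

struct uvc_status_streaming {
	u8 button;
} DEMO_PACKED;

struct uvc_status_control {
	u8 bSelector;
	u8 bAttribute;
	u8 bValue[11];
} DEMO_PACKED;

struct uvc_status {
	u8 bStatusType;
	u8 bOriginator;
	u8 bEvent;
	union {
		struct uvc_status_control control;
		struct uvc_status_streaming streaming;
	};
} DEMO_PACKED;

/* The layout has no padding: 3 header bytes plus the 13 byte union. */
static_assert(sizeof(struct uvc_status) == 16, "unexpected padding");

/*
 * Hypothetical helper: a control event is only usable if the packet
 * reaches at least the start of bValue.
 */
int demo_control_len_ok(size_t len)
{
	return len >= offsetof(struct uvc_status, control.bValue);
}
```

With this layout, `offsetof(struct uvc_status, control.bValue)` is 5, so a
4 byte packet would be rejected and a 5 byte one accepted.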
> > > > ps: and I thought this was an easy change :P
> >
> +CC Catalin, who is driving the effort to change what we should do here to avoid
> wasting space on systems where ARCH_KMALLOC_MINALIGN is currently 128 bytes.
>
> I don't know the precise requirements for this particular allocation, but
> if it's about ensuring the data doesn't share a cacheline with anything else in
> the structure then the problem is that ____cacheline_aligned aligns to the
> size of a line in the L1 cache. It's not uncommon for microarchitectures to have
> a larger cacheline size for L3 and above. Most of the time that doesn't
> matter as they maintain correct coherence (all the ARM servers are fine
> I think - ours has 128 byte cachelines in L3, Fujitsu have parts with
> 256 byte cachelines in L3), but guess what, there are Qualcomm(?) parts where the
> L1 cacheline is 64 bytes, but the L3 cacheline is 128 bytes, and the hardware
> doesn't deal with the coherence issues. For those we need to ensure that
> a DMA safe buffer is in its own 128 byte cacheline, but ____cacheline_aligned
> on arm64 only does 64 bytes. Currently ARCH_KMALLOC_MINALIGN enforces the
> larger guarantee and is available on all architectures, unlike
> ARCH_DMA_MINALIGN, which is not yet.
>
> Catalin is working to replace this, so the required guarantees may change,
> but we still need something backportable.
>
> When I sent a bunch of fixes for Input, Dmitry asked for a general
> ___dma_minalign (naming to be bikeshedded) define. So far there are a few
> subsystems carrying their own local equivalent (IIO moved to
> IIO_DMA_MINALIGN define) in the interests of reducing the pain of
> changing this in future. A central definition is another option.
>
Thanks a lot for the explanation!
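
To check my understanding, a minimal userspace sketch of the point above. It
assumes GCC or Clang and a 128 byte requirement; `DEMO_DMA_MINALIGN`,
`struct demo_device` and `demo_share_line()` are made-up names for
illustration, with `DEMO_DMA_MINALIGN` standing in for whatever the
architecture requires (ARCH_KMALLOC_MINALIGN in the patch). It also assumes
the containing object itself starts on a 128 byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the architecture's DMA-safe alignment. */
#define DEMO_DMA_MINALIGN 128

struct demo_status {
	uint8_t bytes[16];
};

struct demo_device {
	int flags;
	void *gpio_unit;
	/* Aligned last member: starts on its own 128 byte block. */
	struct demo_status status __attribute__((aligned(DEMO_DMA_MINALIGN)));
};

/* The member alignment also pads sizeof() to a multiple of 128. */
static_assert(offsetof(struct demo_device, status) % DEMO_DMA_MINALIGN == 0,
	      "status is not aligned");
static_assert(sizeof(struct demo_device) % DEMO_DMA_MINALIGN == 0,
	      "struct is not padded to the alignment");

/*
 * Return nonzero if the byte ranges [a_off, a_off + a_len) and
 * [b_off, b_off + b_len) touch a common line of 'line' bytes.
 */
int demo_share_line(size_t a_off, size_t a_len,
		    size_t b_off, size_t b_len, size_t line)
{
	size_t a_first = a_off / line, a_last = (a_off + a_len - 1) / line;
	size_t b_first = b_off / line, b_last = (b_off + b_len - 1) / line;

	return a_first <= b_last && b_first <= a_last;
}
```

For example, a field at offset 0 and a 16 byte buffer at offset 64 do not
share a 64 byte line, but they do share a 128 byte line, which is the case
where 64 byte alignment alone is not enough.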
> Jonathan
>
>
--
Ricardo Ribalda