Re: [Report] annoyed dma debug warning "cacheline tracking EEXIST, overlapping mappings aren't supported"

From: Ming Lei
Date: Mon Oct 14 2024 - 21:59:33 EST


On Mon, Oct 14, 2024 at 07:09:08PM +0100, Robin Murphy wrote:
> On 14/10/2024 8:58 am, Ming Lei wrote:
> > On Mon, Oct 14, 2024 at 09:41:51AM +0200, Christoph Hellwig wrote:
> > > On Mon, Oct 14, 2024 at 09:23:14AM +0200, Hannes Reinecke wrote:
> > > > > 3) some storage utilities
> > > > > - dm thin provisioning utility of thin_check
> > > > > - `dt`(https://github.com/RobinTMiller/dt)
> > > > >
> > > > > It looks like the same user buffer is used in more than one dio.
> > > > >
> > > > > 4) some self cooked test code which does same thing with 1)
> > > > >
> > > > > In storage stack, the buffer provider is far away from the actual DMA
> > > > > controller operating code, which doesn't have the knowledge if
> > > > > DMA_ATTR_SKIP_CPU_SYNC should be set.
> > > > >
> > > > > Any suggestions for avoiding this noise?
> > > > >
> > > > Can you check if this is the NULL page? Operations like 'discard' will
> > > > create bios with several bvecs all pointing to the same NULL page.
> > > > That would be the most obvious culprit.
> > >
> > > The only case I fully understand without looking into the details
> > > is raid1, and that will obviously map the same data multiple times
> >
> > The other cases should be concurrent DIOs on same userspace buffer.
>
> active_cacheline_insert() does already bail out for DMA_TO_DEVICE, so it
> returning -EEXIST to tickle the warning would seem to genuinely imply these
> are DMA mappings requesting to *write* the same cacheline concurrently,
> which is indeed broken in general.

The two io_uring tests issue READs, and the dm thin_check case is READ too.

For the raid1 case, the warning comes from raid1_sync_request(), which may
issue both READ and WRITE IO.
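
To make the direction point concrete, here is a rough userspace sketch of the
check being discussed. It is a toy model only, not the code in
kernel/dma/debug.c: every toy_* name is made up for illustration, and a flat
array stands in for whatever structure the kernel really uses to track active
cachelines. The one behaviour taken from this thread is that the insert bails
out for DMA_TO_DEVICE and otherwise reports -EEXIST for a cacheline that is
already tracked. A block READ makes the device write into memory, so it is
mapped DMA_FROM_DEVICE and does not take the early return.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's DMA directions (hypothetical names). */
enum toy_dma_dir {
	TOY_DMA_BIDIRECTIONAL,
	TOY_DMA_TO_DEVICE,	/* device only reads memory: block WRITE */
	TOY_DMA_FROM_DEVICE,	/* device writes memory: block READ */
};

#define TOY_MAX_ACTIVE 64

/* Cacheline numbers currently covered by an active mapping. */
unsigned long toy_active[TOY_MAX_ACTIVE];
size_t toy_nr_active;

/*
 * Track one cacheline for a new mapping.
 * Returns 0 on success, -EEXIST if the cacheline is already tracked,
 * -ENOMEM if the toy table is full.
 */
int toy_cacheline_insert(unsigned long cln, enum toy_dma_dir dir)
{
	size_t i;

	/* The device is not writing memory, so overlap is not tracked. */
	if (dir == TOY_DMA_TO_DEVICE)
		return 0;

	for (i = 0; i < toy_nr_active; i++)
		if (toy_active[i] == cln)
			return -EEXIST;	/* overlapping device write */

	if (toy_nr_active == TOY_MAX_ACTIVE)
		return -ENOMEM;

	toy_active[toy_nr_active++] = cln;
	return 0;
}
```

In this model, two concurrent READs into the same user buffer insert the same
cacheline twice with TOY_DMA_FROM_DEVICE, and the second insert returns
-EEXIST, while any number of WRITEs from that buffer pass through untracked.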

Thanks,
Ming