On Thu, Jan 16, 2025 at 04:13:13PM +0100, Christian König wrote:
On 15.01.25 at 18:09, Jason Gunthorpe wrote:
On Wed, Jan 15, 2025 at 05:34:23PM +0100, Christian König wrote:
Granted, let me try to improve this.
Here is a real world example of one of the issues we ran into and why
CPU mappings of importers are redirected to the exporter.
We have a good bunch of different exporters who track the CPU mappings
of their backing store using address_space objects in one way or
another and then use unmap_mapping_range() to invalidate those CPU
mappings.
But when importers get the PFNs of the backing store they can look
behind the curtain and directly insert this PFN into the CPU page
tables.
We had literally tons of cases like this where driver developers caused
use-after-free issues because the importer created CPU mappings on
its own without the exporter knowing about it.
This is just one example of what we ran into. In addition to that,
basically the whole synchronization between drivers was overhauled as
well because we found that we can't trust importers to always do the
right thing.
But this, fundamentally, is importers creating attachments and then
*ignoring the lifetime rules of DMABUF*. If you created an attachment,
got a move and *ignored the move* because you put the PFN in your own
VMA, then you are not following the attachment lifetime rules!
Move notify is solely for informing the importer that they need to
refresh their DMA mappings and eventually block for ongoing DMA to end.
These semantics don't work well for CPU mappings because you need to hold
the reservation lock to make sure that the information stays valid and you
can't hold a lock while returning from a page fault.

Dealing with CPU mapping and resource invalidation is a little hard, but is
resolvable, by using other types of locks. And I guess for now dma-buf
exporters should always handle this CPU mapping vs. invalidation contention if
they support mmap().
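
As an illustration only, one way "other types of locks" could look is a
sequence count protected by a private lock that is not the reservation lock:
the fault path samples the location, and installs the mapping only if no move
happened in between. The following is a userspace toy model, not kernel code.
Every name in it (toy_exporter, toy_fault_begin, toy_fault_commit, toy_move)
is made up for this sketch and is not part of dma-buf or any kernel API.

```c
#include <pthread.h>
#include <stdbool.h>

/* Toy userspace model; none of these names are kernel or dma-buf API. */
struct toy_exporter {
	pthread_mutex_t map_lock; /* private lock, deliberately not the reservation lock */
	unsigned long seq;        /* bumped on every move of the backing store */
	unsigned long pfn;        /* where the backing store lives right now */
	bool pte_valid;           /* models "a CPU PTE exists" */
	unsigned long pte_pfn;    /* PFN that the modelled PTE points at */
};

/* Snapshot taken at the start of a fault. */
struct toy_fault {
	unsigned long seq;
	unsigned long pfn;
};

#define TOY_EXPORTER_INIT(start_pfn) \
	{ .map_lock = PTHREAD_MUTEX_INITIALIZER, .pfn = (start_pfn) }

/* Fault path, step 1: sample the location and the sequence count. */
struct toy_fault toy_fault_begin(struct toy_exporter *e)
{
	struct toy_fault f;

	pthread_mutex_lock(&e->map_lock);
	f.seq = e->seq;
	f.pfn = e->pfn;
	pthread_mutex_unlock(&e->map_lock);
	return f;
}

/*
 * Fault path, step 2: install the PTE only if no move happened since
 * step 1. Returns false when the caller has to retry the fault.
 */
bool toy_fault_commit(struct toy_exporter *e, struct toy_fault f)
{
	bool ok;

	pthread_mutex_lock(&e->map_lock);
	ok = (f.seq == e->seq);
	if (ok) {
		e->pte_valid = true;
		e->pte_pfn = f.pfn;
	}
	pthread_mutex_unlock(&e->map_lock);
	return ok;
}

/*
 * Move path: zap the CPU mapping (the role unmap_mapping_range() plays
 * for the exporters described above), then publish the new location.
 */
void toy_move(struct toy_exporter *e, unsigned long new_pfn)
{
	pthread_mutex_lock(&e->map_lock);
	e->seq++;
	e->pte_valid = false;
	e->pfn = new_pfn;
	pthread_mutex_unlock(&e->map_lock);
}
```

In this model the lock is never held across the return from the fault; a
fault that raced with a move simply fails the commit and is retried, so a
stale PFN never ends up in the modelled page table.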
Since it is resolvable, with some invalidation notification a decent importer
could also handle the contention well.
IIUC the only concern now is that importer device drivers are more likely to
do something wrong, so CPU mapping handling is moved to the exporter. But most
exporters are also device drivers, so why would they be any smarter?
And the mapping needs keep growing: today exporters help handle the CPU primary
mapping; tomorrow, should they also help with all other mappings? Clearly that
is not feasible. So maybe conditionally give trust to some importers.
Thanks,
Yilun