Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

From: Daniel Almeida
Date: Mon Dec 16 2024 - 06:01:27 EST




> On 16 Dec 2024, at 07:23, Abdiel Janulgue <abdiel.janulgue@xxxxxxxxx> wrote:
>
>
>
> On 13/12/2024 21:08, Daniel Almeida wrote:
>> Hi Robin,
>>> On 13 Dec 2024, at 12:28, Robin Murphy <robin.murphy@xxxxxxx> wrote:
>>>
>>> On 13/12/2024 2:47 pm, Daniel Almeida wrote:
>>> [...]
>>>>>> +    /// Returns the CPU-addressable region as a slice.
>>>>>> +    pub fn cpu_buf(&self) -> &[T]
>>>>>> +    {
>>>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>>>> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
>>>>>
>>>>> Immutable slices require that the data does not change while the
>>>>> reference is live. Is that the case? If so, your safety comment should
>>>>> explain that.
>>>>>
>>>>>> +    }
>>>>>> +
>>>>>> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
>>>>>> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
>>>>>> +    {
>>>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>>>> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
>>>>>
>>>>> Mutable slices require that the data is not written to *or read* by
>>>>> anybody else while the reference is live. Is that the case? If so,
>>>>> your safety comment should explain that.
>>>>>
>>>> The buffer will probably be shared between the CPU and some hardware device, since this is the
>>>> point of the dma mapping API.
>>>> It’s up to the caller to ensure that no hardware operations that involve the buffer are currently taking
>>>> place while the slices above are alive.
>>>
>>> Hmm, that sounds troublesome... the nature of coherent allocations is that both CPU and device may access them at any time, and you can definitely expect ringbuffer-style usage models where a CPU is writing to part of the buffer while the device is reading/writing another part, but also cases where a CPU needs to poll for a device write to a particular location.
>>>
>> Ok, I had based my answer on some other drivers I’ve worked on in the past where the approach I cited would work.
>> I can see it not working for what you described, though.
>> This is a bit unfortunate, because it means we are back to square one, i.e.: back to read() and write() functions and
>> to the bound on `Copy`. That’s because, no matter how you try to dress this, there is no way to give safe and direct access
>> to the underlying memory if you can’t avoid situations where both the CPU and the device will be accessing the memory
>> at the same time.
>
> This is unfortunate indeed. Thanks, Alice, for pointing out the limitations of slices.
>
> Btw, do we have any other concerns about going back to plain old raw pointers instead? i.e.,
>
> pub fn read(&self, index: usize) -> Result<T> {
>     if index >= self.count {
>         return Err(EINVAL);
>     }
>
>     let ptr = self.cpu_addr.wrapping_add(index);
>     // SAFETY: We just checked that the index is within bounds.
>     Ok(unsafe { ptr.read() })
> }
>
> pub fn write(&self, index: usize, value: &T) -> Result
> where
>     T: Copy,
> {
>     if index >= self.count {
>         return Err(EINVAL);
>     }
>
>     let ptr = self.cpu_addr.wrapping_add(index);
>     // SAFETY: We just checked that the index is within bounds.
>     unsafe { ptr.write(*value) };
>     Ok(())
> }
>
>> I guess the only improvement that could be made over the approach used for v2 is to at least use copy_nonoverlapping
>> instead,
>
> You mean introduce something like read_raw(dst: *mut u8,...) and write_raw(&self, src: *const u8,...)?


What I mean is that you don’t have to read (and bounds-check) a single element per function call.

Using copy_nonoverlapping lets you read or write multiple values with a single call and a single bounds check. It’s essentially the equivalent of a memcpy.
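
Something along these lines, untested and just to sketch the shape (the
`read_raw`/`write_raw` names, the `offset` parameter and the overflow
handling are mine, not from the patch):

pub fn read_raw(&self, dst: &mut [T], offset: usize) -> Result
where
    T: Copy,
{
    // One bounds check for the whole copy, guarding against overflow.
    let end = offset.checked_add(dst.len()).ok_or(EINVAL)?;
    if end > self.count {
        return Err(EINVAL);
    }

    let src = self.cpu_addr.wrapping_add(offset);
    // SAFETY: `src` is valid for reads of `dst.len()` elements per the
    // check above and the type invariant on `CoherentAllocation`, and the
    // caller's `dst` slice cannot overlap the DMA buffer.
    unsafe { core::ptr::copy_nonoverlapping(src, dst.as_mut_ptr(), dst.len()) };
    Ok(())
}

pub fn write_raw(&self, src: &[T], offset: usize) -> Result
where
    T: Copy,
{
    let end = offset.checked_add(src.len()).ok_or(EINVAL)?;
    if end > self.count {
        return Err(EINVAL);
    }

    let dst = self.cpu_addr.wrapping_add(offset);
    // SAFETY: `dst` is valid for writes of `src.len()` elements per the
    // check above and the type invariant on `CoherentAllocation`.
    unsafe { core::ptr::copy_nonoverlapping(src.as_ptr(), dst, src.len()) };
    Ok(())
}

The `T: Copy` bound is still needed, for the same reason as in the
single-element version. And of course this still has the caveat Robin
raised: nothing stops the device from touching the range mid-copy, which
is why these stay as explicit copies rather than handing out references.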


>
> Regards,
> Abdiel