I liked the patch you were pushing to request the *page* containing the requested bytes instead of the *block* containing the requested bytes.

For the misaligned partition problem, I was thinking we should change the direct_access API to return a phys_addr_t instead of a pfn. That way we can return something that isn't actually page aligned, and DAX can take care of making sure it doesn't overshoot the end.
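
To make that concrete, here is a rough userspace sketch of the shape I have in mind. It is not the real direct_access signature: phys_addr_sketch_t, struct pmem_range_sketch and direct_access_sketch() are all made-up names for illustration. The callee hands back a byte-granular physical address plus the number of bytes usable from there, so the caller can clamp instead of assuming whole pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's phys_addr_t. */
typedef uint64_t phys_addr_sketch_t;

/* Hypothetical description of a pmem range; not a real kernel struct. */
struct pmem_range_sketch {
	phys_addr_sketch_t base;	/* physical start, need not be page aligned */
	uint64_t size;			/* length of the range in bytes */
};

/*
 * Translate a 512-byte sector into a physical address.
 * Returns the number of bytes usable starting at *addr (at most 'want'),
 * or -1 if the sector lies outside the range.
 */
static int64_t direct_access_sketch(const struct pmem_range_sketch *r,
				    uint64_t sector, uint64_t want,
				    phys_addr_sketch_t *addr)
{
	uint64_t offset, avail;

	assert(r != NULL && addr != NULL);

	/* Guard against overflow in the sector-to-byte conversion. */
	if (sector > UINT64_MAX / 512)
		return -1;

	offset = sector * 512;
	if (offset >= r->size)
		return -1;

	*addr = r->base + offset;

	/* Clamp so the caller never runs past the end of the range. */
	avail = r->size - offset;
	return (int64_t)(want < avail ? want : avail);
}
```

With a base of 0x100000200 (not page aligned) and a size of 8192, asking for 4096 bytes at sector 15 would return 512 rather than a full page.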

Linda Knippers <linda.knippers@xxxxxx> writes:

>>> It causes the physical block size to be PAGE_SIZE but the
>>> logical block size is still 512. However, the minimum_io_size
>>> is now 4096 (same as physical block size, I assume). The
>>> optimal_io_size is still 0. What does that mean?
>> physical block size - device's internal block size
>> logical block size - addressable unit
> Right, but it's still reported as 512 and that doesn't work.

Understood. :)

>> optimal io size - device's preferred unit for streaming
> So 0 is ok.


>> We can change the block device to export logical/physical block sizes of
>> PAGE_SIZE. However, when persistent memory support comes to platforms
>> that support page sizes > 32k, xfs will again run into problems (Dave
>> Chinner mentioned that xfs can't deal with logical block sizes >32k.)
>> Arguably, you can use pmem and dax on such platforms using RAM today for
>> testing. Do we care about breaking that?
> I would think so. AARCH64 uses 64k pages today.

So does powerpc, but I guess nobody cares about that anymore. ;-) If
the logical block size is smaller than the page size, we're going to
have to deal with sub-page I/O. For now, we can do as Boaz suggested,
and just turn off DAX for those configurations. We could also just
revert the patch that introduced this problem. I really don't know who
is going to care about O_DIRECT I/O performance to a persistent memory
block device.

Willy? What was the real motivation there?

> I think Documentation/filesystems/dax.txt could use a little update
> too. It has a section "Implementation Tips for Block Driver Writers"
> that makes it sound easy but now I wonder if it even works with the
> example ram drivers. Should we be able to read any 512 byte
> "sector"?

If the logical block size is 512 bytes, then you have to be able to do
(direct) I/O to any 512 byte sector. Simple as that.
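
For illustration only, the arithmetic a page-granular driver ends up doing for a 512-byte sector looks roughly like the sketch below. The names (struct page_pos_sketch, sector_to_page_sketch) are made up and this is plain userspace C, not driver code. The point is that with a 512-byte logical block size most sectors land at a nonzero offset inside a page, so the driver has to handle sub-page offsets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: which page, and where inside it. */
struct page_pos_sketch {
	uint64_t page_index;		/* index of the page holding the sector */
	uint32_t offset_in_page;	/* byte offset of the sector in that page */
};

/*
 * Map a 512-byte sector number to the page containing it and the byte
 * offset within that page. page_size must be a power of two >= 512.
 */
static struct page_pos_sketch sector_to_page_sketch(uint64_t sector,
						    uint32_t page_size)
{
	struct page_pos_sketch pos;
	uint64_t sectors_per_page;

	assert(page_size >= 512 && (page_size & (page_size - 1)) == 0);

	sectors_per_page = page_size >> 9;	/* 512 == 1 << 9 */
	pos.page_index = sector / sectors_per_page;
	pos.offset_in_page = (uint32_t)((sector % sectors_per_page) << 9);
	return pos;
}
```

With 4k pages, sector 9 is page 1 at offset 512; with 64k pages it is page 0 at offset 4608.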

