Re: [RFC PATCH] mmc: core: Disable REQ_FUA if the eMMC supports an internal cache
From: Adrian Hunter
Date: Fri Mar 03 2023 - 09:41:39 EST
On 3/03/23 14:01, Ulf Hansson wrote:
> On Fri, 3 Mar 2023 at 12:40, Christian Löhle <CLoehle@xxxxxxxxxxxxxx> wrote:
>>
>>
>>>>
>>>> REQ_FUA is in general supported for eMMC cards, which translates into so called "reliable writes". To support these write operations, the CMD23 (MMC_CAP_CMD23), needs to be supported by the mmc host too, which is common but not always the case.
>>>>
>>>> For some eMMC devices, it has been reported that reliable writes are quite costly, leading to performance degradations.
>>>>
>>>> In a way to improve the situation, let's avoid announcing REQ_FUA support if the eMMC supports an internal cache, as that allows us to rely solely on flush-requests (REQ_OP_FLUSH) instead, which seems to be a lot cheaper.
>>>> Note that, those mmc hosts that lacks CMD23 support are already using this type of configuration, whatever that could mean.
>>>
>>> Just note that reliable write is strictly weaker than turning cache off/flushing, if card loses power during cache off/flush programming / busy, sector-wise atomicity is not mandated by the spec.
>>> (And that is assuming cache off/flush is actually respected by the card as intended by the spec, should some cards be checked?) Maybe some FS people can also chime in?
>>
>> Nevermind, the sector-wise atomicity should not matter on 5.1 cards or if the block length isn't being played with, which it isn't in our case.
>> If reliable write is implemented only according to spec, I don't see why the cache flushing should be less expensive, which would only make sense if
>> a) < sector chunks are committed to flash
>> b) reliable write is implemented much stricter than the spec, ensuring atomicity for the entire write.
>
> Right, I agree!
>
> Note 1) Reliable write was introduced way before cache management in
> the eMMC spec. So, if the support for reliable write would have a
> stricter implementation than needed, I would not be surprised.
I am not sure when you say stricter than needed. Historically
file systems assumed that sectors are updated atomically i.e.
there is never a sector with a mixture of old and new data.
The eMMC spec does not guarantee that, except for reliable
write.
File systems may use REQ_FUA for important information, like the
superblock or a journal commit record, so using reliable write
for REQ_FUA would seem to give better protection against file system
corruption than a cache flush which could leave a sector
half-written.
On the other hand, sudden power loss is probably rare in battery
powered systems because they are designed to monitor the battery
power and shutdown when it gets too low.
And file systems can use checksums to detect half-written updates.
And there is anyway no protection for other (non REQ_FUA) writes a
file system might do and expect not to tear sectors.
And you are more likely to smash the screen than bounce the battery
out and cause an unrecoverable file system error.
Nevertheless, the commit message of this patch reads like the change
is an optimization, whereas it seems more like a policy change.
The commit message should perhaps say something like:
"The consensus is that the benefit of improved performance by not
using reliable-write for REQ_FUA is much greater than any potential
benefit that reliable-write might provide to avoid file system
corruption in the event of sudden power loss."
As for allowing for the policy to be overridden, perhaps an mmc_core
module option?
>
> Note 2) In the eMMC v5.1 spec, the cache flushing support has been
> extended to allow an explicit barrier operation. Perhaps, we should
> let that option take precedence over a regular flush+barrier, for
> REQ_OP_FLUSH!?
>
>>
>> I guess the cards which increase performance do b)? Or something else?
>
> That's the tricky part to know, as it's the internals of the eMMC.
It is the natural conclusion though. The eMMC probably does not
update mapping information with every write, instead if power is
lost, it scans the updated areas at the next initialization. (The
power-off notify feature would commit the mapping information to
media to avoid that). So a reliable write might have to:
1. write information to record that the old mapping
should be used, not what might be discovered by scanning
2. do the actual write
3. write mapping information to record the new mapping
>
> Although, it seems like both Avri (WDC) and Bean (Micron) would be
> happy to proceed with $subject patch, which makes me more comfortable
> to move forward.
>
>> Anyway regarding FUA i don't have any concerns regarding reliability with cache flush.
>> I can add some performance comparisons with some eMMCs I have around though.
>
> That would be great, thanks a lot for helping out with testing!
>
> Kind regards
> Uffe