Re: [PATCH 0/2] "big hammer" for DAX msync/fsync correctness

From: Dan Williams
Date: Fri Nov 06 2015 - 11:04:11 EST


On Fri, Nov 6, 2015 at 12:06 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Thu, 5 Nov 2015, Dan Williams wrote:
>> On Wed, Oct 28, 2015 at 3:51 PM, Ross Zwisler
>> <ross.zwisler@xxxxxxxxxxxxxxx> wrote:
>> > On Wed, Oct 28, 2015 at 06:24:29PM -0400, Jeff Moyer wrote:
>> >> Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx> writes:
>> >>
>> >> > This series implements the very slow but correct handling for
>> >> > blkdev_issue_flush() with DAX mappings, as discussed here:
>> >> >
>> >> > https://lkml.org/lkml/2015/10/26/116
>> >> >
>> >> > I don't think that we can actually do the
>> >> >
>> >> > on_each_cpu(sync_cache, ...);
>> >> >
>> >> > ...where sync_cache is something like:
>> >> >
>> >> > cache_disable();
>> >> > wbinvd();
>> >> > pcommit();
>> >> > cache_enable();
>> >> >
>> >> > solution as proposed by Dan because WBINVD + PCOMMIT doesn't guarantee that
>> >> > your writes actually make it durably onto the DIMMs. I believe you really do
>> >> > need to loop through the cache lines, flush them with CLWB, then fence and
>> >> > PCOMMIT.
>> >>
>> >> *blink*
>> >> *blink*
>> >>
>> >> So much for not violating the principle of least surprise. I suppose
>> >> you've asked the hardware folks, and they've sent you down this path?
>> >
>> > Sadly, yes, this was the guidance from the hardware folks.
>>
>> So it turns out we weren't asking the right question. wbinvd may
>> indeed be viable... we're still working through the caveats.
>
> Just for the record. Such a flush mechanism with
>
> on_each_cpu()
> wbinvd()
> ...
>
> will make that stuff completely unusable on Real-Time systems. We've
> been there with the big hammer approach of the intel graphics
> driver.

Noted. This means RT systems either need to disable DAX or avoid
fsync. Yes, this is a wart, but not an unexpected one in a
first-generation persistent memory platform.
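
For reference, the per-line alternative Ross describes in the quoted
text (walk the cache lines, write each one back, then fence) can be
sketched roughly as below. This is only an illustration, not code from
the series: the 64-byte line size is assumed, flush_range() and the
hook types are hypothetical names, and the hooks stand in for CLWB and
the fence. The final durability step (PCOMMIT in the quoted text) is
left out. Counting stubs are included so the walk can be exercised
without persistent memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache-line size; real code should query the CPU instead. */
#define CACHE_LINE 64u

/* Hypothetical hooks: on x86 these would wrap CLWB and a store fence. */
typedef void (*line_op_fn)(const void *line, void *ctx);
typedef void (*fence_fn)(void *ctx);

/*
 * Write back every cache line overlapping [addr, addr + len), then
 * fence once. Returns the number of lines visited. A zero-length
 * range visits nothing and issues no fence.
 */
static size_t flush_range(const void *addr, size_t len,
                          line_op_fn writeback, fence_fn fence, void *ctx)
{
    if (len == 0)
        return 0;

    /* Round the start down to a line boundary; end is exclusive. */
    uintptr_t start = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines = 0;

    for (uintptr_t p = start; p < end; p += CACHE_LINE) {
        writeback((const void *)p, ctx);
        lines++;
    }

    /* One fence after the loop orders all of the write-backs above. */
    fence(ctx);
    return lines;
}

/* Counting stubs: they never dereference the line pointer. */
struct flush_stats {
    size_t writebacks;
    size_t fences;
};

static void count_writeback(const void *line, void *ctx)
{
    (void)line;
    ((struct flush_stats *)ctx)->writebacks++;
}

static void count_fence(void *ctx)
{
    ((struct flush_stats *)ctx)->fences++;
}
```

The cost of this walk scales with the size of the dirty range rather
than with the number of CPUs, and it needs no cross-CPU IPI, which is
the trade-off against the on_each_cpu() + wbinvd() approach that
Thomas is objecting to.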