Re: [PATCH v5 3/3] mm/khugepaged: maintain page cache uptodate flag

From: Song Liu
Date: Wed Mar 29 2023 - 12:54:07 EST




> On Mar 24, 2023, at 6:31 AM, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Fri, Mar 24, 2023 at 06:03:37AM +0000, Song Liu wrote:
>>
>>
>>> On Mar 23, 2023, at 8:30 PM, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>>
>> [...]
>>
>>>
>>> The Uptodate flag check needs to be done by the caller; the
>>> find_get_page() family return !uptodate pages.
>>>
>>> But find_get_page() does not advertise itself as NMI-safe. And I
>>> think it's wrong to try to make it NMI-safe. Most of the kernel is
>>> not NMI-safe. I think it's incumbent on the BPF people to get the
>>> information they need ahead of taking the NMI. NMI handlers are not
>>> supposed to be doing a huge amount of work! I don't really understand
>>> why it needs to do work in NMI context; surely it can note the location of
>>> the fault and queue work to be done later (eg on irq-enable, task-switch
>>> or return-to-user)
>>
>> The use case here is a profiler (similar to perf-record). Parsing the
>> build id in side the NMI makes the profiler a lot simpler. Otherwise,
>> we will need some post processing for each sample.
>
> Simpler for you, maybe. But this is an NMI! It's not supposed to
> be doing printf-formatting or whatever, much less poking around
> in the file cache. Like perf, it should record a sample and then
> convert that later. Maybe it can defer to a tasklet, but i think
> scheduling work is a better option.
>
>> OTOH, it is totally fine if build_id_parse() fails some time, say < 5%.
>> The profiler output is still useful in such cases.
>>
>> I guess the next step is to replace find_get_page() with a NMI-safe
>> version?
>
> No, absolutely not. Stop doing so much work in an NMI.

While I understand the concern, it is not something we can easily remove,
as there are users rely on this feature. How about we discuss this at
upcoming LSFMMBPF?

Thanks,
Song