Re: [RFC][PATCH] x86/mm: Sync all vmalloc mappings before text_poke()

From: Mathieu Desnoyers
Date: Thu Apr 30 2020 - 12:35:37 EST


----- On Apr 30, 2020, at 12:30 PM, rostedt <rostedt@xxxxxxxxxxx> wrote:

> On Thu, 30 Apr 2020 12:18:22 -0400 (EDT)
> Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>
>> ----- On Apr 30, 2020, at 12:16 PM, rostedt <rostedt@xxxxxxxxxxx> wrote:
>>
>> > On Thu, 30 Apr 2020 11:20:15 -0400 (EDT)
>> > Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>> >
>> >> > The right fix is to call vmalloc_sync_mappings() right after allocating
>> >> > tracing or perf buffers via v[zm]alloc().
>> >>
>> >> Either right after allocation, or right before making the vmalloc'd data
>> >> structure visible to the instrumentation. In the case of the pid filter,
>> >> that would be the rcu_assign_pointer() which publishes the new pid filter
>> >> table.
>> >>
>> >> As long as vmalloc_sync_mappings() is performed somewhere *between* allocation
>> >> and publishing the pointer for instrumentation, it's fine.
>> >>
>> >> I'll let Steven decide on which approach works best for him.
>> >
>> > As stated in the other email, I don't see it having anything to do with
>> > vmalloc, but with the per_cpu() allocation. I'll test this theory out by
>> > not even allocating the pid masks and touching the per cpu data at every
>> > event to see if it crashes.
>>
>> As pointed out in my other email, per-cpu allocation uses vmalloc when
>> size > PAGE_SIZE.
>
> And as I replied:
>
> buf->data = alloc_percpu(struct trace_array_cpu);
>
> struct trace_array_cpu {
> 	atomic_t	disabled;
> 	void		*buffer_page;	/* ring buffer spare */
>
> 	unsigned long	entries;
> 	unsigned long	saved_latency;
> 	unsigned long	critical_start;
> 	unsigned long	critical_end;
> 	unsigned long	critical_sequence;
> 	unsigned long	nice;
> 	unsigned long	policy;
> 	unsigned long	rt_priority;
> 	unsigned long	skipped_entries;
> 	u64		preempt_timestamp;
> 	pid_t		pid;
> 	kuid_t		uid;
> 	char		comm[TASK_COMM_LEN];
>
> 	bool		ignore_pid;
> #ifdef CONFIG_FUNCTION_TRACER
> 	bool		ftrace_ignore_pid;
> #endif
> };
>
> That doesn't look bigger than PAGE_SIZE to me.

Let me point you to this call chain:

pcpu_alloc()
  -> pcpu_create_chunk()
    -> pcpu_mem_zalloc()

pcpu_mem_zalloc() uses vmalloc for any allocation larger than a page,
and a percpu chunk spans many pages. Because the percpu allocator
batches small allocations into those chunks, a per-cpu allocation can
end up backed by vmalloc'd memory rather than kmalloc'd memory, even
though the requested size is smaller than 4 kB.
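To make the chain above concrete, here is a paraphrased sketch of the
relevant mm/percpu.c behavior (not verbatim kernel code; names match
the functions cited above, but the bodies are simplified):

```c
/*
 * Paraphrased sketch of the percpu allocator's backing-memory choice.
 * Not verbatim kernel code.
 */

/*
 * pcpu_mem_zalloc(): small requests come from the slab allocator;
 * anything larger than a page comes from vmalloc space.
 */
static void *pcpu_mem_zalloc(size_t size)
{
	if (size <= PAGE_SIZE)
		return kzalloc(size, GFP_KERNEL);

	/* vmalloc'd memory: mappings are faulted in / synced lazily */
	return vzalloc(size);
}

/*
 * pcpu_alloc(): when no existing chunk has room, pcpu_create_chunk()
 * builds a whole new chunk.  A chunk spans many pages, so its backing
 * allocations take the vmalloc branch above -- even if the caller only
 * asked for a sizeof(struct trace_array_cpu)-sized slice of it.
 */
```

So the small size of struct trace_array_cpu tells us nothing about the
backing memory: alloc_percpu() can still hand back a slice of a
vmalloc'd chunk, which is exactly the case where the missing
vmalloc_sync_mappings() matters.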

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com