Re: [PATCH] x86/vdso: Add prctl to set per-process VDSO load

From: Richard Larocque
Date: Tue Sep 16 2014 - 21:18:13 EST


On Tue, Sep 16, 2014 at 5:27 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Tue, Sep 16, 2014 at 5:05 PM, Richard Larocque <rlarocque@xxxxxxxxxx> wrote:
>> Adds new prctl calls to enable or disable VDSO loading for a process
>> and its children.
>>
>> The PR_SET_DISABLE_VDSO call takes one argument, which is interpreted as
>> a boolean value. If true, it disables the loading of the VDSO on exec()
>> for this process and any children created after this call. A false
>> value unsets the flag.
>>
>> The PR_GET_DISABLE_VDSO option returns a non-negative true value if VDSO
>> loading has been disabled for this process, zero if it has not been
>> disabled, and a negative value in case of error.
>>
>> These prctl calls are hidden behind a new Kconfig,
>> CONFIG_VDSO_DISABLE_PRCTL. This feature is available only on x86.
>>
>> The command line option vdso=0 overrides the behavior of
>> PR_SET_DISABLE_VDSO; however, PR_GET_DISABLE_VDSO will continue to return
>> whatever value was last set with PR_SET_DISABLE_VDSO.
>>
>> Signed-off-by: Richard Larocque <rlarocque@xxxxxxxxxx>
>> ---
>> This patch is part of some work to better handle times and CRIU migration.
>> I suspect that there are other use cases out there, so I'm offering this
>> patch separately.
>>
>> When considering CRIU migration and times, we put some thought into how
>> to handle the rdtsc instruction. If we migrate between machines or across
>> reboots, the migrated process will see values that could break its assumptions
>> about how rdtsc is supposed to work.
>
> I don't get it.
>
> If __vdso_clock_gettime returns the wrong value in any scenario, we
> should fix that. Similarly, CRIU *already works*, unless there's
> something I don't know of.

Right. As far as I know, there's nothing wrong with the use of RDTSC
in the vDSO following a migration. The problem is that some
applications might use RDTSC outside of the vDSO. If they save the
returned values, then compare pre- and post- migration values, bad
things could happen (in theory).
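
To make the failure mode concrete, here is a minimal sketch. It is not
taken from the patch, and the helper names are made up for illustration.
It assumes an application that saves a raw TSC reading and later
subtracts it from a newer one:

```c
#include <stdint.h>

/* Hypothetical helper: elapsed ticks between two saved TSC readings.
 * Unsigned subtraction wraps, so a counter that went backwards
 * produces a huge bogus interval instead of a negative one. */
static uint64_t tsc_elapsed(uint64_t before, uint64_t after)
{
    return after - before;
}

/* Defensive variant (also hypothetical): report 0 ticks when the
 * counter appears to have gone backwards, for example after a restore
 * on a host whose TSC is smaller than the one the value came from. */
static uint64_t tsc_elapsed_checked(uint64_t before, uint64_t after)
{
    return after >= before ? after - before : 0;
}
```

An application using the first form across a migration could compute an
interval of nearly 2^64 ticks. The second form only hides the symptom;
it does not make the two readings comparable.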

Anything we do to try to trap and handle the use of RDTSC in wider
userspace will affect its use in the vDSO, too. In some situations,
it might be nice to run applications with no vDSO and PR_TSC_SIGSEGV,
just to make sure they don't have any heavy reliance on the TSC. It
would be nice if those applications didn't crash when they called
clock_gettime().
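
For reference, a rough sketch of the PR_TSC_SIGSEGV half of that setup,
using the existing PR_SET_TSC and PR_GET_TSC prctl options. The function
name is hypothetical, and the mode is switched in a forked child so the
calling process keeps working RDTSC. The proposed PR_SET_DISABLE_VDSO
call is left out on purpose, since it only exists in this patch:

```c
#include <sys/prctl.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper. Returns 0 if a child process switched itself to
 * PR_TSC_SIGSEGV, 1 if the kernel refused (for example on non-x86),
 * and -1 if fork() or waitpid() failed. */
static int try_tsc_sigsegv_in_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        int mode = 0;

        if (prctl(PR_SET_TSC, PR_TSC_SIGSEGV, 0, 0, 0) != 0)
            _exit(1);
        if (prctl(PR_GET_TSC, &mode, 0, 0, 0) != 0)
            _exit(1);

        /* From here on, any RDTSC executed by this process raises
         * SIGSEGV, including one issued by a TSC-based vDSO
         * clock_gettime(). */
        _exit(mode == PR_TSC_SIGSEGV ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A real test harness would exec the application under test from the
child instead of exiting. Without a way to keep the vDSO out of that
exec'd image, its first clock_gettime() may hit the trap.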

Another alternative is to trap and adjust the RDTSC. That might be a
viable option for applications that care about reliable RDTSC behavior
and migration, but don't care about performance. I think it makes
sense to disable the vDSO in that case, rather than trap on every call
that it makes.
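
The "adjust" step could look roughly like the following. This is only a
sketch under my own assumptions: the struct and function names are
invented, and it presumes the trap handler has access to the last
adjusted value handed out before the checkpoint.

```c
#include <stdint.h>

/* Hypothetical per-process state for an emulated TSC. */
struct tsc_adjust {
    uint64_t offset;    /* added to every raw reading */
};

/* Called once after restore. Picks an offset so that the first
 * adjusted reading is not below the last value the process saw
 * before the checkpoint. */
static void tsc_adjust_on_restore(struct tsc_adjust *a,
                                  uint64_t last_seen, uint64_t raw_now)
{
    a->offset = last_seen > raw_now ? last_seen - raw_now : 0;
}

/* Called from the trap path for each emulated RDTSC. */
static uint64_t tsc_adjusted(const struct tsc_adjust *a, uint64_t raw)
{
    return raw + a->offset;
}
```

This keeps the value from going backwards but says nothing about
differing TSC frequencies between hosts, which would need scaling too.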

> That being said, I would like an option to gate off RDTSC for a
> process and its children in order to make PR_TSC_SIGSEGV more useful.
> All the prerequisites are there now.

Agreed. That's what this patch is attempting to do, and that's the
main reason why I figured it was worth submitting independent of any
other time-related work.

> What problem are you trying to solve exactly?

Eventually, we'd like to make it so that neither RDTSC nor
CLOCK_MONOTONIC can go backwards following a migration.

The fix for RDTSC starts here. Building on this patch as a base, we
can either ban it from being used entirely, or write some code to
adjust its value as necessary.

The CLOCK_MONOTONIC fix will be a different patch stack. We're
currently hoping to do that without disabling the vDSO, but that's
another discussion.