Re: [PATCH] x86: do not allow to optimize flag_is_changeable_p()

From: krzysztof . h1
Date: Tue Sep 30 2008 - 04:27:41 EST


> Yinghai Lu wrote:
> > On Mon, Sep 29, 2008 at 11:14 PM, Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
> >
> >> Krzysztof Helt wrote:
> >>
> >>> From: Krzysztof Helt <krzysztof.h1@xxxxx>
> >>>
> >>> The flag_is_changeable_p() is used by
> >>> have_cpuid_p() which can return different results
> >>> in the code sequence below:
> >>>
> >>>     if (!have_cpuid_p())
> >>>         identify_cpu_without_cpuid(c);
> >>>
> >>>     /* cyrix could have cpuid enabled via c_identify()*/
> >>>     if (!have_cpuid_p())
> >>>         return;
> >>>
> >>> Otherwise, gcc 3.4.6 optimizes these two calls
> >>> into one, which makes the code work incorrectly.
> >>> Cyrix CPUs have the CPUID instruction enabled, but
> >>> it is not detected due to the gcc optimization.
> >>> Thus the ARR registers (MTRR-like) are not detected
> >>> on such a CPU.
> >>>
> >>>
> >> If "asm volatile" changes the code and fixes the bug, it seems like
> >> you're making use of an undocumented - or at least non-portable -
> >> behaviour.
> >>

Why do you call it undocumented? This information can be found with "info gcc" in the Extended Asm section:

If your assembler instructions access memory in an unpredictable
fashion, add `memory' to the list of clobbered registers. This will
cause GCC to not keep memory values cached in registers across the
assembler instruction and not optimize stores or loads to that memory.
You will also want to add the `volatile' keyword if the memory affected
is not listed in the inputs or outputs of the `asm', as the `memory'
clobber does not count as a side-effect of the `asm'. If you know how
large the accessed memory is, you can add it as input or output but if
this is not known, you should add `memory'.

> >> Does adding a "memory" clobber also fix the problem? That would have
> >> better defined characteristics.
> >>

A changeable flag bit is hardly a memory side effect. IMO, the volatile keyword is the better fit, as it says that each evaluation may give a different result even though the inputs and outputs are the same.
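
To make the point concrete, here is a minimal user-space sketch of such a probe, assuming GCC or Clang. It is not the kernel code: the kernel version is 32-bit and uses pushfl/popfl, while this sketch uses the 64-bit forms and steps over the red zone, which user-space code has to respect. On any other target it falls back to a stub that returns 0. The helper probe_twice() is a hypothetical name that only mirrors the call pattern from the patch description.

```c
#include <assert.h>

#define X86_EFLAGS_ID 0x00200000UL /* EFLAGS bit 21: toggleable if CPUID exists */

/*
 * Returns nonzero if the given EFLAGS bit can be toggled.
 * The asm has no memory side effect, yet its result depends on CPU state
 * the compiler cannot see, so it is marked volatile instead of being
 * given a "memory" clobber.
 */
static inline unsigned long flag_is_changeable_p(unsigned long flag)
{
#if defined(__x86_64__) && defined(__LP64__) && defined(__GNUC__)
	unsigned long f1, f2;

	__asm__ volatile(
		"lea -128(%%rsp), %%rsp\n\t" /* step over the user-space red zone */
		"pushfq\n\t"                 /* saved copy for the final popfq */
		"pushfq\n\t"
		"pop %0\n\t"                 /* f1 = current flags */
		"mov %0, %1\n\t"             /* f2 = copy of the original flags */
		"xor %2, %0\n\t"             /* flip the bit under test */
		"push %0\n\t"
		"popfq\n\t"                  /* try to write the flipped value */
		"pushfq\n\t"
		"pop %0\n\t"                 /* f1 = what the CPU actually kept */
		"popfq\n\t"                  /* restore the original flags */
		"lea 128(%%rsp), %%rsp"      /* undo the red-zone adjustment */
		: "=&r" (f1), "=&r" (f2)
		: "r" (flag)
		: "cc");

	return (f1 ^ f2) & flag;
#else
	(void)flag;
	return 0; /* stub (assumption): nothing to probe on this target */
#endif
}

static inline int have_cpuid_p(void)
{
	return flag_is_changeable_p(X86_EFLAGS_ID) != 0;
}

/* Hypothetical helper: the same two-call pattern as in the patch. */
static inline int probe_twice(void)
{
	int before, after;

	/* An empty mask can never report a change. */
	assert(flag_is_changeable_p(0) == 0);

	before = have_cpuid_p();
	/* On a Cyrix CPU, the identify hook could enable CPUID here. */
	after = have_cpuid_p();

	return before == after;
}
```

In user space nothing changes the flag between the two calls, so probe_twice() reports equal results. Without volatile the compiler is allowed to treat the second asm as a repeat of the first and reuse its result, which is the failure the patch describes.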

>
> The trouble is that flag_is_changeable_p() doesn't have any obvious
> global dependencies; it just takes a constant argument and returns a
> result. The asm() needs to be updated to have a "memory" constraint as
> a stand-in for the specific constraint of "cpu has switched into
> cpuid-supporting state".
>

See my comment above about adding the "memory" clobber.

Kind regards,
Krzysztof
