Re: [Xen-devel] [PATCH] xen: point xen_start_info to a dummy struct for PV on HVM guests
From: Ian Campbell
Date: Wed Oct 03 2012 - 12:21:25 EST
On Wed, 2012-10-03 at 16:48 +0100, Stefano Stabellini wrote:
> On Wed, 3 Oct 2012, Ian Campbell wrote:
> > On Wed, 2012-10-03 at 15:11 +0100, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Oct 03, 2012 at 02:54:42PM +0100, Ian Campbell wrote:
> > > > On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote:
> > > > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > > > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > > > > > > PV on HVM guests don't have a start_info page mapped by Xen, so
> > > > > > > xen_start_info is just NULL for them.
> > > > > > > That is a problem because other parts of the code expect xen_start_info to
> > > > > > > point to something valid, for example xen_initial_domain() is defined as
> > > > > > > follows:
> > > > > > >
> > > > > > > #define xen_initial_domain() (xen_domain() && \
> > > > > > > xen_start_info->flags & SIF_INITDOMAIN)
> > > > > >
> > > > > > But anyone who calls this before xen_start_info is setup is going to get
> > > > > > a bogus result, specifically in this case they will think they are domU
> > > > > > when in reality they are dom0 -- wouldn't it be better to fix those
> > > > > > callsites?
> > > > >
> > > > > That cannot be the case because setting up xen_start_info is the very
> > > > > first thing that is done, before even calling to C.
> > > >
> > > > On PV, yes, but you are trying to fix PVHVM here, no?
> > > >
> > > > Otherwise if this is always set before calling into C then what is the
> > > > purpose of this patch?
> > >
> > > to fix this - as PVHVM has it set to NULL and we end up de-referencing
> > > the xen_start_info and crashing. As so::
> > >
> >
> > Right, so returning to my original point: The caller here is calling
> > xen_initial_domain() *before* start info is setup. This is bogus and is
> > your actual bug, all this patch does is hide that real issue.
>
> That is because xen_start_info wasn't setup at all for PV on HVM guests.
>
> The real reason is that PV on HVM guests don't have one, but that is
> another matter. Until we get rid of all the references to xen_start_info
> outside of PV specific code, we should just assume that there is one,
> and that is already setup.
>
> One day not too far from now, we might refactor the code to never
> reference xen_start_info directly, but I don't think that now is the
> time for that. Also consider that this is the same thing we do on ARM.

We actually fill in the dummy start info with valid information on ARM
though, we don't just leave it full of zeroes.

If we do start out with start_info pointing to an uninitialised
start_info on ARM too then I would argue that this is also a mistake. We
should leave the NULL pointer in place until we set up the content of the
dummy start info -- exactly because the resulting crash indicates to us
that someone has accessed the si before we've initialised it.

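To make the distinction concrete, here is a minimal userspace sketch (not
kernel code). The struct, both helper functions and the -1 "not set up yet"
return value are hypothetical; only the SIF_INITDOMAIN name and the
zeroed-bss behaviour come from this thread, and the bit value used for the
flag is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag value for illustration; check the real Xen headers. */
#define SIF_INITDOMAIN (1u << 1)

/* Hypothetical stand-in for the start info structure: flags only. */
struct start_info_model {
    uint32_t flags;
};

/* Static storage is zero-initialised, like an object placed in .bss. */
static struct start_info_model dummy_start_info;

/*
 * Variant A: the pointer is aimed at the zeroed dummy from the start.
 * An early caller gets 0 whether or not the contents were ever filled in.
 */
static int initial_domain_dummy(const struct start_info_model *si)
{
    return (si->flags & SIF_INITDOMAIN) != 0;
}

/*
 * Variant B: the pointer stays NULL until the contents are valid.
 * An early caller is detected (-1 here; a crash in the real kernel).
 */
static int initial_domain_checked(const struct start_info_model *si)
{
    if (si == NULL)
        return -1;
    return (si->flags & SIF_INITDOMAIN) != 0;
}

/* Small self-check showing the three cases discussed in the thread. */
int run_model_checks(void)
{
    struct start_info_model filled = { .flags = SIF_INITDOMAIN };

    assert(initial_domain_dummy(&dummy_start_info) == 0); /* silent zero */
    assert(initial_domain_checked(NULL) == -1);           /* early access caught */
    assert(initial_domain_checked(&filled) == 1);         /* real answer */
    return 0;
}
```

In variant A the early caller cannot tell "not initial domain" apart from
"nobody has filled this in yet"; in variant B it can.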
> > With this "fix" the caller of xen_initial_domain shown in this trace now
> > gets a rubbish result based on the content of a dummy shared info
> > instead of the real answer from that actual shared info.
>
> That is not true. The caller gets a zero result, that is completely
> appropriate in this case, given that a PV on HVM guest doesn't have a
> start_info.

It's just a side effect of Linux zeroing its bss though and zero
happening to be the right answer for a PVHVM guest in this case.

Is it true that zero is an appropriate result for all uses of fields in
start_info on PVHVM?

> > The right fix is to fix the caller to not call xen_initial_domain()
> > until after the shared info has been setup. Maybe that means moving
> > shinfo setup earlier, or maybe it means deferring this call until later
> > in the PVHVM case.
>
> I don't think so, we should be able to call xen_initial_domain() at any
> point in the code.
>
> The best course of action is taking this fix now (making PVHVM x86
> guests behave the same way as ARM guests) and refactor all the callers to
> xen_start_info later.