Re: [GIT PULL] Thermal-SoC management changes for v5.2-rc1
From: Eduardo Valentin
Date: Fri May 24 2019 - 09:57:06 EST
On Fri, May 24, 2019 at 10:23:09AM +0200, Tomeu Vizoso wrote:
> On Fri, 24 May 2019 at 04:40, Eduardo Valentin <edubezval@xxxxxxxxx> wrote:
> > On Thu, May 23, 2019 at 11:46:47AM +0200, Tomeu Vizoso wrote:
> > > Hi Eduardo,
> > >
> > > I saw that for 5.1  you included a kernelci boot report for your
> > > tree, but not for 5.2. Have you found anything that should be improved
> > > in KernelCI for it to be more useful to maintainers like you?
> > Honestly, I take a couple of automated testing as input before sending
> > my pulls to Linux: (a) my local test, (b) kernel-ci, and (c) 0-day.
> > There was really no reason specifically for me to not add the report
> > from kernelci, except..
> > >
> > >  https://lore.kernel.org/lkml/20190306161207.GA7365@xxxxxxxxxxxxxxxxxxxxx/
> > >
> > > I found about this when trying to understand why the boot on the
> > > veyron-jaq board has been broken in 5.2-rc1.
> > >
> > I remember a report saying this failed, but from what I could tell from
> > the boot log, the board booted and hit terminal. But apparently, after
> > all reports from developers, the veyron-jaq boards were in a hang state.
> > That was hard for me to tell from your logs, as they looked like
> > a regular boot that hits terminal.
> > Maybe I should have looked for a specific output of a command you guys
> > run, saying "successful boot" somewhere?
> I think what is easiest and clearest is to consider the bisection
> reports as a very strong indication that something is quite wrong in
> the branch.
OK. I hear you.
> Because if a board stopped booting and the bisection found a
> suspicious patch, and reverting it gets the board booting again, then
> chances are very high that the patch in question broke that boot.
Yeah, for sure If I had understood the report properly I could have
nacked the patch.
> Do you think the wording could be improved to make it clearer? Or
> maybe some other changes to make all this more useful to maintainers
> like you?
Well, from my perspective, I need to judge if the failure on your report
is really related to my changes. Many times, specially on build errors,
we get failures that are unrelated. Build errors are more straight
forward do judge. Similarly, we need to find out if a boot issue is
caused by a change on the branch or something existing. On boot issues
from kernelci reports, I think the false negatives I have been seeing
is lab/boards failing to boot. Those can also be easy to spot as the
in most cases the kernel wont even load.
For this particular case, as I described before, the kernel would
load and hit the shell command line, but in fact it was in hang state
IIRC. That is probably why it has not straight forward to understand
from the log. Maybe a successful boot message somewhere would have
helped to spot the problem (or the opposite of it, something
saying, I was expecting to execute a command and board was