Re: [BUG] Serial/dma stall/failure after "dmaengine: fsl-edma: extract common fsl-edma code (no changes in behavior intended)"

From: Krzysztof Kozlowski
Date: Fri Jul 06 2018 - 02:16:37 EST


On 6 July 2018 at 08:05, Vinod <vkoul@xxxxxxxxxx> wrote:
> On 05-07-18, 18:23, Angelo Dureghello wrote:
>> Hi Stefan,
>>
>> On Thu, Jul 05, 2018 at 05:55:31PM +0200, Stefan Agner wrote:
>> > Hi Vinod, Hi Angelo,
>> >
>> > On 05.07.2018 14:45, Angelo Dureghello wrote:
>> > > Hi Vinod,
>> > >
>> > > On Thu, Jul 05, 2018 at 10:12:53AM +0200, Angelo Dureghello wrote:
>> > >> Hi Vinod,
>> > >>
>> > >> On Thu, Jul 05, 2018 at 01:05:52PM +0530, Vinod wrote:
>> > >> > On 04-07-18, 10:54, Krzysztof Kozlowski wrote:
>> > >> > > Hi,
>> > >> > >
>> > >> > > The commit 6ad069123f03bebe4315dea13d44845854ca6043 ("dmaengine:
>> > >> > > fsl-edma: extract common fsl-edma code (no changes in behavior
>> > >> > > intended)"), even though marked as no changes in behavior intended...
>> > >> > > breaks the serial console with DMA after boot. The console just hangs
>> > >> > > - is not responsive even to SysRq. Usually after finishing boot -
>> > >> > > before or after login prompt. Sometimes login is allowed and then it
>> > >> > > hangs during printing dmesg.
>> > >> > >
>> > >> > > Board: Toradex Colibri VF50 (NXP VF500, Cortex A5, serial configured
>> > >> > > with DMA) on Iris Carrier.
>> > >> >
>> > >> > Angelo ?
>> > >> >
>> > >> Sorry for this. As I said, I couldn't test it on any of those ARM boards,
>> > >> but I will take a look.
>> > >>
>> > >> The only code part that changes is the initial setting up of the registers.
>> > >> I am checking that.
>> > >>
>> > >> Regards,
>> > >> Angelo
>> > >>
>> > >
>> > > I cannot find anything obviously wrong.
>> > > And I cannot test on Vybrid.
>> > > I will try to get a Vybrid V50 board to test this issue. It would
>> > > take some days, and I will also be off for 3 weeks in July.
>> > >
>> > > So please revert my patch.
>> >
>> > I did not find the issue quickly. But I must say that I gave up pretty
>> > quickly. There are too many changes in a single patch which makes it
>> > hard to figure out what could be wrong. I'd rather prefer if we could
>> > drop that patch again and go through another review phase.
>> >
>> > Angelo, as far as I can see the patch has not been sent to LKML or the
>> > ARM mailing list. Especially since you do not have such a device it
>> > would have been nice to also send it to the ARM mailing list...
>> >
>> > Can you resend your last revision with CC to me/ARM mailing list?
>> >
>>
>> Thanks for looking into it.
>>
>> I have spent a lot of time on this patch and would really like to have
>> DMA for ColdFire available. So I have ordered a Colibri / v50 board.
>>
>> I should receive it in a few days and should be able to debug this issue,
>> but as I said, I will be off for some weeks, so it is probably
>> better to revert the patch.
>>
>> My initial submission was a separate driver, to avoid such issues where
>> I cannot test, but it resulted in too much duplicated code.
>>
>> Sure, I can send the full patch to you/ARM with all the fixes included
>> so far.
>
> Okay, dropped from -next now. I am still keeping topic/fsl around and
> will collect other fixes for you guys to check.
>
> One way would be to split the common patch into multiple patches and check
> each for the regression. That should help identify the issue quickly.

Even with git show -C20% -M20% I gave up when trying to spot possible
differences in the commit. I understand that logically it is one
change and it makes sense... but it touches, moves and adds too much,
making review extremely difficult. I think splitting it into smaller
chunks would be beneficial anyway.
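
For anyone who wants to repeat this, the -C20% -M20% options lower git's
similarity threshold so that files at least 20% similar are reported as
copies or renames instead of as an unrelated delete plus add. A minimal
sketch follows; the show_moves helper name is made up for illustration and
is not part of any tool:

```shell
# Hypothetical helper (not from the thread): list which files a commit
# renamed, copied, or created, using the relaxed 20% similarity thresholds.
show_moves() {
    # --summary prints the rename/copy/create lines;
    # --format= suppresses the commit header so only the summary remains.
    git show --summary --format= -M20% -C20% "$1"
}
```

Run inside a kernel tree that contains the commit, this would be called as
show_moves 6ad069123f03. It only gives an overview of what moved where; the
changes inside the moved code still have to be read by hand.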

Today's next boots fine.

Best regards,
Krzysztof