Re: [PATCH v1 tty] 8250: microchip: pci1xxxx: Refactor TX Burst code to use pre-existing APIs

From: Rengarajan.S
Date: Mon Mar 04 2024 - 23:16:37 EST

Next message: Bagas Sanjaya: "Re: [PATCH 6.6 000/143] 6.6.21-rc1 review"
Previous message: Kelly Hung(洪嘉莉): "RE: [PATCH] ARM: dts: aspeed: asus: Add ASUS X4TF BMC"
In reply to: Jiri Slaby: "Re: [PATCH v1 tty] 8250: microchip: pci1xxxx: Refactor TX Burst code to use pre-existing APIs"
Next in thread: Jiri Slaby: "Re: [PATCH v1 tty] 8250: microchip: pci1xxxx: Refactor TX Burst code to use pre-existing APIs"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Jiri,

On Mon, 2024-03-04 at 07:19 +0100, Jiri Slaby wrote:
> On 04. 03. 24, 5:37, Rengarajan.S@xxxxxxxxxxxxx wrote:
> > Hi Jiri,
> >
> > On Fri, 2024-02-23 at 10:26 +0100, Jiri Slaby wrote:
> > > On 23. 02. 24, 10:21, Rengarajan.S@xxxxxxxxxxxxx wrote:
> > > > On Fri, 2024-02-23 at 07:08 +0100, Jiri Slaby wrote:
> > > > > On 22. 02. 24, 14:49, Rengarajan S wrote:
> > > > > > > Updated the TX Burst implementation by changing the circular buffer
> > > > > > > processing with the pre-existing APIs in kernel. Also updated
> > > > > > > conditional statements and alignment issues for better readability.
> > > > >
> > > > > Hi,
> > > > >
> > > > > so why are you keeping the nested double loop?
> > > > >
> > > >
> > > > > Hi, in order to differentiate burst mode handling from byte mode, we
> > > > > had separate loops for both. Since a single while loop also does not
> > > > > align with the RX implementation (where we have separate handling for
> > > > > burst and byte), we have retained the double loop.
> > >
> > > > So obviously, align RX to a single loop if possible. The current TX code
> > > > is very hard to follow and sort of unmaintainable (and buggy). And IMO
> > > > it's unnecessary as I proposed [1]. And even if RX cannot be one loop,
> > > > you still can make TX easy to read as the two need not be the same.
> > >
> > > [1]
> > > https://lore.kernel.org/all/b8325c3f-bf5b-4c55-8dce-ef395edce251@xxxxxxxxxx/
> >
> >
> > while (data_empty_count) {
> >     cnt = CIRC_CNT_TO_END();
> >     if (!cnt)
> >       break;
> >     if (cnt < UART_BURST_SIZE || (tail & 3)) { // is_unaligned()
> >       writeb();
> >       cnt = 1;
> >     } else {
> >       writel()
> >       cnt = UART_BURST_SIZE;
> >     }
> >     uart_xmit_advance(cnt);
> >     data_empty_count -= cnt;
> > }
> >
> > With the above implementation we are observing a performance drop of 2 Mbps
> > at a baud rate of 4 Mbps. The reason is that on each iteration we check
> > whether the data needs to be processed as DWORDs or as bytes. The condition
> > check on each iteration is causing the drop in performance.
>
> Hi,
>
> the check is by several orders of magnitude faster than the I/O proper.
> So I don't think that's the root cause.
>
> > With the previous implementation (with nested loops) the performance is
> > found to be around 4 Mbps at a baud rate of 4 Mbps. In that implementation
> > we send DWORDs continuously until the transfer size is < 4. Can you let us
> > know of any alternatives that avoid the above performance drop?
>
> Could you attach the patch you are testing?

Please find the updated pci1xxxx_process_write_data() below:

u32 xfer_cnt;

while (*valid_byte_count) {
	xfer_cnt = CIRC_CNT_TO_END(xmit->head, xmit->tail,
				   UART_XMIT_SIZE);

	if (!xfer_cnt)
		break;

	if (xfer_cnt < UART_BURST_SIZE || (xmit->tail & 3)) {
		writeb(xmit->buf[xmit->tail],
		       port->membase + UART_TX_BYTE_FIFO);
		xfer_cnt = UART_BYTE_SIZE;
	} else {
		writel(*(u32 *)&xmit->buf[xmit->tail],
		       port->membase + UART_TX_BURST_FIFO);
		xfer_cnt = UART_BURST_SIZE;
	}

	uart_xmit_advance(port, xfer_cnt);
	*data_empty_count -= xfer_cnt;
	*valid_byte_count -= xfer_cnt;
}
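
To make it easier to reason about how many byte writes and DWORD writes
the loop above issues for a given head/tail, here is a minimal userspace
sketch of the same decision logic. It is not driver code: simulate_tx(),
struct tx_stats, the *_SIZE constants and self_check() are hypothetical
names used only for illustration, the MMIO writes are replaced by
counters, and circ_cnt_to_end() is a plain-function rewrite of the
kernel's CIRC_CNT_TO_END() macro. The extra "budget < BURST_SIZE" test is
an assumption added only so the unsigned budget cannot wrap; it is not in
the loop posted above.

```c
#include <assert.h>

#define XMIT_SIZE  4096u /* assumption: stand-in for UART_XMIT_SIZE */
#define BURST_SIZE 4u    /* assumption: stand-in for UART_BURST_SIZE */
#define BYTE_SIZE  1u    /* assumption: stand-in for UART_BYTE_SIZE */

/* Hypothetical counters replacing the writeb()/writel() calls. */
struct tx_stats {
	unsigned int byte_writes;
	unsigned int dword_writes;
};

/* Bytes available from tail up to head or the end of the buffer,
 * whichever comes first (same arithmetic as CIRC_CNT_TO_END()). */
static unsigned int circ_cnt_to_end(unsigned int head, unsigned int tail,
				    unsigned int size)
{
	unsigned int end = size - tail;
	unsigned int n = (head + end) & (size - 1);

	return n < end ? n : end;
}

/* Walk the circular buffer exactly like the single loop, but only
 * count which kind of write each iteration would perform. */
static struct tx_stats simulate_tx(unsigned int head, unsigned int tail,
				   unsigned int budget)
{
	struct tx_stats s = { 0, 0 };

	while (budget) {
		unsigned int cnt = circ_cnt_to_end(head, tail, XMIT_SIZE);

		if (!cnt)
			break;

		/* "budget < BURST_SIZE" is the sketch-only guard. */
		if (cnt < BURST_SIZE || budget < BURST_SIZE || (tail & 3)) {
			s.byte_writes++;
			cnt = BYTE_SIZE;
		} else {
			s.dword_writes++;
			cnt = BURST_SIZE;
		}

		tail = (tail + cnt) & (XMIT_SIZE - 1); /* like uart_xmit_advance() */
		budget -= cnt;
	}

	return s;
}

/* Worked cases: aligned start, unaligned start, wrap-around. */
static void self_check(void)
{
	struct tx_stats s;

	s = simulate_tx(64, 0, 64);   /* 64 aligned bytes */
	assert(s.byte_writes == 0 && s.dword_writes == 16);

	s = simulate_tx(65, 1, 64);   /* same length, tail off by one */
	assert(s.byte_writes == 4 && s.dword_writes == 15);

	s = simulate_tx(2, 4094, 8);  /* 4 bytes split across the wrap */
	assert(s.byte_writes == 4 && s.dword_writes == 0);
}
```

Calling self_check() from a small main() exercises the three cases. In
this model an unaligned tail costs at most three byte writes before the
loop returns to DWORD writes, and a run that straddles the end of the
buffer falls back to byte writes on both sides of the wrap.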

Testing was done via minicom by transferring a 10 MB file at 4 Mbps.

Results of the minicom transfer with a single instance:

Previous implementation (nested while loops):
Transferred 10 MB at 3900000 CPS

Current implementation (single loop):
Transferred 10 MB at 2459999 CPS

Thanks,
Rengarajan S

>
> thanks,
> --
> js
> suse labs
>
