Re: RTL8723BE performance regression

From: Pkshih
Date: Sun May 13 2018 - 22:51:22 EST


On Wed, 2018-05-09 at 13:33 -0700, João Paulo Rechi Vita wrote:
> On Tue, May 8, 2018 at 1:37 AM, Pkshih <pkshih@xxxxxxxxxxx> wrote:
> > On Mon, 2018-05-07 at 14:49 -0700, João Paulo Rechi Vita wrote:
> >> On Tue, May 1, 2018 at 10:58 PM, Pkshih <pkshih@xxxxxxxxxxx> wrote:
> >> > On Wed, 2018-05-02 at 05:44 +0000, Pkshih wrote:
> >> >>
> >> >> > -----Original Message-----
> >> >> > From: João Paulo Rechi Vita [mailto:jprvita@xxxxxxxxx]
> >> >> > Sent: Wednesday, May 02, 2018 6:41 AM
> >> >> > To: Larry Finger
> >> >> > Cc: Steve deRosier; èåå; Pkshih; Birming Chiu; Shaofu; Steven Ting; Chaoming_Li; Kalle Valo;
> >> >> > linux-wireless; Network Development; LKML; Daniel Drake; João Paulo Rechi Vita; linux@endlessm.com
> >> >> > Subject: Re: RTL8723BE performance regression
> >> >> >
> >> >> > On Tue, Apr 3, 2018 at 7:51 PM, Larry Finger <Larry.Finger@xxxxxxxxxxxx> wrote:
> >> >> > > On 04/03/2018 09:37 PM, João Paulo Rechi Vita wrote:
> >> >> > >>
> >> >> > >> On Tue, Apr 3, 2018 at 7:28 PM, Larry Finger <Larry.Finger@xxxxxxxxxxxx>
> >> >> > >> wrote:
> >> >> > >>
> >> >> > >> (...)
> >> >> > >>
> >> >> > >>> As the antenna selection code changes affected your first bisection, do
> >> >> > >>> you
> >> >> > >>> have one of those HP laptops with only one antenna and the incorrect
> >> >> > >>> coding
> >> >> > >>> in the FUSE?
> >> >> > >>
> >> >> > >>
> >> >> > >> Yes, that is why I've been passing ant_sel=1 during my tests -- this
> >> >> > >> was needed to achieve a good performance in the past, before this
> >> >> > >> regression. I've also opened the laptop chassis and confirmed the
> >> >> > >> antenna cable is plugged to the connector labeled with "1" on the
> >> >> > >> card.
> >> >> > >>
> >> >> > >>> If so, please make sure that you still have the same signal
> >> >> > >>> strength for good and bad cases. I have tried to keep the driver and the
> >> >> > >>> btcoex code in sync, but there may be some combinations of antenna
> >> >> > >>> configuration and FUSE contents that cause the code to fail.
> >> >> > >>>
> >> >> > >>
> >> >> > >> What is the recommended way to monitor the signal strength?
> >> >> > >
> >> >> > >
> >> >> > > The btcoex code is developed for multiple platforms by a different group
> >> >> > > than the Linux driver. I think they made a change that caused ant_sel to
> >> >> > > switch from 1 to 2. At least numerous comments at
> >> >> > > github.com/lwfinger/rtlwifi_new claimed they needed to make that change.
> >> >> > >
> >> >> > > My recommended method is to verify the wifi device name with "iw dev". Then
> >> >> > > using that device
> >> >> > >
> >> >> > > sudo iw dev <dev_name> scan | egrep "SSID|signal"
> >> >> > >
> >> >> >
> >> >> > I have confirmed that the performance regression is indeed tied to
> >> >> > signal strength: in the good cases signal was between -16 and -8 dBm,
> >> >> > whereas in bad cases signal was always between -50 and -40 dBm. I've
> >> >> > also switched to testing bandwidth in controlled LAN environment using
> >> >> > iperf3, as suggested by Steve deRosier, with the DUT being the only
> >> >> > machine connected to the 2.4 GHz radio and the machine running the
> >> >> > iperf3 server connected via ethernet.
> >> >> >
> >> >>
> >> >> We have new experimental results in commit af8a41cccf8f46 ("rtlwifi: cleanup
> >> >> 8723be ant_sel definition"). You can use the above commit and run the same
> >> >> experiments (with ant_sel=0, 1 and 2) on your side, and then share your results.
> >> >> Since performance is tied to signal strength, sharing signal strength alone is enough.
> >> >>
> >> >
> >> > Please note that a cold reboot is needed each time ant_sel is changed.
> >> >
> >>
> >> I've tested the commit mentioned above and it fixes the problem on top
> >> of v4.16 (the latest wireless-drivers-next is also fixed, as it
> >> already contains that commit). On v4.15, we also need the
> >> following commits before "af8a41cccf8f rtlwifi: cleanup 8723be ant_sel
> >> definition" to get good performance again:
> >>
> >>   874e837d67d0 rtlwifi: fill FW version and subversion
> >>   a44709bba70f rtlwifi: btcoex: Add power_on_setting routine
> >>   40d9dd4f1c5d rtlwifi: btcoex: Remove global variables from btcoex
> >
> > v4.15 isn't a longterm version and has already reached EOL.
> >
>
> Right, but this is a performance regression in comparison to v4.11, so
> if "af8a41cccf8f rtlwifi: cleanup 8723be ant_sel definition" is marked
> for stable, shouldn't these other patches be brought as well? All
> releases since v4.11 are probably affected, but honestly I don't have
> a strong understanding of how the stable trees operate in situations
> like this.
>

see below.

> >>
> >> Surprisingly, it seems forcing ant_sel=1 is not needed anymore on
> >> these machines, as shown by the numbers below (ant_sel=0 means
> >> that actually no parameter was passed to the module). I have powered
> >> off the machine and done a cold boot for every test. It seems
> >> something has changed in the antenna auto-selection code since v4.11,
> >> the latest point where I could confirm we definitely need to force
> >> ant_sel=1. I've been trying to understand what causes this difference,
> >> but haven't made progress on that so far, so any suggestions are
> >> appreciated (we are trying to decide if we can confidently drop the
> >> downstream DMI quirks for these specific machines).
> >>
> > I think your rtl8723be module was programmed with the correct efuse
> > content, so it works properly with ant_sel=0, and the quirk isn't
> > required for your machine.
> >
> >>   w-d-n ant_sel=0: -14.00 dBm,  69.5 Mbps -> good
> >>   w-d-n ant_sel=1: -10.00 dBm,  41.1 Mbps -> good
> >>   w-d-n ant_sel=2: -44.00 dBm,   607 kbps -> bad
> >>
> >>   v4.16 ant_sel=0: -12.00 dBm,  63.0 Mbps -> good
> >>   v4.16 ant_sel=1:  -8.00 dBm,  69.0 Mbps -> good
> >>   v4.16 ant_sel=2: -50.00 dBm,   224 kbps -> bad
> >>
> >>   v4.15 ant_sel=0:  -8.00 dBm,  33.0 Mbps -> good
> >>   v4.15 ant_sel=1: -10.00 dBm,  38.1 Mbps -> good
> >>   v4.15 ant_sel=2: -48.00 dBm,   206 kbps -> bad
> >>
> >
> > Based on your results, the efuse content is programmed for one or two
> > antennas on the AUX path.
> >
>
> With v4.11 I had good performance results on this very same machine
> (thus same efuse contents) only when passing ant_sel=1, so there has
> to be some change in the code that parses the efuse contents and
> decides which antenna will be used.
>

Since btcoex controls the TDMA parameters for WiFi and BT, the antenna-related
code lives in btcoex. That's why ant_sel is used by btcoex.
In v4.12, we upgraded btcoex and the firmware to achieve a better balance
between WiFi and BT, and the code flow changed along the way. So the single
commit af8a41cccf8f ("rtlwifi: cleanup 8723be ant_sel definition") won't
work on v4.11. In other words, if you want v4.11 to work properly, you need to
apply all of the btcoex changes.


The efuse parser hasn't changed, and I think v4.11 needs ant_sel=1 for the
same reason as above.
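
To put the quoted measurements in perspective, here is a rough Python sketch
that computes, per kernel, the gap between the best and worst signal level
and the power ratio that gap corresponds to. The numbers are copied from the
results quoted above; the names (RESULTS, signal_gap_db, db_to_power_ratio)
are only illustrative, and the kbps figures are converted to Mbps assuming
1 Mbps = 1000 kbps.

```python
# Measurements copied from the quoted results:
# (kernel, ant_sel, signal in dBm, throughput in Mbps)
# "w-d-n" is wireless-drivers-next; kbps values are divided by 1000 (assumption).
RESULTS = [
    ("w-d-n", 0, -14.0, 69.5),
    ("w-d-n", 1, -10.0, 41.1),
    ("w-d-n", 2, -44.0, 0.607),
    ("v4.16", 0, -12.0, 63.0),
    ("v4.16", 1, -8.0, 69.0),
    ("v4.16", 2, -50.0, 0.224),
    ("v4.15", 0, -8.0, 33.0),
    ("v4.15", 1, -10.0, 38.1),
    ("v4.15", 2, -48.0, 0.206),
]


def db_to_power_ratio(delta_db):
    """Convert a difference in dB to a linear power ratio."""
    return 10 ** (delta_db / 10)


def signal_gap_db(results):
    """Return {kernel: best signal minus worst signal, in dB}."""
    extremes = {}
    for kernel, _ant_sel, dbm, _mbps in results:
        low, high = extremes.get(kernel, (dbm, dbm))
        extremes[kernel] = (min(low, dbm), max(high, dbm))
    return {kernel: high - low for kernel, (low, high) in extremes.items()}


if __name__ == "__main__":
    for kernel, gap in signal_gap_db(RESULTS).items():
        ratio = db_to_power_ratio(gap)
        print(f"{kernel}: gap {gap:.1f} dB, power ratio about {ratio:.0f}x")
```

On every kernel tested, the ant_sel=2 case sits 34 to 42 dB below the best
case, i.e. the received power differs by a factor of a few thousand or more,
which is consistent with the throughput dropping from tens of Mbps to a few
hundred kbps.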

Regards
PK