Noralf Trønnes <noralf@xxxxxxxxxxx> writes:

> On 25.10.2018 18.29, Eric Anholt wrote:
>> Eric Anholt <eric@xxxxxxxxxx> writes:
>>
>>> I was going to start working on making the vc4 driver work with
>>> tinydrm panels, but it turned out tinydrm didn't have the panel I had
>>> previously bought. So, last night I ported the fbtft staging
>>> driver over to DRM.
>>>
>>> It seems to work (with DT at
>>> https://github.com/anholt/linux/commits/drm-misc-next-hx8357d) --
>>> fbdev works great including rotated, and so does modetest. However,
>>> when X11 comes up at 16bpp, I get:
>>>
>>> https://photos.app.goo.gl/8tuhzPFFoDGamEfk8
>>>
>>> If I have tinydrm set a preferred bpp of 24, X looks great. Noralf,
>>> any ideas?
>>
>> Also, with these patches and the format modifier patch I just sent, mesa
>> with vc4 is now working with this driver on this branch:
>>
>> https://gitlab.freedesktop.org/anholt/mesa/commits/kmsro
>
> Ah, nice to see this happening!
>
> Getting hw rendering was one of the advantages I saw DRM could provide
> over fbdev on these displays. Little did I know how complicated graphics
> was outside fbdev, so I was unable to realise this myself.
>
> The current solution to get hw rendering is to have a userspace process
> that continuously copies the framebuffer:
> https://github.com/tasanakorn/rpi-fbcp
>
> It's used by some of the small DIY handheld game consoles that run
> emulators which require hw rendering.
>
>> Now I wonder how we can improve performance of the SPI updates.
>
> At what SPI speed are you running? The datasheets for most of these
> display controllers list the max speed as 10MHz, but almost all of them
> can go faster. Some are reported going as high as 70-80MHz. That's for
> the pixel data transfer, not the commands. tinydrm/mipi-dbi.c sends
> commands at 10MHz and pixels at full speed (mipi_dbi_spi_cmd_max_speed()).
> Most panels I have run at 32MHz or 48MHz.

I copied the DT from the adafruit tree, which has it at 32MHz. System
performance seems to be limited by the copy and format conversion I
think -- in particular, I wonder if we shouldn't be doing our dirty
copies in our own workqueue. I haven't managed to get any really good
profiling data yet, though.
glxgears at 128x128 is nice and smooth, and at 480x320 it's 6fps.
That's not filling 32MHz of SPI. On the other hand, I would have
expected the uncached reads for the 4-to-2 swapped conversion to be able
to go faster than 3.5MB/sec. If it's the uncached reads, we could at
least use NEON for the copy to cached, and probably even do the whole
conversion in NEON with a bit more thought.
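
As a rough check of the numbers in the paragraph above, here is a small back-of-the-envelope sketch. It assumes every frame is a full 480x320 update, that the source buffer is 4 bytes per pixel and the panel data is 2 bytes per pixel, and that SPI carries one bit per clock with no command or protocol overhead. None of these are measured values.

```python
# Back-of-the-envelope bandwidth for a 480x320 panel updated at 6 fps.
# Assumptions (not measurements): full-frame updates, no SPI overhead.
WIDTH, HEIGHT, FPS = 480, 320, 6
SPI_HZ = 32_000_000  # SPI clock from the DT, one bit per clock


def bytes_per_second(bytes_per_pixel: int) -> int:
    """Bytes moved per second for full-frame updates at FPS."""
    return WIDTH * HEIGHT * bytes_per_pixel * FPS


spi_bytes = bytes_per_second(2)   # 16bpp pixels sent to the panel
read_bytes = bytes_per_second(4)  # 32bpp source pixels read for conversion

# Fraction of the SPI clock actually carrying pixel data.
spi_utilisation = spi_bytes * 8 / SPI_HZ
# Frame rate at which pixel data alone would saturate the SPI clock.
max_fps_at_spi = SPI_HZ / (WIDTH * HEIGHT * 2 * 8)

print(f"SPI payload:  {spi_bytes / 1e6:.2f} MB/s ({spi_utilisation:.0%} of the clock)")
print(f"Source reads: {read_bytes / 2**20:.2f} MiB/s")
print(f"SPI-bound frame rate: {max_fps_at_spi:.1f} fps")
```

Under these assumptions the pixel data uses a bit under half of the 32MHz clock, and the 32bpp source reads come to roughly 3.5 MiB per second.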
Another option: use a vc4 RCL to do RGBA8888 to RGB565, since that will
be less pressure on the bus. But then, I suppose I should just figure
out what's going on that makes X11 at RGBA8888 break, and fix that so we
don't even have to do that conversion.
I keep hoping there's some way we could feed output from the DISPSLAVE
HVS register directly to the SPI master -- FIFO32 gets us close (two
16-bit pixels packed next to each other, leftmost in the lower 2 bytes),
but the need for byte swapping (as opposed to R/B swapping) I think
makes it impossible.
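
To make the distinction between byte swapping and R/B swapping concrete, here is a minimal sketch of the 4-to-2 conversion followed by each kind of swap. The function names are hypothetical and this is not the kernel's code; it only illustrates the idea behind a 16-bit byte swap such as the one tinydrm_swab16() is described as doing.

```python
def xrgb8888_to_rgb565(px: int) -> int:
    """Pack a 32-bit XRGB8888 pixel into 16-bit RGB565 (5-6-5 bits)."""
    r = (px >> 16) & 0xFF
    g = (px >> 8) & 0xFF
    b = px & 0xFF
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def swab16(v: int) -> int:
    """Swap the two bytes of a 16-bit value."""
    return ((v & 0xFF) << 8) | (v >> 8)


def swap_rb565(v: int) -> int:
    """Exchange the red and blue fields of an RGB565 pixel."""
    r = (v >> 11) & 0x1F
    g = (v >> 5) & 0x3F
    b = v & 0x1F
    return (b << 11) | (g << 5) | r


green = xrgb8888_to_rgb565(0x0000FF00)
print(hex(green))              # green field only
print(hex(swap_rb565(green)))  # R/B swap leaves green untouched
print(hex(swab16(green)))      # byte swap splits the green field in two
```

The byte swap cuts through the middle of the 6-bit green field, so it cannot be expressed as a reordering of colour channels.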

> Almost all the time is spent in the SPI transfer, so every Hz counts. On
> the Pi there's byte swapping because the DMA capable SPI controller can't
> do 16-bit (tinydrm_swab16()). If I remember correctly this has negligible
> impact on performance.
>
> The SPI controller/driver on the Pi has some restrictions on the speeds
> to choose from because the divisor has to be a multiple of two
> (bcm2835_spi_transfer_one()).

That's weird. My specs say CDIV must be a *power* of two, with lower
values rounded down. I guess that means we might be running things
fast, not slow, though (and at 32MHz out of 250, we should be getting
the same CDIV).
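
The two readings of the divisor rule can be compared with a small sketch. It assumes a 250MHz source clock (the "out of 250" above), that the driver rounds the divisor up to the next even number, and that hardware following the power-of-two reading would round that value down to the nearest power of two. The function names are hypothetical and this is a model of the discussion, not the driver's code.

```python
CLK_HZ = 250_000_000  # assumed SPI source clock


def cdiv_even(clk_hz: int, spi_hz: int) -> int:
    """Smallest even divisor that does not exceed the requested speed."""
    cdiv = -(-clk_hz // spi_hz)  # ceiling division
    cdiv += cdiv % 2             # round up to a multiple of two
    return max(cdiv, 2)


def cdiv_pow2_round_down(clk_hz: int, spi_hz: int) -> int:
    """Even divisor rounded down to a power of two (the other reading)."""
    cdiv = cdiv_even(clk_hz, spi_hz)
    return 1 << (cdiv.bit_length() - 1)


for requested in (32_000_000, 48_000_000):
    even = cdiv_even(CLK_HZ, requested)
    pow2 = cdiv_pow2_round_down(CLK_HZ, requested)
    print(f"requested {requested / 1e6:.0f} MHz: "
          f"even CDIV={even} ({CLK_HZ / even / 1e6:.2f} MHz), "
          f"pow2 CDIV={pow2} ({CLK_HZ / pow2 / 1e6:.2f} MHz)")
```

Under these assumptions a 32MHz request gives the same divisor either way, while a 48MHz request would end up clocked above the requested speed if the power-of-two reading is the correct one.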