Re: [PATCH v3 0/3] counter: add GPIO-based quadrature encoder driver

From: Wadim Mueller

Date: Wed May 06 2026 - 02:50:52 EST


On 2026-05-04 18:36, William Breathitt Gray wrote:
> On Fri, May 01, 2026 at 10:07:46PM +0200, Wadim Mueller wrote:
> > This series adds a new counter subsystem driver that implements
> > quadrature encoder position tracking using plain GPIO pins with
> > edge-triggered interrupts.
> >
> > The driver is intended for low to medium speed rotary encoders where
> > hardware counter peripherals (eQEP, FTM, etc.) are unavailable or
> > already in use. It targets the same use-cases as interrupt-cnt.c but
> > provides full quadrature decoding instead of simple pulse counting.
> >
> > Features:
> > - X1, X2, X4 quadrature decoding and pulse-direction mode
> > - Optional index signal for zero-reset
> > - Configurable ceiling (position clamping)
> > - Standard counter subsystem sysfs + chrdev interface
> > - Enable/disable via sysfs with IRQ gating
> >
> > Tested on TI AM64x (Cortex-A53) with a motor-driven rotary encoder
> > at up to 2 kHz quadrature edge rate.
>
> Hello Wadim,
>
> This is certainly a neat idea! :-) Several times I have wished for a
> convenient way to just plug in a quadrature encoder to the GPIO lines of
> my system and immediately start reading position data. However, I want
> to be sure this makes sense as a Counter subsystem driver before I
> proceed with a full review.
>
> If I understand correctly from my brief overview, the core approach in
> the gpio-quadrature-encoder module is to take two GPIO lines (A and B),
> setup interrupt service routines for them, compare their GPIO values on
> each interrupt, and respectively update a persistent count based on the
> quadrature relationship.
>
> From that description, I don't immediately see a need for this to occur
> in kernelspace. Couldn't the same design be accomplished effectively in
> userspace via the libgpiod API[^1]? I believe that library allows you
> to watch for GPIO edge events and request GPIO line values. (I'm CCing
> the GPIO subsystem maintainers in case I'm missing something obvious
> here.)
>
> Although the Counter subsystem does provide an established user
> interface for counter devices, I'm not sure that alone justifies a
> kernel driver when the same can be achieved by an equivalent userspace
> application. If you can argue for why this should exist in the kernel,
> I'll feel more comfortable with accepting the Counter subsystem as the
> right home for the gpio-quadrature-encoder module.
>
> > Changes in v3:
> > - Pick up Acked-by: Conor Dooley on the DT binding patch.
> > - No code changes.
>
> As an aside, you don't need to resend the patchset if there are no code
> changes, I'll make sure to pick up the tags in the mail threads when the
> patches are accepted. This helps reduce the number of messages we need
> to parse on the mailing list.
>
> Thanks,
>
> William Breathitt Gray
>
> [^1] https://libgpiod.readthedocs.io/

Hi,

to give the discussion "why a kernel driver, shouldn't gpiomon be
enough?" some real numbers, I ran a small benchmark on the AM64x
board this driver was developed on. From my side the data below
suggests a kernel-side counter is the right tool for this job, but
in the end this is of course your decision as the maintainer.
Please tell me whether I should send a v4 or rather drop the
series, so I know how to continue.

Setup
-----
SoC: TI AM64x, Cortex-A53 dual core, gpio-davinci
Kernel: 6.6.32, CONFIG_PREEMPT=y, CONFIG_HZ=250, no isolcpus, no RT
Source: EHRPWM driving one GPIO line (square wave)
Window: 2.0 s per point, 3 runs per point (mean +- stdev in the repo)
A: counter/gpio-quadrature-encoder, pulse-direction, B held low,
   signal_a_action=rising-edge
B: gpiomon -c gpiochipN -e rising -F %o (libgpiod v2.1.2 CLI),
   lines counted from the output file afterwards

Both paths see the same physical edges.
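(For context on what "decoding" means here: in the pulse-direction mode
used for the bench the ISR only adds +/-1 per edge, while full X4
quadrature reduces to a small Gray-code transition lookup per sample.
An illustrative Python sketch of the latter, my own illustration rather
than the driver's source, with an arbitrary sign convention:)

```python
# Illustrative X4 quadrature decode: each sample of (A, B) is compared
# with the previous one via a Gray-code transition table that yields
# -1, 0, or +1 counts. My own sketch with an arbitrary sign convention,
# not the driver's actual source.
STEP = {
    (0, 0, 0, 1): +1, (0, 1, 1, 1): +1, (1, 1, 1, 0): +1, (1, 0, 0, 0): +1,
    (0, 0, 1, 0): -1, (1, 0, 1, 1): -1, (1, 1, 0, 1): -1, (0, 1, 0, 0): -1,
}

def decode(samples):
    """Fold a sequence of (A, B) samples into a signed position."""
    pos = 0
    prev = samples[0]
    for cur in samples[1:]:
        pos += STEP.get(prev + cur, 0)  # 0: no change or invalid jump
        prev = cur
    return pos

# One full clockwise electrical cycle is +4 counts in X4 mode.
cw = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
print(decode(cw))        # -> 4
print(decode(cw[::-1]))  # reversed rotation -> -4
```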

Rising edge counts (mean error vs. expected, n=3)
-------------------------------------------------
f [Hz]    kernel err%   gpiomon err%
  1000        -0.16         -0.13
 10000        -0.42         -0.88
 20000        -0.89         -3.18
 50000        -2.37         -6.77
 75002        -0.29        -10.93
100000        -0.65        -20.53
150000         n/a*        -42.46
200000         n/a*        -60.74
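(For clarity, the err% column is just the counted edge total against
the expected f * window edges; a minimal Python sketch of that
bookkeeping, my restatement rather than the repo's actual script:)

```python
# Illustrative: how the err% column is derived (not the bench's script).
def err_pct(counted, freq_hz, window_s=2.0):
    """Percent error of a counted edge total vs. the expected f * T edges."""
    expected = freq_hz * window_s
    return 100.0 * (counted - expected) / expected

# Example: at 75 kHz over a 2 s window, 150000 edges are expected;
# counting 149565 edges gives the roughly -0.29 % seen above.
print(round(err_pct(149565, 75000), 2))  # -> -0.29
```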

CPU cost at 75 kHz
------------------
           task (usr+sys)   irq+softirq (sum of 2 cores)
kernel:          0 %             ~50 %
gpiomon:        ~60 %            ~25 %

Some interpretation
-------------------
- Below ~10 kHz: not really distinguishable from quantization
  noise (one rising edge of phase ambiguity per measurement
  window), so no meaningful difference between the two paths here.

- 20 - 100 kHz: the gpiomon error grows more or less linearly with
  the rate while the kernel counter stays at the noise floor. At
  75 kHz gpiomon drops roughly one event in nine and consumes 60 %
  of one core in user/sys time, on top of the IRQ work; the kernel
  counter loses ~0.3 % at zero task CPU.

- *At >= 150 kHz the davinci-gpio bank IRQ saturates at around
  200k irq/s. This is a SoC limit, not a software-stack limit, and
  it applies to both paths. The bench harness aborts the kernel
  sweep on >30 % apparent loss to stay away from soft lockups; the
  gpiomon numbers above this point are only listed to show that the
  same hardware ceiling costs the userspace path far more usable
  counts, because of the poll/read latency on top of the IRQ.

Honest caveats
--------------
- "gpiomon" here is the libgpiod reference CLI: single threaded,
writes one text line per event into a file. A hand written uAPI
v2 consumer with a tight read() loop and a binary buffer would
most likley be cheaper. The IRQ rate ceiling at ~200k irq/s on
this SoC is not affected from this. Patches against the bench
are very welcome.
- The driver is benchmarked in pulse-direction mode with B held
  low, so one IRQ per edge. A real two-channel quadrature source
  would double the bank IRQ load accordingly.
- CONFIG_HZ=250, no PREEMPT_RT, no isolcpus. PREEMPT_RT and pinning
  would soften the userspace cliff but do not change the per-edge
  cost of the kernel path.
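To make the "binary buffer" remark above concrete: such a consumer
would read() fixed-size struct gpio_v2_line_event records straight
from the line-request fd, with no text formatting in the hot path. A
Python sketch of just the record parsing, based on my reading of the
uAPI v2 layout in linux/gpio.h (please double-check the struct against
your kernel headers):

```python
import struct

# struct gpio_v2_line_event from linux/gpio.h (uAPI v2), as I read it:
#   __u64 timestamp_ns; __u32 id; __u32 offset;
#   __u32 seqno; __u32 line_seqno; __u32 padding[6];
EVENT_FMT = "<Q4I6I"            # little-endian, 48 bytes total
EVENT_SIZE = struct.calcsize(EVENT_FMT)
RISING, FALLING = 1, 2          # GPIO_V2_LINE_EVENT_{RISING,FALLING}_EDGE

def count_rising(buf):
    """Count rising-edge events in a buffer of packed event records."""
    n = 0
    for off in range(0, len(buf) - EVENT_SIZE + 1, EVENT_SIZE):
        event_id = struct.unpack_from(EVENT_FMT, buf, off)[1]
        if event_id == RISING:
            n += 1
    return n
```

In the real consumer, buf would come from read() on the fd returned
by GPIO_V2_GET_LINE_IOCTL.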

Reproducer, raw CSVs and plots:
https://github.com/wafgo/qenc-bench

(README.md has the full sweep and plots; data/aggregate.csv has the
table above with full precision.)

What I would like to know
-------------------------
My reading of the numbers is that on this kind of SoC a kernel-side
edge counter is the only way to get correct counts at industrial
encoder rates without burning a whole core on a userspace listener,
and that the proposed driver fits exactly this role. But like I
said, this is your subsystem and your call. So concretely:

- if you would like me to send a v4 (with whatever changes from
  this round you want me to fold in), I am happy to do that;
- if you would rather not take the driver at all, please tell me
  so and I will drop the series. I would just like to know either
  way, so I can stop sitting on the branch.

Thanks for taking the time to look at it.