Re: [PATCH 4/4] input: evdev: only wake poll on EV_SYN

From: Jeffrey Brown
Date: Mon Mar 28 2011 - 04:55:22 EST


Hi Dmitry,

On Sun, Mar 27, 2011 at 11:12 PM, Dmitry Torokhov
<dmitry.torokhov@xxxxxxxxx> wrote:
>>     count = read(fd, buffer, sizeof(buffer) / sizeof(buffer[0]));
>
> I hope this is simply a typo in pseudo-code - read takes size in bytes,
> not in number of structures.

Definitely a typo. :)

> Unfortunately your change fixes only first packet, like I mentioned.
> Consider the following scenario:
>
>  - input core delivers events, we postpone waking up waiters
>   till we get EV_SYN/SYN_REPORT;
>  - userspace is waken and consumes entire packet;
>  - in the meantime input core delivered 3 more events;
>  - userpsace executes poll;
>  - kernel adds the process to poll waiters list (poll_wait() call in
>   evdev_poll();
>  - evdev_poll() checks the condition, sees that there are events and
>   signals that the data is ready even though we did not accumulate
>   full event packet.
>
> Hence your fix did not reliably fix the issue you are seeing.

Ahh, I see what you mean. As long as the buffer is non-empty, poll()
considers the stream to be readable therefore it does not block.
That's a good thing as otherwise clients that poll() / read() one
event at a time would be broken.

Delaying when we wake waiters is helpful in the common case but as you
point out there still exists a potential degenerate case if the writer
is busy and prevents the reader from ever catching up completely and
blocking when the buffer is empty. I haven't seen that happen but
it's certainly possible. In any case, it would be no worse than what
we have now.

> We might entertain notion of not considering device readable unless
> there is a sync event that has not been consumed, but this is
> significant change in semantics and we need much more consideration.

Indeed. That's why I brought the idea here for discussion. :)

For maximum compatibility we could define an ioctl to enable new
readability semantics. Existing clients would retain the old behavior
(device is readable whenever it is non-empty). I'm not sure this is
really necessary but it might be useful to help diagnose bad drivers
that never write sync packets.

Suppose we adopt the invariant that when new readability semantics are
enabled, clients are only allowed to read events that belong to
complete packets. That is, they can read events one at a time or in
batches all they like but only up to and including the last sync
event.

Implementing this behavior is straightforward.

Keep track of the end index one past the last readable event in the
ring buffer. The end index marks the end of the readable portion of
the buffer. Initially the read index and end index are the same
(zero). The read index is never allowed to advance past the end
index. When the read index equals the end index, the device is not
readable. (Equivalently, the device is only readable when the buffer
contains at least one sync event that has not yet been read.) When a
sync event (other than SYN_MT_REPORT) is written, advance the end
index past the sync event so that the entire packet becomes readable,
then wake up the waiters because readability may have changed.

How's that sound?

Thanks,
Jeff.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/