Re: [PATCH v2] packet: track ring entry use using a shadow ring to prevent RX ring overrun
From: Willem de Bruijn
Date: Tue May 22 2018 - 10:48:25 EST
>>> I think the bigger issues as you've pointed out are the cost of
>>> the additional spin lock and should the additional state be
>>> stored in-band (fewer cache lines) or out-of band (less risk of
>>> breaking due to unpredictable application behavior).
>>
>> We don't need the spinlock if clearing the shadow byte after
>> setting the status to user.
>>
>> Worst case, user will set it back to kernel while the shadow
>> byte is not cleared yet and the next producer will drop a packet.
>> But next producers will make progress, so there is no deadlock
>> or corruption.
>
> I thought so too for a while but after spending more time than I
> care to admit I relized the following sequence was occuring:
>
> Core A Core B
> ------ ------
> - Enter spin_lock
> - Get tp_status of head (X)
> tp_status == 0
> - Check inuse
> inuse == 0
> - Allocate entry X
> advance head (X+1)
> set inuse=1
> - Exit spin_lock
>
> <very long delay>
>
> <allocate N-1 entries
> where N = size of ring>
>
> - Enter spin_lock
> - get tp_status of head (X+N)
> tp_status == 0 (but slot
> in use for X on core A)
>
> - write tp_status of <--- trouble!
> X = TP_STATUS_USER <--- trouble!
> - write inuse=0 <--- trouble!
>
> - Check inuse
> inuse == 0
> - Allocate entry X+N
> advance head (X+N+1)
> set inuse=1
> - Exit spin_lock
>
>
> At this point Core A just passed slot X to userspace with a
> packet and Core B has just been assigned slot X+N (same slot as
> X) for it's new packet. Both cores A and B end up filling in that
> slot. Tracking ths donw was one of the reasons it took me a
> while to produce these updated diffs.
Is this not just an ordering issue? Since inuse is set after tp_status,
it has to be tested first (and barriers are needed to avoid reordering).