Re: [RFC PATCH] net, tun: remove the flow cache

From: Zhi Yong Wu
Date: Tue Dec 17 2013 - 21:08:33 EST


On Tue, Dec 17, 2013 at 6:05 PM, Jason Wang <jasowang@xxxxxxxxxx> wrote:
> On 12/17/2013 05:13 PM, Zhi Yong Wu wrote:
>> On Tue, Dec 17, 2013 at 4:49 PM, Jason Wang <jasowang@xxxxxxxxxx> wrote:
>>> > On 12/17/2013 03:26 PM, Zhi Yong Wu wrote:
>>>> >> From: Zhi Yong Wu <wuzhy@xxxxxxxxxxxxxxxxxx>
>>>> >>
>>>> >> The flow cache is an extremely broken concept, and it usually brings up
>>>> >> growth issues and DoS attacks, so this patch is trying to remove it from
>>>> >> the tuntap driver, and insteadly use a simpler way for its flow control.
>>> >
>>> > NACK.
>>> >
>>> > This single function revert does not make sense to me. Since:
>> IIRC, the tuntap flow cache is only used to save the mapping of skb
>> packet <-> queue index. My idea only save the queue index in skb_buff
>> early when skb buffer is filled, not in flow cache as the current
>> code. This method is actually more simpler and completely doesn't need
>> any flow cache.
>
> Nope. Flow caches record the flow to queues mapping like what most
> multiqueue nic does. The only difference is tun record it silently while
> most nic needs driver to tell the mapping.
Just check virtio specs, i seem to miss the fact that flow cache
enable packet steering in mq mode, thanks for your comments. But i
have some concerns about some of your comments.
>
> What your patch does is:
> - set the queue mapping of skb during tun_get_user(). But most drivers
> using XPS or processor id to select the real tx queue. So the real txq
> depends on the cpu that vhost or qemu is running. This setting does not
Doesn't those drivers invoke netdev_pick_tx() or its counterpart to
select real tx queue? e.g. tun_select_queue(). or can you say it with
an example?
Moreover, how do those drivers know which cpu vhost or qemu is running on?
> have any effect in fact.
> - the queue mapping of skb were fetched during tun_select_queue(). This
> value is usually set by a multiqueue nic to record which hardware rxq
> was this packet came.
ah? Can you let me know where a mq nic controller set it?
>
> Can you explain how your patch works exactly?
You have understood it.
>>> >
>>> > - You in fact removes the flow steering function in tun. We definitely
>>> > need something like this to unbreak the TCP performance in a multiqueue
>> I don't think it will downgrade the TCP perf even in mq guest, but my
>> idea maybe has better TCP perf, because it doesn't have any cache
>> table lookup, etc.
>
> Did you test and compare the performance numbers? Did you run profiler
> to see how much does the lookup cost?
No, As i jus said above, i miss that flow cache can enable packet
steering. But Did you do related perf testing? To be honest, i am
wondering how much perf the packet steering can improve. Actually it
also injects a lot of cache lookup cost.
>>> > guest. Please have a look at the virtio-net driver / virtio sepc for
>>> > more information.
>>> > - The total number of flow caches were limited to 4096, so there's no
>>> > DoS or growth issue.
>> Can you check why the ipv4 routing cache is removed? maybe i miss
>> something, if yes, pls correct me. :)
>
> The main differences is that the flow caches were best effort. Tun can
> not store all flows to queue mapping, and even a hardware nic can not do
> this. If a packet misses the flow cache, it's safe to distribute it
> randomly or through another method. So the limitation just work.
Exactly, we can know this from tun_select_queue().
>
> Could you please explain the DoS or growth issue you meet here?
>>> > - The only issue is scalability, but fixing this is not easy. We can
>>> > just use arrays/indirection table like RSS instead of hash buckets, it
>>> > saves some time in linear search but has other issues like collision
>>> > - I've also had a RFC of using aRFS in the past, it also has several
>>> > drawbacks such as busy looping in the networking hotspot.
>>> >
>>> > So in conclusion, we need flow steering in tun, just removing current
>>> > method does not help. The proper way is to expose several different
>>> > methods to user and let user to choose the preferable mechanism like
>>> > packet.
>> By the way, let us look at what other networking guys think of this,
>> such as MST, dave, etc. :)
>>
>
> Of course.



--
Regards,

Zhi Yong Wu
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/