On Tue, Dec 27, 2022 at 12:30:35PM +0800, Jason Wang wrote:
If the goal is to avoid disrupting traffic while CVQ is in use,
But device is still going and will later use the buffers.
Avoiding infinite wait/poll is one of the goals, another is to sleep.
Same for timeout really.
If we think the timeout is hard, we can start from the wait.
Thanks
that sounds more reasonable. E.g. someone is turning on promisc,
a spike in CPU usage might be unwelcome.
things we should be careful to address then:
1- debugging. Currently it's easy to see a warning if CPU is stuck
in a loop for a while, and we also get a backtrace.
E.g. with this - how do we know who has the RTNL?
We need to integrate with kernel/watchdog.c for good results
and to make sure policy is consistent.
2- overhead. In a very common scenario when device is in hypervisor,
programming timers etc has a very high overhead, at bootup
lots of CVQ commands are run and slowing boot down is not nice.
let's poll for a bit before waiting?
3- surprise removal. need to wake up thread in some way. what about
other cases of device breakage - is there a chance this
introduces new bugs around that? at least enumerate them please.
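
To make points 2 and 3 concrete, here is a rough userspace model of "poll for a bit, then wait, and wake on removal". It is only a sketch and not kernel code: struct sim_cvq, cvq_wait(), the poll budget and the tick counter are all made up for illustration, and the "sleep" just jumps simulated time to the next event instead of blocking on a real wait queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes, used by this sketch only. */
enum cvq_wait_result {
	CVQ_DONE_BY_POLL,	/* completed during the short busy-poll phase */
	CVQ_DONE_BY_SLEEP,	/* completed after the waiter went to sleep */
	CVQ_DEVICE_GONE		/* woken because the device was removed */
};

/* Simulated control virtqueue: time is a tick counter, not a real clock. */
struct sim_cvq {
	unsigned int now;		/* current tick */
	unsigned int done_at;		/* tick at which the command completes */
	unsigned int removed_at;	/* tick of surprise removal, 0 = never */
	unsigned int sleeps;		/* how many times the waiter slept */
};

static bool sim_removed(const struct sim_cvq *vq)
{
	return vq->removed_at != 0 && vq->now >= vq->removed_at;
}

static bool sim_done(const struct sim_cvq *vq)
{
	return vq->now >= vq->done_at;
}

/*
 * Stand-in for a blocking wait: advance time to the next event,
 * completion or removal, whichever comes first.
 */
static void sim_sleep_until_event(struct sim_cvq *vq)
{
	unsigned int wake = vq->done_at;

	if (vq->removed_at != 0 && vq->removed_at < wake)
		wake = vq->removed_at;
	if (wake > vq->now)
		vq->now = wake;
	vq->sleeps++;
}

static enum cvq_wait_result cvq_wait(struct sim_cvq *vq,
				     unsigned int poll_budget)
{
	unsigned int i;

	assert(vq != NULL);

	/* Phase 1: bounded busy poll, one tick per iteration. */
	for (i = 0; i < poll_budget; i++) {
		if (sim_removed(vq))
			return CVQ_DEVICE_GONE;
		if (sim_done(vq))
			return CVQ_DONE_BY_POLL;
		vq->now++;
	}

	/* Phase 2: sleep, rechecking both conditions on every wakeup. */
	for (;;) {
		if (sim_removed(vq))
			return CVQ_DEVICE_GONE;
		if (sim_done(vq))
			return vq->sleeps ? CVQ_DONE_BY_SLEEP : CVQ_DONE_BY_POLL;
		sim_sleep_until_event(vq);
	}
}
```

In this model a command that completes within the poll budget never sleeps, so the common fast case (e.g. the many CVQ commands at bootup) would not pay for a timer or wakeup. A slow command sleeps once, and a removal that happens first ends the wait instead of leaving the thread stuck. Whatever the real waker turns out to be, the part that matters is that removal is checked on every wakeup.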