Hi Wei,
On 06/12/2018 07:05 AM, Wei Wang wrote:
On 06/12/2018 09:59 AM, Linus Torvalds wrote:I have been working on a similar series [1] that is more generic, which
On Mon, Jun 11, 2018 at 6:36 PM Michael S. Tsirkin <mst@xxxxxxxxxx>OK, I will implement a new version based on the suggestions. Thanks.
wrote:
Maybe it will help to have GFP_NONE which will make any allocationIt would definitely have helped me initially overlook that call chain.
fail if attempted. Linus, would this address your comment?
But then when I started looking at the whole dma_map_page() thing, it
just raised my hackles again.
I would seriously suggest having a much simpler version for the "no
allocation, no dma mapping" case, so that it's *obvious* that that
never happens.
So instead of having virtio_balloon_send_free_pages() call a really
generic complex chain of functions that in _some_ cases can do memory
allocation, why isn't there a short-circuited "vitruque_add_datum()"
that is guaranteed to never do anything like that?
Honestly, I look at "add_one_sg()" and it really doesn't make me
happy. It looks hacky as hell. If I read the code right, you're really
trying to just queue up a simple tuple of <pfn,len>, except you encode
it as a page pointer in order to play games with the SG logic, and
then you hmap that to the ring, except in this case it's all a fake
ring that just adds the cpu-physical address instead.
And to figuer that out, it's like five layers of indirection through
different helper functions that *can* do more generic things but in
this case don't.
And you do all of this from a core VM callback function with some
_really_ core VM locks held.
That makes no sense to me.
How about this:
- get rid of all that code
- make the core VM callback save the "these are the free memory
regions" in a fixed and limited array. One that DOES JUST THAT. No
crazy "SG IO dma-mapping function crap". Just a plain array of a fixed
size, pre-allocated for that virtio instance.
- make it obvious that what you do in that sequence is ten
instructions and no allocations ("Look ma, I wrote a value to an array
and incremented the array idex, and I'M DONE")
- then in that workqueue entry that you start *anyway*, you empty the
array and do all the crazy virtio stuff.
In fact, while at it, just simplify the VM interface too. Instead of
traversing a random number of buddy lists, just trraverse *one* - the
top-level one. Are you seriously ever going to shrink or mark
read-only anythin *but* something big enough to be in the maximum
order?
MAX_ORDER is what, 11? So we're talking 8MB blocks. Do you *really*
want the balloon code to work on smaller things, particularly since
the whole interface is fundamentally racy and opportunistic to begin
with?
solves the problem of giving unused memory back to the host and could be
used to solve the migration problem as well. Can you take a look and see
if you can use my series in some way?