Re: [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

From: Libor Michalek
Date: Mon Apr 25 2005 - 22:58:38 EST

Next message: Linus Torvalds: "Re: Mercurial 0.3 vs git benchmarks"
Previous message: Benjamin Herrenschmidt: "Re: [PATCH] PCI: Add pci shutdown ability"
In reply to: Andrew Morton: "Re: [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation"
Next in thread: Roland Dreier: "Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation"

On Mon, Apr 25, 2005 at 04:24:05PM -0700, Andrew Morton wrote:
> Libor Michalek <libor@xxxxxxxxxxx> wrote:
> > On Mon, Apr 25, 2005 at 03:35:42PM -0700, Andrew Morton wrote:
> >
> > > Yes, we expect that all the pages which get_user_pages() pinned
> > > will become unpinned within the context of the syscall which pinned
> > > the pages. Or shortly after, in the case of async I/O.
> >
> > When a network protocol is making use of async I/O the amount of time
> > between posting the read request and getting the completion for that
> > request is unbounded since it depends on the other half of the connection
> > sending some data. In this case the buffer that was pinned during the
> > io_submit() may be pinned, and holding the pages, for a long time.
>
> Sure.
>
> > During
> > this time the process might fork, at this point any data received will be
> > placed into the wrong spot.
>
> Well the data is placed in _a_ spot. That's only the "wrong" spot because
> you've defined it to be wrong!
>
> IOW: what behaviour are you actually looking for here, and why, and does it
> matter?

For example, a network server app has an open connection on which it
uses async IO to submit two buffers for a read operation. Both buffers
are pinned using get_user_pages() and the connection waits for data to
arrive. The connection receives data, it is written into the first buffer,
the app is notified using async IO, and it retrieves the async IO
completion. The app reads the buffer, which happens to contain a command
to spawn a child, and the app forks a child. Now there is still a buffer
posted for read, and if more data arrives on the connection that data is
copied to the physical pages which were saved when the buffer was pinned.
After the fork those pages are shared copy-on-write, so the app's virtual
address may no longer map to them. The app is notified, retrieves the
async IO completion, but when it goes to read that buffer it will not
have the new data.
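
To make the sequence concrete, here is a toy model of it in Python. It
is only a sketch, not kernel code: Page, AddressSpace, pin() and
run_scenario() are made-up names, pin() merely stands in for what
get_user_pages() hands back, and the model assumes the parent is the
first to store to the shared page after the fork.

```python
# Toy model of copy-on-write (COW) after fork(); not kernel code.
# All names here are hypothetical and exist only for illustration.


class Page:
    """A physical page: its contents plus how many address spaces map it."""

    def __init__(self, data=""):
        self.data = data
        self.mappers = 0


class AddressSpace:
    """Per-process mapping from virtual address to physical page."""

    def __init__(self):
        self.table = {}  # vaddr -> Page

    def map(self, vaddr, page):
        self.table[vaddr] = page
        page.mappers += 1

    def read(self, vaddr):
        return self.table[vaddr].data

    def write(self, vaddr, data):
        page = self.table[vaddr]
        if page.mappers > 1:
            # COW fault: the writer is moved to a private copy of the page.
            page.mappers -= 1
            page = Page(page.data)
            self.map(vaddr, page)
        page.data = data

    def fork(self):
        # The child shares every physical page with the parent.
        child = AddressSpace()
        for vaddr, page in self.table.items():
            child.map(vaddr, page)
        return child


def pin(space, vaddr):
    """Stand-in for pinning: remember the physical page behind vaddr."""
    return space.table[vaddr]


def run_scenario():
    parent = AddressSpace()
    parent.map(0x1000, Page("empty"))  # second buffer, still posted for read
    pinned = pin(parent, 0x1000)       # driver saves the physical page
    child = parent.fork()
    # Assumption: the parent stores to the page (any store will do),
    # which breaks COW and remaps the parent to a fresh copy.
    parent.write(0x1000, parent.read(0x1000))
    pinned.data = "new network data"   # incoming data lands in the pinned page
    return parent.read(0x1000), child.read(0x1000)
```

In this model run_scenario() returns the parent's view first and the
child's second: the parent still sees the old contents, while the data
is visible through the child's mapping of the original page.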

> > > This is because there is no file descriptor or anything else associated
> > > with the pages which permits the kernel to clean stuff up on unclean
> > > application exit. Also there are the obvious issues with permitting
> > > pinning of unbounded amounts of memory.
> >
> > Correct, the driver must be able to determine that the process has died
> > and clean up after it, so the pinned region in most implementations is
> > associated with an open file descriptor.
>
> How is that association created?

The kernel module which pinned the memory is responsible for unpinning
it if the file descriptor, which was used to deliver the command that
resulted in the pinning, is closed.
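
As a rough sketch of that bookkeeping, again a toy model in Python
rather than driver code: PinRegistry, pin() and release() are
hypothetical names, and release() stands in for whatever the module
runs when the file descriptor is closed, including on process exit.

```python
# Toy model of per-file-descriptor pin tracking; not driver code.
# All names here are hypothetical and exist only for illustration.


class PinRegistry:
    """Remember which pinned regions belong to which open descriptor."""

    def __init__(self):
        self.by_fd = {}      # fd -> list of regions pinned through that fd
        self.pinned = set()  # every region currently pinned

    def pin(self, fd, region):
        # Record the region against the descriptor that requested the pin.
        self.by_fd.setdefault(fd, []).append(region)
        self.pinned.add(region)

    def release(self, fd):
        # Unpin everything that was pinned through this descriptor.
        for region in self.by_fd.pop(fd, []):
            self.pinned.discard(region)


registry = PinRegistry()
registry.pin(3, "buffer-a")
registry.pin(3, "buffer-b")
registry.pin(4, "buffer-c")
registry.release(3)  # fd 3 closed: its two regions are unpinned
```

Keying the records on the descriptor means an unclean exit needs no
cooperation from the process; closing the descriptor is enough.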

-Libor

