Re: [PATCH 0/3]HTLB mapping for drivers (take 2)

From: Eric B Munson
Date: Wed Aug 19 2009 - 06:35:57 EST

Next message: Mel Gorman: "Re: abnormal OOM killer message"
Previous message: DDD: "[PATCH] netpoll: WARN_ONCE for start_xmit returns with interruptsenabled"
In reply to: Mel Gorman: "Re: [PATCH 0/3]HTLB mapping for drivers (take 2)"
Next in thread: Alexey Korolev: "Re: [PATCH 0/3]HTLB mapping for drivers (take 2)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, Aug 19, 2009 at 11:05 AM, Mel Gorman<mel@xxxxxxxxx> wrote:
> On Wed, Aug 19, 2009 at 05:48:11PM +1200, Alexey Korolev wrote:
>> Hi,
>> >
>> > It sounds like this patch set working towards the same goal as my
>> > MAP_HUGETLB set. The only difference I see is you allocate huge page
>> > at a time and (if I am understanding the patch) fault the page in
>> > immediately, where MAP_HUGETLB only faults pages as needed. Does the
>> > MAP_HUGETLB patch set provide the functionality that you need, and if
>> > not, what can be done to provide what you need?
>> >
>>
>> Thanks a lot for willing to help. I'll be much appreciate if you have
>> an interesting idea how HTLB mapping for drivers can be done.
>>
>> It is better to describe use case in order to make it clear what needs
>> to be done.
>> Driver provides mapping of device DMA buffers to user level
>> applications.
>
> Ok, so the buffer is in normal memory. When mmap() is called, the buffer
> is already populated by data DMA'd from the device. That scenario rules out
> calling mmap(MAP_ANONYMOUS|MAP_HUGETLB) because userspace has access to the
> buffer before it is populated by data from the device.
>
> However, it does not rule out mmap(MAP_ANONYMOUS|MAP_HUGETLB) when userspace
> is responsible for populating a buffer for sending to a device. i.e. whether it
> is suitable or not depends on when the buffer is populated and who is doing it.
>
>> User level applications process the data.
>> Device is using a master DMA to send data to the user buffer, buffer
>> size can be >1GB and performance is very important. (So huge pages
>> mapping really makes sense.)
>>
>
> Ok, so the DMA may be faster because you have to do less scatter/gather
> and can DMA in larger chunks and and reading from userspace may be faster
> because there is less translation overhead. Right?
>
>> In addition we have to mention that:
>> 1. It is hard for user to tell how much huge pages needs to be
>> reserved by the driver.
>
> I think you have this problem either way. If the buffer is allocated and
> populated before mmap(), then the driver is going to have to guess how many
> pages it needs. If the DMA occurs as a result of mmap(), it's easier because
> you know the number of huge pages to be reserved at that point and you have
> the option of falling back to small pages if necessary.
>
>> 2. Devices add constrains on memory regions. For example it needs to
>> be contiguous with in the physical address space. It is necessary to
>> have ability to specify special gfp flags.
>
> The contiguity constraints are the same for huge pages. Do you mean there
> are zone restrictions? If so, the hugetlbfs_file_setup() function could be
> extended to specify a GFP mask that is used for the allocation of hugepages
> and associated with the hugetlbfs inode. Right now, there is a htlb_alloc_mask
> mask that is applied to some additional flags so htlb_alloc_mask would be
> the default mask unless otherwise specified.
>
>> 3 The HW needs to access physical memory before the user level
>> software can access it. (Hugetlbfs picks up pages on page fault from
>> pool).
>> It means memory allocation needs to be driven by device driver.
>>
>
> How about;
>
> o Extend Eric's helper slightly to take a GFP mask that is
> associated with the inode and used for allocations from
> outside the hugepage pool
> o A helper that returns the page at a given offset within
> a hugetlbfs file for population before the page has been
> faulted.
>
> I know this is a bit hand-wavy, but it would allow significant sharing
> of the existing code and remove much of the hugetlbfs-awareness from
> your current driver.
>
>> Original idea was: create hugetlbfs file which has common mapping with
>> device file. Allocate memory. Populate page cache of hugetlbfs file
>> with allocated pages.
>> When fault occurs, page will be taken from page cache and then
>> remapped to user space by hugetlbfs.
>>
>> Another possible approach is described here:
>> http://marc.info/?l=linux-mm&m=125065257431410&w=2
>> But currently not sure will it work or not.
>>
>>
>> Thanks,
>> Alexey
>>
>
> --
> Mel Gorman
> Part-time Phd Student Linux Technology Center
> University of Limerick IBM Dublin Software Lab
>

Alexey,

I'd be willing to take a stab at a prototype of Mel's suggestion based
on my patch set if you this it would be useful to you.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Mel Gorman: "Re: abnormal OOM killer message"
Previous message: DDD: "[PATCH] netpoll: WARN_ONCE for start_xmit returns with interruptsenabled"
In reply to: Mel Gorman: "Re: [PATCH 0/3]HTLB mapping for drivers (take 2)"
Next in thread: Alexey Korolev: "Re: [PATCH 0/3]HTLB mapping for drivers (take 2)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]