Re: [00/17] Large Blocksize Support V3
From: Eric W. Biederman
Date: Thu Apr 26 2007 - 15:36:48 EST
Jens Axboe <jens.axboe@xxxxxxxxxx> writes:
> On Thu, Apr 26 2007, Christoph Lameter wrote:
>> On Thu, 26 Apr 2007, Jens Axboe wrote:
>>
>> > On Thu, Apr 26 2007, Christoph Lameter wrote:
>> > > On Thu, 26 Apr 2007, Jens Axboe wrote:
>> > >
>> > > > The above can be implemented fairly cleanly, and on a need-to-have
>> > > > basis. It's not something that'll break drivers.
>> > >
>> > > But its also not going to fix the hacks that we have in the kernel
>> > > to deal with > PAGE_SIZE i/o.
>> >
>> > No, but that's a _seperate_ issue! Don't keep mixing up the two.
>>
>> Yes I understand that you want it to be a separate issue so we get get
>> more rationales for the hacks that we do to avoid the large
>> order allocations.
>
> Christoph, don't take your frustrations out on me. I've several times in
> this thread said that I'd LIKE to have > PAGE_SIZE support in the page
> cache. I WROTE the initial pktcdvd driver that is a primary example of
> these hacks, I'm very well aware of the pain and bugs involved with
> that.
>
> But don't push large pages as the only solution to larger ios, because
> that is trivially not true.
Just taking a quick look at pktcdvd that does appear to be a case where
we want to express to the rest of the system that our block device has
a > 2KB block size.
I think it would be very useful to express to the rest of the system
that yes indeed this device has a 32K sector size (or whatever the
real limit is) instead of hiding that fact away.
Now I'm not a block layer expert and my knowledge of the page cache
is rusty. So I can't immediately point to a way we can do this.
I'll guess that it will require cleaning up some old crufty code that
no one wants to touch.
I will point out that while this appears to require support from
upper levels it also does not require a larger physical page size
in the page cache.
Am I correct in assuming that the problem is primarily about getting
filesystems (and other upper layers) to submit BIOs that take into
consideration the larger block size of the underlying device, so
that read/modify write is not needed in the pktcdvd layer?
Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/