Re: [PATCH v2 4/7] sysfs: Add SYSFS_HUGE_BIN_FILE flag for binary attributes larger than PAGE_SIZE

From: K Prateek Nayak

Date: Wed May 13 2026 - 00:00:00 EST


Hello Ilpo,

On 5/12/2026 5:14 PM, Ilpo Järvinen wrote:
>> Historically, sysfs read buffers were allocated with get_zeroed_page(),
>> limiting reads to PAGE_SIZE. Commit 13c589d5b0ac ("sysfs: use seq_file
>> when reading regular files") transitioned regular (text) attribute reads
>> to seq_file, which can dynamically grow buffers beyond PAGE_SIZE.
>
>
>> However, the PAGE_SIZE limit was intentionally preserved for
>> compatibility. When binary attribute handling was later unified into
>> the same codebase, the non-seq_file read path (kernfs_file_read_iter)
>> retained this PAGE_SIZE cap for binary files as well.
>
> I tried to investigate these claims but with the lack of references,
> I didn't get very far. At least the thread where 13c589d5b0ac came from
> didn't seem to clearly say the things claimed here (assuming I managed
> to find all its emails from the archives).

For the most part, I compared the code as it existed at the time of
13c589d5b0ac with what we have now.

Prior to that commit, fill_read_buffer() was the read function, and it
used get_zeroed_page() for buffering. This is also the reason we have
these defensive bits in the current sysfs_kf_seq_show():

	if (count >= (ssize_t)PAGE_SIZE) {
		...
		/* Try to struggle along */
		count = PAGE_SIZE - 1;
	}

Also see commit 815d2d50da41 ("driver core: debug for bad
dev_attr_show() return value.") which added that printk() for debugging
violators of the PAGE_SIZE constraint back in the day.

Once sysfs had a seq_file path, the seq_file side handled buffering. It
does so by calling ->show() in a loop, doubling the seq_file buffer
size each time the content doesn't fit in the given buffer.
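
As a rough illustration of that grow-and-retry behavior, here is a small
userspace sketch. It is not the kernel code: toy_seq, toy_show() and
toy_read() are made-up names, and the starting size is whatever the
caller passes in. It only mimics the "count == size means overflow, so
double the buffer and call again" convention:

```c
#include <stdlib.h>
#include <string.h>

/* Made-up stand-in for struct seq_file: a buffer plus a fill count. */
struct toy_seq {
	char *buf;
	size_t size;	/* capacity of buf */
	size_t count;	/* bytes used; count == size means "overflowed" */
};

/* Made-up show callback: copy len bytes in, or flag an overflow. */
static void toy_show(struct toy_seq *m, const char *data, size_t len)
{
	if (m->count + len < m->size) {
		memcpy(m->buf + m->count, data, len);
		m->count += len;
		return;
	}
	m->count = m->size;	/* content did not fit */
}

/*
 * Retry loop: start at start_size and double until the content fits.
 * Returns the buffer size that finally worked (0 on bad input or
 * allocation failure); *attempts is the number of toy_show() calls.
 */
static size_t toy_read(const char *data, size_t len, size_t start_size,
		       unsigned int *attempts)
{
	struct toy_seq m = { .buf = NULL, .size = start_size, .count = 0 };
	size_t final_size;

	*attempts = 0;
	if (start_size == 0)
		return 0;

	for (;;) {
		m.buf = malloc(m.size);
		if (!m.buf)
			return 0;
		m.count = 0;
		(*attempts)++;
		toy_show(&m, data, len);
		if (m.count != m.size)
			break;		/* it fit, we are done */
		free(m.buf);		/* overflow: retry with 2x the size */
		m.size <<= 1;
	}

	final_size = m.size;
	free(m.buf);
	return final_size;
}
```

With a 13000-byte payload and a 4096-byte starting buffer, this takes
three attempts (4096, 8192, 16384) before the content fits.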

This is also the reason we have:

	count = seq_get_buf(sf, &buf);
	if (count < PAGE_SIZE) {
		seq_commit(sf, -1);
		return 0;
	}

which ensures we have at least PAGE_SIZE worth of buffer before calling
the read function; otherwise, we spoof an overflow and let the seq_file
bits give us a bigger buffer when we retry the read.

* All snippets are from sysfs_kf_seq_show() in fs/sysfs/file.c
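
To make the effect of that guard concrete, here is another userspace
sketch, again with made-up names (toy_sf, toy_attr_show(),
toy_guarded_show(), toy_read_guarded()) and an assumed 4 KiB page size.
It models only the idea: the attribute callback is never invoked with
less than a page of room, and a too-small buffer is reported as an
overflow so that the outer loop grows it first:

```c
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Made-up stand-in for struct seq_file. */
struct toy_sf {
	char *buf;
	size_t size;		/* capacity of buf */
	size_t count;		/* bytes used; count == size means overflow */
	unsigned int show_calls;	/* times the attribute callback ran */
};

/* Made-up attribute callback that is allowed to assume a page of room. */
static size_t toy_attr_show(char *buf)
{
	static const char msg[] = "hello\n";

	memcpy(buf, msg, sizeof(msg) - 1);
	return sizeof(msg) - 1;
}

/* Models the guard: refuse to call the callback with < a page free. */
static void toy_guarded_show(struct toy_sf *m)
{
	size_t room = m->size - m->count;	/* roughly seq_get_buf() */

	if (room < TOY_PAGE_SIZE) {
		m->count = m->size;		/* roughly seq_commit(sf, -1) */
		return;
	}
	m->show_calls++;
	m->count += toy_attr_show(m->buf + m->count);
}

/*
 * Outer retry loop: double the buffer until the guarded show succeeds.
 * Returns the final buffer size (0 on bad input or allocation failure).
 */
static size_t toy_read_guarded(size_t start_size, unsigned int *show_calls)
{
	struct toy_sf m = { .buf = NULL, .size = start_size };
	size_t final_size;

	*show_calls = 0;
	if (start_size == 0)
		return 0;

	for (;;) {
		m.buf = malloc(m.size);
		if (!m.buf)
			return 0;
		m.count = 0;
		toy_guarded_show(&m);
		if (m.count != m.size)
			break;
		free(m.buf);
		m.size <<= 1;
	}

	final_size = m.size;
	*show_calls = m.show_calls;
	free(m.buf);
	return final_size;
}
```

Starting from a 1024-byte buffer, the guard reports an overflow twice,
and the callback runs exactly once, with a 4096-byte buffer.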

>
>> Drivers that expose binary attributes larger than PAGE_SIZE — such as
>> the AMD HSMP metric table (~13 KB) — cannot deliver the full content
>> in a single read() call through the existing path.
>>
>> Introduce a new opt-in flag SYSFS_HUGE_BIN_FILE (040000)
>
>> that drivers can OR into their bin_attribute mode.
>
> Simplify to:
>
> for bin_attribute mode.
>
> ?

Ack!

>
>> When set, sysfs selects a new
>> kernfs_ops (sysfs_bin_kfops_huge_file_ro) whose .seq_show callback
>> pipes the bin_attribute ->read() result through seq_file, allowing
>> reads of arbitrary size in one shot. Existing binary attributes
>> without the flag continue using the legacy capped path.
>
> I suggest you avoid using "legacy" as a term for anything that is in use
> in any way or still exists. I've seen people to jump on that particular
> word enough times, it can sidetrack discussions.

Sorry about that! We'll just refer to it as the default / current
behavior henceforth.

Thanks a ton for taking a look at the series! Much appreciated.

--
Thanks and Regards,
Prateek