Re: [PATCH v2] erofs: fix incorrect symlink detection in fast symlink

From: Gao Xiang
Date: Tue Sep 10 2024 - 22:27:55 EST




On 2024/9/11 04:51, Colin Walters wrote:


On Mon, Sep 9, 2024, at 10:18 PM, Gao Xiang wrote:

I know you ask for an explicit check on symlink i_size, but
I've explained the current kernel behavior:
- For symlink i_size < PAGE_SIZE (PAGE_SIZE is always >= 4096 on
Linux), the EROFS Linux implementation behaves normally;

- For symlink i_size >= PAGE_SIZE, the EROFS Linux
implementation will write '\0' at PAGE_SIZE - 1 in
page_get_link() -> nd_terminate_link(), so the behavior is also
deterministic and not harmful to system stability or security;

Got it, OK.

In other words, currently i_size >= PAGE_SIZE is undefined behavior,
but Linux just truncates the link path.
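
To make the truncation concrete, here is a minimal user-space sketch of the page_get_link() -> nd_terminate_link() step described above. It is not the kernel code: the sketch_* names and SKETCH_PAGE_SIZE are hypothetical, and it assumes a 4096-byte page whose contents were filled from the image.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: a 4096-byte page; the real PAGE_SIZE depends on the arch. */
#define SKETCH_PAGE_SIZE 4096u

/* Write the terminating NUL at min(len, maxlen), so the write never
 * lands past the end of the page buffer. */
static void sketch_terminate_link(char *buf, size_t len, size_t maxlen)
{
    assert(buf != NULL);
    buf[len < maxlen ? len : maxlen] = '\0';
}

/* Return the link length a reader would observe for a given on-disk
 * i_size, after termination at no later than SKETCH_PAGE_SIZE - 1. */
static size_t sketch_link_len(char *page, size_t i_size)
{
    sketch_terminate_link(page, i_size, SKETCH_PAGE_SIZE - 1);
    return strlen(page);
}
```

With a page full of non-NUL bytes, an i_size of 5000 yields an observed length of 4095: the path is silently truncated, but nothing is written outside the page.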

I think where we had a miscommunication is that when I saw "undefined behavior" I thought you were using the formal term: https://en.wikipedia.org/wiki/Undefined_behavior

The term for what you're talking about in my experience is usually "unspecified behavior" or "implementation defined behavior" which (assuming a reasonable implementor) would include silent truncation or an explicit error, but *not* walking off the end of a buffer and writing to arbitrary other kernel memory etc.

Yeah, agreed. "Implementation-defined behavior" sounds like a better term.

Sorry about my limited English vocabulary; where I live, technical
terms are mostly used in their Chinese translations.


(Hmm really given the widespread use of nd_terminate_link I guess this is kind of more of a "Linux convention" than just an EROFS one, with XFS as a notable exception?)

I'm not sure whether other kernel filesystems have their own internal
issues (so that they need to check i_size > PAGE_SIZE in advance to
cover for their own format design), but I think (and have tested with
crafted images) that EROFS, relying purely on the Linux VFS
nd_terminate_link() implementation (in place since the 2.6.x era)
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ebd09abbd9699f328165aee50a070403fbf55a37

is already safe for i_size > PAGE_SIZE, since the EROFS symlink on-disk
format is just like its regular inode format.

As for XFS, I think it's a historical on-disk constraint (a hard
1024-byte limit) that they have had to follow ever since; see the
related commit message:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6eb0b8df9f74f33d1a69100117630a7a87a9cc96


For this case, to be clear, I'm totally fine with the limitation,
but I need to decide whether to define "EROFS_SYMLINK_MAXLEN"
as 4095, or as 4096 while also accepting
`link[4095] == '\0'`.
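
A rough sketch of the two options, under one reading of the second: an i_size of exactly 4096 is accepted only when the final byte is a NUL. The sketch_* names and SKETCH_PATH_MAX are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: the limit under discussion is 4096 bytes, NUL included. */
#define SKETCH_PATH_MAX 4096u

/* Option A: the maximum is 4095, so i_size never covers a trailing NUL. */
static bool sketch_symlink_ok_strict(size_t i_size)
{
    return i_size <= SKETCH_PATH_MAX - 1;
}

/* Option B: the maximum is 4096, but a full-length link is accepted
 * only if its last byte is the terminating NUL. */
static bool sketch_symlink_ok_relaxed(const char *link, size_t i_size)
{
    assert(link != NULL);
    if (i_size < SKETCH_PATH_MAX)
        return true;
    return i_size == SKETCH_PATH_MAX && link[SKETCH_PATH_MAX - 1] == '\0';
}
```

Both variants reject anything longer than 4096 bytes; they differ only in whether an image that stores the NUL inside i_size is considered valid.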

Mmmm...I think PATH_MAX is conventionally taken to include the NUL; yeah see
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/namei.c?id=b40c8e7a033ff2cafd33adbe50e2a516f88fa223#n123

Agreed, but honestly I'm somewhat concerned that some OS, tar, or
another popular archive format may support larger symlinks which EROFS
would have no way to store because of an on-disk limitation.

If you don't have a strong opinion on this, I'd like to hold off
on this decision for now to ensure compatibility.

Thanks,
Gao Xiang