Re: Suggested new user link command

From: Eric W. Biederman
Date: Sat May 05 2018 - 01:10:36 EST


Tony Wallace <tony@xxxxxxxxxxx> writes:

> On 02/05/18 01:35, Bernd Petrovitsch wrote:
>> Hi all!
>>
>> Top-quoting is evil BTW.
>>
>> On Wed, 2018-05-02 at 00:17 +1200, Tony Wallace wrote:
>>> Two issues here:
>>> 1) Use case (which I have)
>>> 2) Permissions
>>>
>>> 1) Use case
>>>
>>> I am trying to build a backup system. To avoid duplication of files
>>> over multiple backups I take an MD5 checksum of file contents. Files
>>> with the same sum are hardlinked together. Files are linked into a
>>> standard directory structure, with a new link for each backup that the
>>> file is part of. When all backups pointing to a file are deleted the
>>> reference count drops to zero and the file is deleted. We can keep a
>>> database of checksums and their related inode numbers for linking
>>> purposes. So why
>> a) You can store one of the filenames instead of the inode number.
>> b) You can keep an extra directory with a hardlink named as the inode
>> number (and delete the entries there if the link count drops to 1).
>>
>>> not have some reference copy to link against it would take no extra
>>> space. Well it doesn't, but it keeps at least one copy of the file on
>> You have (disk) space problems on a backup system?
>> I don't think so, Tim;-)
>>
>>> disk forever and the reference count never drops to zero. Using one of
>>> the backup copies to link to (as stored as the reference copy in the
>>> database) will not work as it could be deleted at any time.
>>>
>>> I have seen on stack overflow others wanting to do this also.
>> "Do. Or do not. There is no try." - Yoda
>> SCNR .....
>>
>>> 2) Permissions
>>>
>>> To maintain security there are two requirements:
>>> 2.1) The effective user must have rights to the inode, that is they must
>>> either own it or be root
>>> 2.2) The effective user must have file creation rights to the
>>> directory where it is being linked
>> Obviously (and useful). And on a backup system, there is no problem
>> about that (because the backup software probably runs as root anyways
>> because otherwise 2.1) below will limit the deduplication severely).
>>
>> But for a (to be mainlined/accepted) new syscall, one should think
>> about all situations/use cases and not just one.
>>
>> Additionally to the 2 items above, one needs also x-permissions on
>> *all* directories from / to one existing hardlink in the traditional
>> case and such a syscall bypasses that.
>> Think about it: Everyone can write a program to try to link all inodes from
>> 0 to ~0 to a directory entry and gets all files with restrictions 2.1)
>> and 2.2) from below.
>> ATM it is enough to `chmod o= ~` to keep all others from all files in
>> my $HOME. Afterwards it's no longer that easy.
>>
>>> If you say no, that is fine, but I do think this idea has merit and can
>>> be done without compromising the system.
>> I'm no one to say no (or yes;-) here to anything;-) I'm just thinking
>> about the implications.
>>
>> And you can always implement a patch and if it's ignored/not accepted,
>> you can use it locally anyways - no one can stop that:-)
>>
>> One more - more constructive - thing: Perhaps it is more
>> acceptable/useful if there is a mount option which must be activated on
>> the backup filesystems and that is not activated anywhere else.
>>
>> MfG,
>> Bernd
>
> I want to thank everyone for their time. I have taken note of your
> comments. I believe that there is the need for a companion command
> istat that obtains the stat data from an inode. Istat may be useful in
> constructing ilink. For my proposed use case complexity is minimised,
> and effectiveness is maximised by making both istat and ilink root only
> system calls, and then doing my backup as root. I do not know how a
> mount option would work, and for my own use it is again probably
> unnecessary complexity, but accept it may be necessary if released more
> generally.
>
> I will be dropping the matter now, at least until I have some code to
> show, but if anyone has any more thoughts feel free to drop me an
> email.

Actually the functionality you are looking for has in some sense already
been implemented, and in a way that does not assume a strictly POSIX
filesystem.

The system calls are:
name_to_handle_at
open_by_handle_at
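
A minimal sketch of how the two calls fit together, assuming glibc on
Linux. The helper names get_handle() and open_handle() are made up for
illustration and are not part of any API. Note that open_by_handle_at
needs CAP_DAC_READ_SEARCH, so in practice the second half only works
for root, and that some filesystems do not support file handles at all.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Hypothetical helper: ask the kernel for an opaque handle that
 * identifies the file at `path`. Returns a malloc'd handle (caller
 * frees it), or NULL with errno set. *mount_id receives the ID of the
 * mount the file lives on.
 */
struct file_handle *get_handle(const char *path, int *mount_id)
{
    assert(path != NULL && mount_id != NULL);

    /* Reserve the largest handle size glibc knows about. */
    struct file_handle *fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
    if (fh == NULL)
        return NULL;
    fh->handle_bytes = MAX_HANDLE_SZ;

    if (name_to_handle_at(AT_FDCWD, path, fh, mount_id, 0) == -1) {
        int saved = errno;   /* free() may clobber errno */
        free(fh);
        errno = saved;
        return NULL;
    }
    return fh;
}

/*
 * Hypothetical helper: reopen the file a handle refers to, without
 * going through any path. `mount_fd` must be an open descriptor for
 * something on the same filesystem (for example its mount point).
 * Returns a read-only fd, or -1 with errno set (ESTALE if the file
 * no longer exists, EPERM without CAP_DAC_READ_SEARCH).
 */
int open_handle(int mount_fd, struct file_handle *fh)
{
    return open_by_handle_at(mount_fd, fh, O_RDONLY);
}
```

For the backup case, the handle bytes (handle_type plus the
handle_bytes bytes of f_handle) could be stored in the checksum
database in place of a bare inode number. A later open_handle() then
either returns a descriptor for the same file or fails with ESTALE
once every link to it has been removed.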

Good luck,
Eric