Re: Suggested new user link command

From: Tony Wallace
Date: Tue May 01 2018 - 14:42:13 EST


On 02/05/18 01:35, Bernd Petrovitsch wrote:
> Hi all!
>
> Top-quoting is evil BTW.
>
> On Wed, 2018-05-02 at 00:17 +1200, Tony Wallace wrote:
>> Two issues here:
>> 1) Use case (which I have)
>> 2) Permissions
>>
>> 1) Use case
>>
>> I am trying to build a backup system. To avoid duplicating files
>> across multiple backups I take an MD5 checksum of each file's contents.
>> Files with the same sum are hardlinked together. Files are linked into a
>> standard directory structure, with a new link for each backup that the
>> file is part of. When all backups pointing to a file are deleted the
>> reference count drops to zero and the file is deleted. We can keep a
>> database of checksums and their related inode numbers for linking
>> purposes. So why
> a) You can store one of the filenames instead of the inode number.
> b) You can keep an extra directory with a hardlink named as the inode
> number (and delete the entries there if the link count drops to 1).
>
>> not have some reference copy to link against? It would take no extra
>> space. Well it doesn't, but it keeps at least one copy of the file on
> You have (disk) space problems on a backup system?
> I don't think so, Tim;-)
>
>> disk forever and the reference count never drops to zero. Using one of
>> the backup copies to link to (as stored as the reference copy in the
>> database) will not work as it could be deleted at any time.
>>
>> I have seen on stack overflow others wanting to do this also.
> "Do. Or do not. There is no try." - Yoda
> SCNR .....
>
>> 2) Permissions
>>
>> To maintain security there are two requirements:
>> 2.1) The effective user must have rights to the inode, that is they must
>> either own it or be root
>> 2.2) The effective user must have file creation rights in the
>> directory where it is being linked
> Obviously (and useful). And on a backup system, there is no problem
> about that (because the backup software probably runs as root anyways
> because otherwise 2.1) below will limit the deduplication severely).
>
> But for a (to be mainlined/accepted) new syscall, one should think
> about all situations/use cases and not just one.
>
> In addition to the 2 items above, one also needs x-permissions on
> *all* directories from / to one existing hardlink in the traditional
> case, and such a syscall bypasses that.
> Think about it: Everyone can write a program to try to link all inodes
> from 0 to ~0 to a directory entry and gets all files with restrictions
> 2.1) and 2.2) from below.
> ATM it is enough to `chmod o= ~` to keep all others from all files in
> my $HOME. Afterwards it's no longer that easy.
>
>> If you say no, that is fine, but I do think this idea has merit and can
>> be done without compromising the system.
> I'm no one to say no (or yes;-) here to anything;-) I'm just thinking
> about the implications.
>
> And you can always implement a patch and if it's ignored/not accepted,
> you can use it locally anyways - no one can stop that:-)
>
> One more - more constructive - thing: Perhaps it is more
> acceptable/useful if there is a mount option which must be activated on
> the backup filesystems and that is not activated anywhere else.
>
> MfG,
> Bernd

I want to thank everyone for their time. I have taken note of your
comments. I believe that there is a need for a companion command,
istat, that obtains the stat data from an inode. istat may be useful in
constructing ilink. For my proposed use case, complexity is minimised
and effectiveness is maximised by making both istat and ilink root-only
system calls, and then doing my backup as root. I do not know how a
mount option would work, and for my own use it is again probably
unnecessary complexity, but I accept it may be necessary if released
more generally.
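
For reference, the userspace alternative from suggestion b) above can be
approximated today without any new system call. The following is only a
minimal sketch under my own assumptions: the pool directory holds one
hardlink per distinct file content, named by its MD5 sum rather than by
inode number, and the names file_md5, add_to_backup and prune_pool are
made up for illustration. The first copy of each content is copied into
the pool because the source is assumed to be on another filesystem.

```python
import hashlib
import os
import shutil


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file's contents, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def add_to_backup(src, pool_dir, dest):
    """Hardlink dest to the pooled copy of src's contents.

    pool_dir and dest must be on the same filesystem (hardlink limit).
    Returns the path of the pool entry that dest now shares an inode with.
    """
    os.makedirs(pool_dir, exist_ok=True)
    pooled = os.path.join(pool_dir, file_md5(src))
    if not os.path.exists(pooled):
        # First time this content is seen: store one real copy in the pool.
        shutil.copy2(src, pooled)
    parent = os.path.dirname(dest)
    if parent:
        os.makedirs(parent, exist_ok=True)
    os.link(pooled, dest)
    return pooled


def prune_pool(pool_dir):
    """Delete pool entries that no backup links to any more.

    A link count of 1 means only the pool's own name is left.
    Returns the sorted list of removed entry names.
    """
    removed = []
    for name in sorted(os.listdir(pool_dir)):
        path = os.path.join(pool_dir, name)
        if os.stat(path).st_nlink == 1:
            os.unlink(path)
            removed.append(name)
    return removed
```

Running prune_pool after deleting a backup removes pool entries that
nothing else refers to, so the reference copy does not stay on disk
forever. The cost compared with ilink is the extra pool directory and the
pruning pass.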

I will be dropping the matter now, at least until I have some code to
show, but if anyone has any more thoughts feel free to drop me an email.

MfG

Tony