[PATCH RFC v2 0/6] ext4: yet another project quota

From: Konstantin Khlebnikov
Date: Tue Mar 10 2015 - 13:30:30 EST

Project quota makes it possible to enforce disk quotas for several subtrees,
or even individual files, on a filesystem. Each inode is marked with a project
id (independent of uid and gid) and is accounted to the corresponding project
quota. New files inherit the project id of the directory in which they are
created.

This is a must-have feature for deploying lightweight containers.
Project quota can also report the size of a subtree without a recursive 'du'.

This patchset adds project id and project quota support to ext4.

This time I've also prepared patches for e2fsprogs and quota-tools.

All patches are available on GitHub:
https://github.com/koct9i/linux --branch project
https://github.com/koct9i/e2fsprogs --branch project
https://github.com/koct9i/quota-tools --branch project


Proposed behavior is similar to project quota in XFS:

* all inodes are marked with a project id
* new files inherit the project id from the parent directory
* project quota accounts inodes and enforces limits
* cross-project link and rename operations are restricted


There is no flag similar to XFS_XFLAG_PROJINHERIT (which allows disabling
project id inheritance). Instead, the project that userspace sees as '0'
(in a nested user namespace it may actually be a non-zero project) acts as
the default project, for which link/rename restrictions are ignored.
(See also the "Why new ioctls?" part below.)

This implementation adds a shortcut for moving files from one project into
another: non-directory inodes with a link count of 1 (i_nlink == 1) are
transferred without copying (XFS returns -EXDEV in this case and userspace
has to copy the file).
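
The rename rules described so far can be summarised as a small decision
function. This is only a userspace sketch, not the code from the patches:
the names classify_rename and enum move_action are made up for illustration,
and it assumes that the "default project" exemption applies when the
destination directory belongs to project 0, which the text above does not
spell out.

```c
#include <stdbool.h>

/* Hypothetical outcome of a rename into a directory of another project. */
enum move_action {
    MOVE_PLAIN,     /* no project restriction applies */
    MOVE_TRANSFER,  /* shortcut: re-account the inode to the new project */
    MOVE_REFUSE     /* cross-project move refused (XFS would return -EXDEV) */
};

static enum move_action classify_rename(unsigned src_project,
                                        unsigned dst_dir_project,
                                        bool is_dir, unsigned nlink)
{
    if (src_project == dst_dir_project)
        return MOVE_PLAIN;
    /* Assumption: project 0 is the default project without restrictions. */
    if (dst_dir_project == 0)
        return MOVE_PLAIN;
    /* Shortcut from this patchset: single-link non-directories move freely. */
    if (!is_dir && nlink == 1)
        return MOVE_TRANSFER;
    return MOVE_REFUSE;
}
```

For example, a regular file with one link moved from project 5 into a
directory of project 7 would be classified as MOVE_TRANSFER, while a
directory or a file with several hard links would be refused.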

In XFS the file owner (or a process with CAP_FOWNER) can set any project id,
but XFS permits changing the project id only from the init user namespace.

This patchset adds the sysctl fs.protected_projects. By default it is 0 and
the project id behaves as an XFS project. Setting it to 1 makes changing the
project id a privileged operation which requires CAP_SYS_RESOURCE in the
current user namespace; changing the project id mapping for a nested user
namespace also requires that capability. Thus there are two levels of control:
the project id mapping in the user namespace defines the set of permitted
projects, and the capability protects operations within this set.
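
As a rough illustration of these two levels of control, the check could be
modelled as below. This is a simplified userspace sketch under my own
assumptions: struct caller and may_set_project are hypothetical names, and
whether file ownership is still checked when fs.protected_projects is 1 is
not stated above, so the sketch leaves it out in that mode.

```c
#include <stdbool.h>

/* Hypothetical summary of what the caller is allowed to do. */
struct caller {
    bool owns_file;         /* caller is the file owner */
    bool cap_fowner;        /* has CAP_FOWNER */
    bool cap_sys_resource;  /* has CAP_SYS_RESOURCE in its user namespace */
};

static bool may_set_project(int protected_projects, const struct caller *c,
                            bool id_is_mapped)
{
    /* Level 1: the user-namespace mapping defines the permitted projects. */
    if (!id_is_mapped)
        return false;
    /* Level 2: with the sysctl set, a capability guards the operation. */
    if (protected_projects)
        return c->cap_sys_resource;
    /* Sysctl at 0: XFS-like rule, owner or CAP_FOWNER may change the id. */
    return c->owns_file || c->cap_fowner;
}
```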

I see no problem with supporting all of this in XFS; the only difference is in
the interface.

Ext4 layout

Project id support introduces the ro-compatible feature 'project'.

The inode project id is stored in place of the obsolete field 'i_faddr' (that
trick was suggested by Darrick J. Wong in previous discussions of project
quota). Linux never used that field, and current fsck checks that it contains
zero.

Quota information is stored in special inode #11 (by default only 10 inodes
are reserved for special use; I've added an option to resize2fs to reserve
more, see the e2fsprogs patches for details). For symmetry with the other
quotas, the inode number is stored in the superblock.

Project quota supports only the modern 'hidden' journaled mode.


The interface for changing limits and querying current usage is the common VFS
quotactl(), with quota type PRJQUOTA = 2. A user can query the current state of
any project mapped into their user namespace; changing limits requires
CAP_SYS_ADMIN in the init user namespace.

Two new ioctls get and change the inode project id:
int ioctl(fd, FS_IOC_GETPROJECT, unsigned *project);
int ioctl(fd, FS_IOC_SETPROJECT, unsigned *project);

They act as the interface to the superblock methods get_project / set_project.
Generic code checks permissions, translates the project id through the
user-namespace mapping, grabs write access to the filesystem, and locks
i_mutex for the set operation. The filesystem method only updates the inode
and transfers project quota.
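
From userspace the two ioctls could be used roughly as follows. This is a
sketch only: set_file_project is a hypothetical helper, and the request codes
FS_IOC_GETPROJECT / FS_IOC_SETPROJECT exist only in headers built from this
patchset, so the code is guarded and simply fails with ENOSYS when they are
missing.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

/* Returns 0 on success, -1 with errno set otherwise. */
static int set_file_project(const char *path, unsigned project)
{
#if defined(FS_IOC_GETPROJECT) && defined(FS_IOC_SETPROJECT)
    unsigned current;
    int ret = -1;
    int saved_errno;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (ioctl(fd, FS_IOC_GETPROJECT, &current) == 0) {
        if (current == project)
            ret = 0;  /* already in the requested project */
        else
            ret = ioctl(fd, FS_IOC_SETPROJECT, &project);
    }
    saved_errno = errno;  /* keep the ioctl error across close() */
    close(fd);
    errno = saved_errno;
    return ret;
#else
    /* Headers without this patchset: the ioctls are not available. */
    (void)path;
    (void)project;
    errno = ENOSYS;
    return -1;
#endif
}
```

Since the set ioctl carries only the project id, the caller does not need to
read and write back any other inode attributes.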

No new mount options are added. Disk usage tracking is enabled at mount.
Limits are enabled later by "quotaon".

(BTW, why doesn't journalled quota enable limits right at mount time?)

Why new ioctls?

XFS has its own interface for that: XFS_IOC_FSGETXATTR / XFS_IOC_FSSETXATTR.
But it has several flaws and is not suitable as a generic interface.

It contains a lot of XFS-specific flags and is racy by design: the set
operation commits all fields at once, so it is used in a get-change-set
sequence without any lock. Concurrent updates from userspace will collide.
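
The collision can be shown with a deterministic toy model. Nothing below is
the XFS interface: struct attrs, attr_get and attr_set are stand-ins for a
"get all fields / set all fields" pair, and the two updaters are interleaved
by hand instead of running in separate processes.

```c
/* Toy stand-in for per-inode attributes set as a whole. */
struct attrs {
    unsigned flags;
    unsigned projid;
};

static struct attrs inode_attrs;  /* plays the role of kernel state */

static void attr_get(struct attrs *out) { *out = inode_attrs; }
static void attr_set(const struct attrs *in) { inode_attrs = *in; }

/* Two updaters each do get-change-set; their steps interleave. */
static struct attrs lost_update_demo(void)
{
    struct attrs a, b;

    inode_attrs.flags = 0;
    inode_attrs.projid = 0;

    attr_get(&a);      /* updater A reads the current state */
    attr_get(&b);      /* updater B reads the same state */

    a.projid = 42;     /* A changes only the project id */
    attr_set(&a);

    b.flags |= 0x1;    /* B changes only a flag ... */
    attr_set(&b);      /* ... and writes back its stale projid of 0 */

    return inode_attrs;
}
```

After this interleaving the flag set by B is present, but the project id
written by A has been overwritten with the old value.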

Also, XFS has the flag XFS_XFLAG_PROJINHERIT, which controls whether new files
inherit the project id from the parent directory. This flag is barely useful
and only complicates everything. Even the tools in xfsprogs don't really use
it: they always set it together with the project id and clear it when setting
the project id back to zero.

And the main reason: this compatibility gives nothing. The only user of the
XFS ioctl that I've found is xfsprogs, and those tools check the filesystem
name and don't work on anything except 'xfs'.


[1] 2014-12-09 ext4: add project quota support by Li Xi

[2] 2014-01-28 A draft for making ext4 support project quota by Zheng Liu

[3] 2012-07-09 introduce extended inode owner identifier v10 by Dmitry Monakhov

[4] 2010-02-08 Introduce subtree quota support by Dmitry Monakhov


Konstantin Khlebnikov (6):
fs: vfs ioctls for managing project id
fs: protected project id
quota: generic project quota
ext4: support project id and project quota
ext4: add shortcut for moving files across projects
ext4: mangle statfs results according to project quota usage and limits

Documentation/filesystems/Locking | 4 +
Documentation/filesystems/vfs.txt | 8 +++
Documentation/sysctl/fs.txt | 16 ++++++
fs/compat_ioctl.c | 2 +
fs/ext4/ext4.h | 15 ++++-
fs/ext4/ialloc.c | 3 +
fs/ext4/inode.c | 15 +++++
fs/ext4/namei.c | 102 ++++++++++++++++++++++++++++++++++++-
fs/ext4/super.c | 61 ++++++++++++++++++++--
fs/ioctl.c | 62 ++++++++++++++++++++++
fs/quota/dquot.c | 96 +++++++++++++++++++++++++++++++++--
fs/quota/quota.c | 8 ++-
fs/quota/quotaio_v2.h | 6 +-
include/linux/fs.h | 3 +
include/linux/quota.h | 1
include/linux/quotaops.h | 16 ++++++
include/uapi/linux/capability.h | 1
include/uapi/linux/fs.h | 3 +
include/uapi/linux/quota.h | 6 +-
kernel/sysctl.c | 9 +++
kernel/user_namespace.c | 4 +
21 files changed, 416 insertions(+), 25 deletions(-)
