Kernel Articles in LWN (Released From Paywall Moments Ago)
Date: 2022-05-18
splice() and the ghost of set_fs() [LWN.net]
The normal rule of kernel development is that the creation of user-space regressions is not allowed; a patch that breaks a previously working application must be either fixed or reverted. There are exceptions, though, including a 5.10 patch that has been turning up regressions ever since. The story that emerges here shows what can happen when the goals of stability, avoiding security problems, and code cleanup run into conflict.
The set_fs() function was added to the kernel early in its history; it was not in the initial 0.01 release, but was added before the 0.10 release in late 1991. Normally, kernel code that is intended to access user-space memory will generate an error if it attempts to access kernel space instead; this restriction prevents, for example, attempts by an attacker to access kernel memory via system calls. A call to set_fs(KERNEL_DS) can be used to lift the restriction when the need arises; a common use case for set_fs() is to be able to perform file I/O from within the kernel. Calling set_fs(USER_DS) puts the restriction back.
The problem with set_fs() is that it turns out to be easy to forget the second set_fs() call to restore the protection of kernel space, leading directly to the "total compromise" scenario that kernel developers will normally take some pains to avoid. Numerous such bugs have been fixed over the years, but it had long been clear that the real solution was to just get rid of set_fs() entirely and adopt safer ways of accessing kernel memory when needed.
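The failure mode described above can be illustrated with a toy user-space model. This is only a sketch, not kernel code: the `Task` class, the two address constants, and the helper functions are hypothetical names chosen for illustration, and the real checks are architecture-specific.

```python
import errno

# Toy model of the address-limit check; every name and constant here is
# a hypothetical stand-in, not the kernel's actual definition.
USER_DS = 0x0000_7FFF_FFFF_FFFF    # highest address treated as user space
KERNEL_DS = 0xFFFF_FFFF_FFFF_FFFF  # no effective limit


class Task:
    def __init__(self):
        # Tasks start out restricted to user-space addresses.
        self.addr_limit = USER_DS

    def set_fs(self, limit):
        self.addr_limit = limit

    def access_ok(self, addr, size):
        # True if the range [addr, addr + size) ends at or below the limit.
        return addr + size <= self.addr_limit + 1


def copy_from_user(task, addr, size):
    # Return 0 on success or -EFAULT if the range fails the limit check.
    if not task.access_ok(addr, size):
        return -errno.EFAULT
    return 0


def kernel_read_buggy(task, kaddr, size):
    task.set_fs(KERNEL_DS)
    result = copy_from_user(task, kaddr, size)
    # Bug: the matching task.set_fs(USER_DS) call is missing.
    return result


def kernel_read_careful(task, kaddr, size):
    old_limit = task.addr_limit
    task.set_fs(KERNEL_DS)
    try:
        return copy_from_user(task, kaddr, size)
    finally:
        # Always restore the previous limit, even on an error path.
        task.set_fs(old_limit)
```

In this model, a task that has gone through `kernel_read_buggy()` keeps passing the check for kernel addresses afterward, which is the "total compromise" case; `kernel_read_careful()` leaves the limit as it found it.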
5.19 Merge window, part 1 [LWN.net]
As of this writing, just under 4,600 non-merge changesets have been pulled into the mainline repository for the 5.19 development cycle. The 5.19 merge window is clearly well underway. The changes pulled so far cover a number of areas, including the core kernel, architecture support, networking, security, and virtualization; read on for highlights from the first part of this merge window.
Adding an in-kernel TLS handshake [LWN.net]
Adding support for an in-kernel TLS handshake was the topic of a combined storage and filesystem session at the 2022 Linux Storage, Filesystem, Memory-management and BPF Summit (LSFMM). Chuck Lever and Hannes Reinecke led the discussion on ways to add that support; they are interested in order to provide TLS for network storage and filesystems. But there are likely other features, such as QUIC support, that could use an in-kernel TLS implementation.
Challenges with fstests and blktests [LWN.net]
The challenges of testing filesystems and the block layer were the topic of a combined storage and filesystem session led by Luis Chamberlain at the 2022 Linux Storage, Filesystem, Memory-management and BPF Summit (LSFMM). His goal is to reduce the amount of time it takes to test new features in those areas, but one of the problems that he has encountered is a lack of determinism in the test results. It is sometimes hard to distinguish problems in the kernel code from problems in the tests themselves.
He began with a request to always use the term "fstests" for the tests that have been known as "xfstests". The old name is confusing, especially for new kernel developers, because the test suite has long been used for testing more than just the XFS filesystem. It is not just new folks, though; even at previous LSFMMs, he has seen people get confused by the "xfs" in the name.
Filesystems, testing, and stable trees [LWN.net]
In a filesystem session at the 2022 Linux Storage, Filesystem, Memory-management and BPF Summit (LSFMM), Amir Goldstein led a discussion about the stable kernel trees. Those trees, and especially the long-term support (LTS) versions, are used as a basis for a variety of Linux-based products, but the kind of testing that is being done on them for filesystems is lacking. Part of the problem is that the tests target filesystem developers so they are not easily used by downstream consumers of the stable kernel trees.
His interest in the problem comes about because he is using the 5.10 LTS kernel and the XFS filesystem. He realized that XFS is not being maintained in that kernel; there are only three XFS patches backported to it in the past two years or more. There is some history behind that, though most in the room already know it, he said.
ID-mapped mounts [LWN.net]
The ID-mapped mounts feature was added to Linux in 5.12, but the general idea behind it goes back a fair bit further. There are a number of different situations where the user and group IDs for files on disk do not match the current human (or process) user of those files, so ID-mapped mounts provide a way to resolve that problem—without changing the files on disk. The developer of the feature, Christian Brauner, led a discussion at the 2022 Linux Storage, Filesystem, Memory-management and BPF Summit (LSFMM) on ID-mapped mounts.
He began with an introduction. There are multiple use cases, but he likes to talk about portable home directories first because they are not related to containers, which many think is the sole reason for ID-mapped mounts. A portable home directory would be on some kind of removable media that can be attached to various systems, some of which have a different user and group ID for the user, but, of course, the media has fixed values for those IDs. ID-mapped mounts allow the device to be mounted on the system with the IDs remapped to those of the user on the local system.
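The remapping in the portable-home-directory case can be sketched as a small range-based translation table. This is a simplified model, not the kernel's implementation: `IdRange`, `map_to_local()`, and `map_to_disk()` are hypothetical names, and the sketch assumes that unmapped on-disk IDs show up as the overflow ID 65534.

```python
from dataclasses import dataclass

# Assumption: IDs with no mapping are reported as the overflow ID.
OVERFLOW_ID = 65534


@dataclass(frozen=True)
class IdRange:
    disk_start: int   # first ID as stored on the media
    local_start: int  # first ID as seen through the mount
    count: int        # number of consecutive IDs covered


def map_to_local(ranges, disk_id):
    # Translate an on-disk ID into the ID seen on the local system.
    for r in ranges:
        if r.disk_start <= disk_id < r.disk_start + r.count:
            return r.local_start + (disk_id - r.disk_start)
    return OVERFLOW_ID


def map_to_disk(ranges, local_id):
    # Translate a local ID into the ID that would be written to the media.
    # Returns None when the local ID has no mapping.
    for r in ranges:
        if r.local_start <= local_id < r.local_start + r.count:
            return r.disk_start + (local_id - r.local_start)
    return None
```

For example, if the media stores files under user ID 1000 and the local account is 1234, the single range `IdRange(1000, 1234, 1)` makes those files appear to belong to 1234, while new files created by 1234 are stored as 1000; the files on disk never change.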
