LWN on Kernel: Linux 5.13, Sticky Groups, BPF
Date: 2021-04-29
-
The misc control group
Control groups (cgroups) are meant to limit access to a shared resource among processes in the system. One such resource is the set of values used to specify an encrypted-memory region for a virtual machine, such as the address-space identifiers (ASIDs) used by the AMD Secure Encrypted Virtualization (SEV) feature. Vipin Sharma set out to add a control group for these ASIDs back in September; based on the feedback, though, he expanded the idea into a controller to track and limit any countable resource. The patch set became the controller for the misc control group and has been merged for Linux 5.13.
The underlying idea is to allow administrators (or cloud orchestration systems) to enforce limits on the number of these IDs that can be consumed by the processes in a control group. In a cloud setting, those processes could correspond to virtual machines being run under KVM. The initial posting for ASIDs was met with a suggestion from Sean Christopherson to expand the reach of the controller to govern more types of encryption IDs beyond just those used by AMD SEV. Intel has an analogous Trust Domain Extensions (TDX) feature that uses key IDs, which are also a resource that may need limiting. The s390 architecture has its secure execution IDs (SEIDs) as well; those are far less scarce than the others, but could still benefit from a controller to limit their consumption.
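As a rough illustration of how an orchestration tool might consume such limits, here is a minimal Python sketch. It assumes the interface described in the kernel's cgroup v2 documentation rather than anything stated in the excerpt above: per-cgroup files named misc.max and misc.current, each holding one "<resource> <value>" pair per line, with the literal "max" meaning "no limit". The resource names res_a and res_b are placeholders, and the sample text is hard-coded instead of being read from /sys/fs/cgroup.

```python
from typing import Dict, Optional


def parse_misc_file(text: str) -> Dict[str, Optional[int]]:
    """Parse '<resource> <value>' lines; the literal 'max' becomes None."""
    values: Dict[str, Optional[int]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, value = line.split()
        values[name] = None if value == "max" else int(value)
    return values


def headroom(limits: Dict[str, Optional[int]],
             usage: Dict[str, Optional[int]]) -> Dict[str, Optional[int]]:
    """Return how many more IDs each resource may consume (None = unlimited)."""
    result: Dict[str, Optional[int]] = {}
    for name, limit in limits.items():
        used = usage.get(name) or 0
        result[name] = None if limit is None else limit - used
    return result


# Hypothetical contents of misc.max and misc.current for one cgroup.
SAMPLE_MAX = "res_a max\nres_b 4\n"
SAMPLE_CURRENT = "res_a 3\nres_b 1\n"

limits = parse_misc_file(SAMPLE_MAX)
usage = parse_misc_file(SAMPLE_CURRENT)
print(headroom(limits, usage))
```

On a real system, the same parser could be pointed at the contents of the files in a cgroup directory, and a limit would be set by writing a "<resource> <value>" line to misc.max; both steps are omitted here because they depend on the local cgroup layout and on privileges.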
-
Exported-symbol changes in 5.13
There have been many disagreements over the years in the kernel community concerning the exporting of internal kernel symbols to loadable modules. Exporting a symbol often exposes implementation decisions to outside code, makes it possible to use (or abuse) kernel functionality in unintended ways, and makes future changes harder. That said, there is no authority overseeing the exporting of symbols and no process for approving exports; discussions only tend to arise when somebody notices a change that they don't like. But it is not particularly hard to detect changes in symbol exports from one kernel version to the next, and doing so can give some insights into the kinds of changes that are happening under the hood.
The kernel has many thousands of functions and data structures; most of those are private to a given source file, while others are made available to the kernel as a whole. Loadable modules are special, though; they only have access to symbols that have been explicitly exported to them with EXPORT_SYMBOL() (or one of a few variants); many symbols that are available to code built into the kernel image are unavailable to loadable modules. The intent of this limitation is to keep the interface to modules relatively narrow and manageable.
It is far from clear that this objective has been achieved, though. The 5.12 kernel exported 31,695 symbols to modules, which does not create an impression of a narrow interface. That number grew to 31,822 in 5.13-rc1. That is an increase of 127 symbols, but the actual story is a bit more complicated than that; 244 exported symbols were removed over this time, while 371 were added. The curious can see the full sets of added and removed symbols on this page.
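The excerpt does not say how these counts were obtained. One plausible approach, sketched below in Python, is to compare the symbol lists from the Module.symvers files produced by two kernel builds. The sketch assumes that each line of that file is tab-separated with the symbol name in the second field; the symbol names in the sample data (foo_init, foo_old, foo_new) are invented for illustration.

```python
from typing import Set, Tuple


def exported_symbols(symvers_text: str) -> Set[str]:
    """Collect symbol names (second tab-separated field) from symvers text."""
    symbols: Set[str] = set()
    for line in symvers_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1]:
            symbols.add(fields[1])
    return symbols


def diff_exports(old: Set[str], new: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (added, removed) symbol sets between two kernel versions."""
    return new - old, old - new


# Hypothetical two-line excerpts standing in for real Module.symvers files.
OLD = ("0x11111111\tfoo_init\tvmlinux\tEXPORT_SYMBOL\t\n"
       "0x22222222\tfoo_old\tvmlinux\tEXPORT_SYMBOL_GPL\t\n")
NEW = ("0x11111111\tfoo_init\tvmlinux\tEXPORT_SYMBOL\t\n"
       "0x33333333\tfoo_new\tvmlinux\tEXPORT_SYMBOL_GPL\t\n")

added, removed = diff_exports(exported_symbols(OLD), exported_symbols(NEW))
print(sorted(added), sorted(removed))

# The figures quoted above are consistent with one another:
# 31,695 symbols in 5.12, minus 244 removed, plus 371 added.
net_change = 371 - 244
total_5_13_rc1 = 31_695 + net_change
```

The net change alone hides the churn: a diff of the two sets, rather than a comparison of their sizes, is what reveals that far more than 127 symbols were touched.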
-
Sticky groups in the shadows
Group membership is normally used to grant access to some resource; examples might include using groups to control access to a shared directory, a printer, or the ability to use tools like sudo. It is possible, though, to use group membership to deny access to a resource instead, and some administrators make use of that feature. But groups only work as a negative credential if the user cannot shed them at will. Occasionally, some way to escape a group has turned up, resulting in vulnerabilities on systems where groups are used to block access. Despite fixes in the past, it turns out that there is still a potential problem with groups and user namespaces; this patch set from Giuseppe Scrivano seeks to mitigate it through the creation of "shadow" groups.
There are two ways to prevent access to a file based on group membership. One of those is to simply set the group owner of the file to the group that is to be denied, then set the permissions to disallow group access. Members of the chosen group will be denied access to the file, even if the world permissions would otherwise allow that access. The alternative is to use access control lists to explicitly deny access to the intended group or groups. Once again, any process in any of the designated groups will not be allowed access.
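The first mechanism works because the classic permission check picks exactly one class of permission bits (owner, then group, then other) and never falls through to a more permissive one. The following Python sketch models that selection for read access only; it is a simplification that ignores root privileges, capabilities, and access control lists, and the numeric IDs are arbitrary examples.

```python
import stat
from typing import Iterable


def may_read(mode: int, file_uid: int, file_gid: int,
             uid: int, gids: Iterable[int]) -> bool:
    """Model the owner/group/other selection for a read request."""
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)   # owner bits decide
    if file_gid in set(gids):
        return bool(mode & stat.S_IRGRP)   # group bits decide, even if stricter
    return bool(mode & stat.S_IROTH)       # everyone else


# A file owned by uid 0 and group 1000, mode 0604:
# readable by the world, but not by members of group 1000.
MODE = 0o604
member = may_read(MODE, 0, 1000, uid=1001, gids=[1001, 1000])
non_member = may_read(MODE, 0, 1000, uid=1001, gids=[1001])
print(member, non_member)
```

The second call shows why such a scheme depends on groups being "sticky": the same user, once rid of group 1000, lands in the "other" class and regains access.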
By way of a refresher, it's worth remembering that Linux has two separate concepts of group membership. The "primary group" or "effective group ID" is the group that will be attached to new files in the absence of other constraints. This was once the only group associated with a process in Unix systems, and is set with setgid(). The "supplementary" groups are a newer addition that allow a process to belong to multiple groups simultaneously; the list of supplementary groups can be changed with setgroups(). Negative access-control decisions are usually (but not necessarily) based on supplementary group membership.
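Both kinds of membership can be inspected from user space. The short Python sketch below, which assumes a Unix-like system, reads the effective group ID and the supplementary group list of the current process through the standard os module. Changing either one (with os.setgid() or os.setgroups()) normally requires privilege, so the sketch only reads them.

```python
import os
from typing import Dict, List, Union


def group_credentials() -> Dict[str, Union[int, List[int]]]:
    """Return the effective GID and supplementary groups of this process."""
    return {
        "effective_gid": os.getegid(),
        "supplementary": sorted(os.getgroups()),
    }


creds = group_credentials()
print(creds)
```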
-
Calling kernel functions from BPF
The kernel's BPF virtual machine allows programs loaded from user space to be safely run in the kernel's context. That functionality would be of limited use, however, without the ability for those programs to interact with the rest of the kernel. The interface between BPF and the kernel has been kept narrow for a number of good reasons, including safety and keeping the kernel in control of the system. The 5.13 kernel, though, contains a feature that could, over time, widen that interface considerably: the ability to directly call kernel functions from BPF programs.
The immediate driver for this functionality is the implementation of TCP congestion-control algorithms in BPF, a capability that was added to the 5.6 kernel release by Martin KaFai Lau. Actual congestion-control implementations in BPF turned out to reimplement a number of functions that already exist in the kernel, which seems less than optimal; it would be better to just use the existing functions in the kernel if possible. The new function-calling mechanism, also implemented by Lau, makes that possible.
-
How Linux made a school pandemic-ready
More than 20 years ago, when Robert Maynord started teaching at Immaculate Heart of Mary School in Monona, Wisconsin, the school had only eight functioning computers, all running Windows 95. Through his expertise in and enthusiasm for Linux and open source software, Robert has transformed the school community, its faculty, and its students, who are in kindergarten to eighth grade. "In those early years, it quickly became apparent that paying license fees to Microsoft for each computer, in addition to purchasing all the software to install, was absurd when the computer itself was only worth $20," says Robert. So he began installing Linux on the school's computers.
Today in Techrights
today's howtos
Awesome Screenshot Tool ksnip 1.9.0 Released with Huge Set of Updates
The cross-platform screenshot and annotation application ksnip 1.9.0 release brings 30+ new features and a handful of bug fixes. We summarize the release for you.