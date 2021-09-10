Kernel: Latest From LWN, Early Patches Bring BPF to Linux Scheduler
More IOPS with BIO caching
Once upon a time, block storage devices were slow, to the point that they often limited the speed of the system as a whole. A great deal of effort went into carefully ordering requests to get the best performance out of the storage device; achieving that goal was well worth expending some CPU time. But then storage devices got much faster and the equation changed. Fancy I/O-scheduling mechanisms have fallen by the wayside and effort is now focused on optimizing code so that the CPU can keep up with its storage. A block-layer change that was merged for the 5.15 kernel shows the kinds of tradeoffs that must be made to get the best performance from current hardware.
Within the block layer, an I/O operation is represented by struct bio; an instance of this structure is usually just called a "BIO". Contained within a BIO are a pointer to the relevant block device, a description of the buffer(s) to be transferred, a pointer to a function to call when the operation completes, and a surprising amount of ancillary information. A BIO must be allocated, managed, and eventually freed for every I/O operation executed by the system. Given that a large, busy system with fast block devices can generate millions of I/O operations per second (IOPS), huge numbers of BIOs will be going through this life cycle in a constant stream.
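The life cycle described above (allocate, use, free, repeated millions of times per second) is the sort of pattern that a recycling cache can make cheaper. What follows is a minimal user-space sketch of that idea, not the kernel's implementation: `struct toy_bio`, `struct toy_bio_cache`, `toy_bio_alloc()`, `toy_bio_put()`, and `TOY_CACHE_MAX` are all hypothetical names invented for illustration, and the real struct bio carries far more state than is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for the kernel's struct bio. */
struct toy_bio {
    int dev_id;                           /* stands in for the block-device pointer */
    void *buf;                            /* buffer to be transferred */
    size_t len;                           /* length of that buffer in bytes */
    void (*end_io)(struct toy_bio *bio);  /* called when the operation completes */
    struct toy_bio *next;                 /* free-list link, used only while cached */
};

#define TOY_CACHE_MAX 64                  /* assumed cap on cached objects */

/* A single-threaded free list; start with all fields zeroed. */
struct toy_bio_cache {
    struct toy_bio *free_list;
    unsigned int nr;
};

/* Take a recycled object from the cache if possible, else fall back to malloc(). */
struct toy_bio *toy_bio_alloc(struct toy_bio_cache *cache)
{
    struct toy_bio *bio = cache->free_list;

    if (bio) {
        cache->free_list = bio->next;
        cache->nr--;
    } else {
        bio = malloc(sizeof(*bio));
        if (!bio)
            return NULL;
    }
    *bio = (struct toy_bio){0};           /* hand out a clean object either way */
    return bio;
}

/* Return an object to the cache; free it outright once the cache is full. */
void toy_bio_put(struct toy_bio_cache *cache, struct toy_bio *bio)
{
    assert(bio != NULL);
    if (cache->nr < TOY_CACHE_MAX) {
        bio->next = cache->free_list;
        cache->free_list = bio;
        cache->nr++;
    } else {
        free(bio);
    }
}
```

In this sketch, a `toy_bio_put()` followed by a `toy_bio_alloc()` on the same cache hands back the same object without touching the general-purpose allocator, which is where any savings would come from. The sketch ignores locking and per-CPU concerns entirely.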
Not-so-anonymous virtual memory areas
Computing terminology can be counterintuitive at times, but even a longtime participant in the industry may have to look twice at the notion of named anonymous memory. That, however, is just the concept that this patch set posted by Suren Baghdasaryan proposes to add. There are, it seems, developers who find the idea useful enough to not only overcome the initial cognitive dissonance that comes with it, but also to resurrect an eight-year-old patch to get it into the kernel.
Memory used by user space is divided into two broad categories: file-backed and anonymous. A file-backed page of memory has a direct correspondence to a page in a file in persistent storage; when the page is clean, its contents are identical to what is found on disk. An anonymous page, instead, is not associated with a file in the filesystem; these pages hold a process's data areas, stacks, and so on. If an anonymous page must be written to persistent storage (to reclaim the page for another user, usually), space must be allocated in the swap area to hold its contents.
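From user space, the difference between the two categories shows up in how a mapping is created. The sketch below uses the standard POSIX `mmap()` call on Linux; the helper names `map_anonymous()` and `map_file()` are hypothetical wrappers introduced here for illustration only.

```c
#define _DEFAULT_SOURCE                   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Anonymous mapping: no file behind it; the pages start out zero-filled. */
void *map_anonymous(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* File-backed mapping: a read-only view of the first len bytes of path. */
void *map_file(const char *path, size_t len)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                            /* the mapping outlives the descriptor */

    return p == MAP_FAILED ? NULL : p;
}
```

Both helpers return NULL on failure, and either kind of mapping is released with `munmap()`. Memory obtained through `map_anonymous()` is the kind that would need swap space if it had to be written out, while clean pages from `map_file()` can simply be dropped and read back from the file.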
Whether a given process's memory use is dominated by file-backed or anonymous pages varies from one workload to the next. In many cases, the bulk of a process's pages will be anonymous; this, it seems, is more likely in workloads with a lot of cloud-computing clients, which tend not to use many local files. Android devices are one place where this sort of behavior can be found. If one is trying to optimize the memory usage of such a workload, anonymous pages can pose a challenge; since the pages are anonymous, with no information about how they were created, it is difficult to know what any given anonymous page is being used for.
5.15 Merge window, part 1
As of this writing, 3,440 non-merge changesets have been pulled into the mainline repository for the 5.15 development cycle. A mere 3,440 patches may seem like a slow start, but those patches are densely populated with significant new features. Read on for a look at what the first part of the 5.15 merge window has brought.
Early Patches Bring BPF To The Linux Scheduler - Phoronix
The latest area where BPF is looking to expand within the Linux kernel is its CFS scheduler.
Roman Gushchin of Facebook has published an initial patch series that adds BPF support to the Linux CFS scheduler as a way for external code to safely alter select kernel decisions.
