LWN's Latest Articles (No Paywall) About Linux Kernel

  • Patching until the COWs come home (part 2)

    Part 1 of this series described the copy-on-write (COW) mechanism used to avoid unnecessary copying of pages in memory, then went into the details of a bug in that mechanism that could result in the disclosure of sensitive data. A patch written by Linus Torvalds and merged for the 5.8 kernel appeared to fix that problem without unfortunate side effects elsewhere in the system. But COW is a complicated beast and surprises are not uncommon; this particular story was nowhere near as close to an end as had been thought.

    Torvalds's expectations quickly turned out to be overly optimistic. In August 2020, a bug was reported by Peter Xu; it affected userfaultfd(), which is a subsystem for handling page faults in a user-space process. This mechanism allows the handling process to (among other things) write-protect ranges of memory and be notified of attempts to write to that range. One use case for this feature is to prevent pages from being modified while the monitoring process writes their contents to secondary storage. That write can, however, result in a read-only get_user_pages() (GUP) call on the write-protected pages, which should be fine. Remember, though, that Torvalds's fix worked by changing read-only get_user_pages() calls to look like calls for write access; this was done to force the breaking of COW references on the pages in question. In the userfaultfd() case, that generates an unexpected write fault in the monitoring process, with the result that this process hangs.

    The initial version of Xu's fix went in the direction of more fine-grained rules for breaking COW by GUP, as had been anticipated in the original fix, and added some userfaultfd()-specific handling. But during the discussion, Torvalds instead proposed a completely different approach, which resulted in another patch set from Xu. These patches essentially revert Torvalds's change and abandon the approach of always breaking COW for GUP calls. Instead, do_wp_page(), which handles write faults to a write-protected page, is modified by commit 09854ba94c6a ("mm: do_wp_page() simplification") to more strictly check if the page is shared by multiple processes.

  • Lockless patterns: some final topics

    So far, this series has covered five common lockless patterns in the Linux kernel; those are probably the five that you will most likely encounter when working on Linux. Throughout this series, some details have been left out and some simplifications were made in the name of clarity. In this final installment, I will sort out some of these loose ends and try to answer what is arguably the most important question of all: when should you use the lockless patterns that have been described here?

    [...] In these cases, applying lockless techniques to the fast path can be valuable.

    For example, you could give each thread a queue of requests from other threads and manage them through single-consumer linked lists. Perhaps you can trigger the processing of requests using the cross-thread notification pattern from the article on full memory barriers. However, these techniques only make sense because the design of the whole system supports them. In other words, in a system that is designed to avoid scalability bottlenecks, common sub-problems tend to arise and can often be solved efficiently using the patterns that were presented here.

    When seeking to improve the scalability of a system with lockless techniques, it is also important to distinguish between lock-free and wait-free algorithms. Lock-free algorithms guarantee that the system as a whole will progress, but do not guarantee that each thread will progress; lock-free algorithms are rarely fair, and if the number of operations per second exceeds a certain threshold, some threads might end up failing so often that the result is a livelock. Wait-free algorithms additionally ensure per-thread progress. Usually this comes with a significant price in terms of complexity, though not always; for example message passing and cross-thread notification are both wait-free.

    Looking at the Linux llist primitives, llist_add() is lock-free; on the consumer side, llist_del_first() is lock-free, while llist_del_all() is wait-free. Therefore, llist may not be a good choice if many producers are expected to contend on calls to llist_add(); and using llist_del_all() is likely better than llist_del_first() unless constant-time consumption is an absolute requirement. For some architectures, the instruction set does not allow read-modify-write operations to be written as wait-free code; if that is the case, llist_del_all() will only be lock-free (but still preferable, because it lets the consumer perform fewer accesses to the shared data structure).

    In any case, the definitive way to check the performance characteristics of your code is to benchmark it. Intuition and knowledge of some well-known patterns can guide you in both the design and the implementation phase, but be ready to be proven wrong by the numbers.
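The difference between the lock-free and wait-free llist operations described above can be sketched in user space with C11 atomics. This is only an illustration, not the kernel's implementation: the names lnode, lstack, lstack_init(), lstack_add() and lstack_del_all() are hypothetical, and the memory orderings are one reasonable choice among several. The add operation retries a compare-and-swap, so an individual thread can lose the race repeatedly (lock-free); the delete-all operation is a single atomic exchange, which completes in a bounded number of steps on architectures that provide one natively (wait-free).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel's llist types. */
struct lnode {
    struct lnode *next;
    int value;
};

struct lstack {
    _Atomic(struct lnode *) first;
};

static void lstack_init(struct lstack *s)
{
    atomic_init(&s->first, NULL);
}

/*
 * Lock-free add: some thread always makes progress, but this thread may
 * have to retry whenever another producer changes the head first.
 */
static void lstack_add(struct lstack *s, struct lnode *n)
{
    assert(n != NULL);
    struct lnode *old = atomic_load_explicit(&s->first, memory_order_relaxed);
    do {
        /* On failure, 'old' is refreshed with the current head. */
        n->next = old;
    } while (!atomic_compare_exchange_weak_explicit(
                 &s->first, &old, n,
                 memory_order_release, memory_order_relaxed));
}

/*
 * Delete-all: one atomic exchange detaches the whole list, with no retry
 * loop in this code. The caller then walks the returned nodes privately.
 */
static struct lnode *lstack_del_all(struct lstack *s)
{
    return atomic_exchange_explicit(&s->first, (struct lnode *)NULL,
                                    memory_order_acquire);
}
```

The consumer receives the nodes in reverse order of insertion, and after the exchange it touches the shared head no further, which matches the point above about performing fewer accesses to the shared data structure.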

  • GDB and io_uring

    A problem reported when attaching GDB to programs that use io_uring has led to a flurry of potential solutions, and one that was merged into Linux 5.12-rc5. The problem stemmed from a change made in the 5.12 merge window to how the threads used by io_uring were created, such that they became associated with the process using io_uring. Those "I/O threads" were treated specially in the kernel, but that led to the problem with GDB (and likely other ptrace()-using programs). The solution is to treat them like other threads because it turned out that trying to make them special caused more problems than it solved.

    Stefan Metzmacher reported the problem to the io-uring mailing list on March 20. He tried to attach GDB to the process of a program using io_uring, but the debugger went "into an endless loop because it can't attach to the io_threads". PF_IO_WORKER threads are used by io_uring for operations that might block; he followed up the bug report with two patch sets that would hide these threads in various ways. The idea behind hiding them is that if GDB cannot see the threads, it will not attempt to attach to them. Prior to 5.12, the threads existed but were not associated with the io_uring-using process, so GDB would not see them.

    It is, of course, less than desirable for developers to be unable to run a debugger on code that uses io_uring, especially since io_uring support in their application is likely to be relatively new, thus it may need more in the way of debugging. The maintainer of the io_uring subsystem, Jens Axboe, quickly stepped in to help Metzmacher solve the problem. Axboe posted a patch set that included a way to hide the PF_IO_WORKER threads, along with some tweaks to the signal handling for these threads; in particular, he removed the ability for them to receive signals at all.
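To make "seeing" a thread concrete: on Linux, the threads belonging to a process are listed as numeric entries under /proc/<pid>/task, and tools can discover them by reading that directory. The sketch below counts those entries for a given process. The function name count_tasks() is hypothetical, and whether GDB uses exactly this interface, or whether a given kernel lists its PF_IO_WORKER threads there, is an assumption rather than something stated in the article.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Count the thread entries listed under /proc/<pid>/task.
 * Returns -1 if the directory cannot be opened (for example, on a
 * system without procfs or for a process that no longer exists).
 */
static int count_tasks(pid_t pid)
{
    assert(pid > 0);

    char path[64];
    snprintf(path, sizeof(path), "/proc/%ld/task", (long)pid);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        /* Thread IDs are numeric; this skips "." and "..". */
        if (isdigit((unsigned char)ent->d_name[0]))
            count++;
    }
    closedir(dir);
    return count;
}
```

Calling count_tasks(getpid()) from a single-threaded program should report at least the main thread; any thread that the kernel associates with the process and exposes in this directory would be counted as well.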

More in Tux Machines

today's howtos

  • How to use the W3M text-based web browser on Linux

    Do you need a text-based web browser on Linux to use in your terminal? Don’t like using Lynx, as it seems dated and sluggish? Hoping for something better? Check out W3M. It’s a modern text-based terminal web browser for Linux that has much more to offer.

  • How to Install or Enable Cockpit on AlmaLinux 8 - Linux Shout

    Cockpit on AlmaLinux is a server management platform that allows administrators to easily manage and control their GUI or CLI Linux server systems remotely using a browser. Among other things, admins can take a look at the systemd journal, check the load, or start and stop services. It has a responsive design, so we can also use it conveniently on tablets and smartphones. We can monitor our remote server's performance using just a browser, without actually having physical access to it. Furthermore, we can also access the command shell with root access to issue commands and install various packages on the server remotely. Since AlmaLinux 8 is based on RHEL, just like CentOS 8, Cockpit is already installed on your system out of the box. We just need to enable it.

  • How to Export and Delete Saved Passwords in Firefox - Make Tech Easier

    Firefox comes with a built-in password manager, also known as Lockwise. The Lockwise password manager is safeguarded with your Firefox account and allows you to access your passwords on the desktop and mobile. If you have been using Lockwise but now want to migrate to another password manager app, here we show how you can export and delete your saved passwords in Firefox.

  • How to Install Docker on Ubuntu Linux

    Docker has taken the software engineering industry by storm, and it has not only revolutionized the way we ship and deploy software but has also changed how engineers set up software development environments on their computers. This guide shows you how to get started with Docker by installing it on Ubuntu Linux 20.04 (Focal Fossa), the latest Long Term Support (LTS) version of Ubuntu at the time of this writing.

EndeavourOS: Our April release is available

We are proud to announce our second release of 2021, and this one is a bit more than a refresh ISO release. So before you hit the download button and go play with it, just sit back and let us inform you first, because we are really excited about this release. [...] The other new feature on the knowledge base is video tutorials; like the wiki articles, this category will expand over time. At the moment it contains general Linux and Arch-specific tutorials from the YouTube channels Chris Titus Tech and EF Linux. Very soon, videos from DistroTube, Eric Adams and TechHut will also be added to enhance the experience.

Zorin OS 16 Beta Released with Remarkable Changes. Download and Test Now.

The Zorin OS team announced the release of the Zorin OS 16 Beta which is immediately available for download and testing. With this pre-release, Zorin OS promises some massive changes. Let's take a look.

Android Leftovers