Language Selection

English French German Italian Portuguese Spanish

Coverage From 2018 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM)

Filed under
  • Is it time to remove ZONE_DMA?

    The DMA zone (ZONE_DMA) is a memory-management holdover from the distant past. Once upon a time, many devices (those on the ISA bus in particular) could only use 24 bits for DMA addresses, and were thus limited to the bottom 16MB of memory. Such devices are hard to find on contemporary computers. Luis Rodriguez scheduled the last memory-management-track session of the 2018 Linux Storage, Filesystem, and Memory-Management Summit to discuss whether the time has come to remove ZONE_DMA altogether.

  • Zone-lock and mmap_sem scalability

    The memory-management subsystem is a central point that handles all of the system's memory, so it is naturally subject to scalability problems as systems grow larger. Two sessions during the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit looked at specific contention points: the zone locks and the mmap_sem semaphore.

  • Hotplugging and poisoning

    Memory hotplugging is one of the least-loved areas of the memory-management subsystem; there are many use cases for it, but nobody has taken ownership of it. A similar situation exists for hardware page poisoning, a somewhat neglected mechanism for dealing with memory errors. At the 2018 Linux Storage, Filesystem, and Memory-Management summit, Michal Hocko and Mike Kravetz dedicated a pair of brief memory-management track sessions to problems that have been encountered in these subsystems, one of which seems more likely to get the attention it needs than the other.

  • Reworking page-table traversal

    A system's page tables are organized into a tree that is as many as five levels deep. In many ways those levels are all similar, but the kernel treats them all as being different, with the result that page-table manipulations include a fair amount of repetitive code. During the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit, Kirill Shutemov proposed reworking how page tables are maintained. The idea was popular, but the implementation is likely to be tricky.

  • get_user_pages() continued

    At a plenary session held relatively early during the 2018 Linux Storage, Filesystem, and Memory-Management Summit, the developers discussed a number of problems with the kernel's get_user_pages() interface. During the waning hours of LSFMM, a tired (but dedicated) set of developers convened again in the memory-management track to continue the discussion and try to push it toward a real solution.

    Jan Kara and Dan Williams scheduled the session to try to settle on a way to deal with the issues associated with get_user_pages() — in particular, the fact that code that has pinned pages in this way can modify those pages in ways that will surprise other users, such as filesystems. During the first session, Jérôme Glisse had suggested using the MMU notifier mechanism as a way to solve these problems. Rather than pin pages with get_user_pages(), kernel code could leave the pages unpinned and respond to notifications when the status of those pages changes. Kara said he had thought about the idea, and it seemed to make some sense.

  • XFS parent pointers

    At the 2018 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), Allison Henderson led a session to discuss an XFS feature she has been working on: parent pointers. These would be pointers stored in extended attributes (xattrs) that would allow various tools to reconstruct the path for a file from its inode. In XFS repair scenarios, that path will help with reconstruction as well as provide users with better information about where the problems lie.

  • Shared memory mappings for devices

    In a short filesystem-only discussion at the 2018 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), Jérôme Glisse wanted to talk about some (more) changes to support GPUs, FPGAs, and RDMA devices. In other talks at LSFMM, he discussed changes to struct page in support of these kinds of devices, but here he was looking to discuss other changes to support mapping a device's memory into multiple processes. It should be noted that I had a hard time following the discussion in this session, so there may be significant gaps in what follows.

  • A new API for mounting filesystems

    The mount() system call suffers from a number of different shortcomings that has led some to consider a different API. At last year's Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), that someone was Miklos Szeredi, who led a session to discuss his ideas for a new filesystem mounting API. Since then, David Howells has been working with Szeredi and VFS maintainer Al Viro on this API; at the 2018 LSFMM, he presented that work.

    He began by noting some of the downsides of the current mounting API. For one thing, you can pass a data page to the mount() call, but it is limited to a single page; if too many options are needed, or simply options with too many long parameters, they won't fit. The error messages and information on what went wrong could be better. There are also filesystems that have a bug where an invalid option will fail the mount() call but leave the superblock in an inconsistent state due to earlier options having been applied. Several in the audience were quick to note that both ext4 and XFS had fixed the latter bug along the way, though there may still be filesystems that have that behavior.

  • Controlling block-I/O latency

    Chris Mason and Josef Bacik led a brief discussion on the block-I/O controller for control groups (cgroups) in the filesystem track at the 2018 Linux Storage, Filesystem, and Memory-Management Summit. Mostly they were just aiming to get feedback on the approach they have taken. They are trying to address the needs of their employer, Facebook, with regard to the latency of I/O operations.

    Mason said that the goal is to strictly control the latency of block I/O operations, but that the filesystems themselves have priority inversions that make that difficult. For Btrfs and XFS, they have patches to tag the I/O requests, which mostly deals with the problem. They have changes for ext4 as well, but those are not quite working yet.

  • A mapping layer for filesystems

    In a plenary session on the second day of the Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), Dave Chinner described his ideas for a virtual block address-space layer. It would allow "space accounting to be shared and managed at various layers in the storage stack". One of the targets for this work is for filesystems on thin-provisioned devices, where the filesystem is larger than the storage devices holding it (and administrators are expected to add storage as needed); in current systems, running out of space causes huge problems for filesystems and users because the filesystem cannot communicate that error in a usable fashion.

    His talk is not about block devices, he said; it is about a layer that provides a managed logical-block address (LBA) space. It will allow user space to make fallocate() calls that truly reserve the space requested. Currently, a filesystem will tell a caller that the space was reserved even though the underlying block device may not actually have that space (or won't when user space goes to use it), as in a thin-provisioned scenario. He also said that he would not be talking about his ideas for a snapshottable subvolume for XFS that was the subject of his talk at 2018.

More in Tux Machines

Plasma 5.12.5 bugfix update for Kubuntu 18.04 LTS – Testing help required

Are you using Kubuntu 18.04, our current LTS release? We currently have the Plasma 5.12.5 LTS bugfix release available in our Updates PPA, but we would like to provide the important fixes and translations in this release to all users via updates in the main Ubuntu archive. This would also mean these updates would be provide by default with the 18.04.1 point release ISO expected in late July. Read more

New Arduino boards include first FPGA model

Arduino launched a “MKR Vidor 4000” board with a SAMA21 MCU and Cyclone 10 FPGA, as well as an “Uno WiFi Rev 2” with an ATmega4809 MCU. Both boards have a crypto chip and ESP32-based WiFi module. In conjunction with this weekend’s Maker Faire Bay Area, Arduino launched two Arduino boards that are due to ship at the end of June. The MKR Vidor 4000 is the first Arduino board equipped with an field programmable . The Intel Cyclone 10 FPGA. will be supported with programming libraries and a new visual editor. The Arduino Uno WiFi Rev 2, meanwhile, revises the Arduino Uno WiFi with a new Microchip ATmega4809 MCU. It also advances to an ESP32-based u-blox NINA-W102 WiFi module, which is also found on the Vidor 4000. Read more

DragonFlyBSD 5.3 Works Towards Performance Improvements

Given that DragonFlyBSD recently landed some SMP performance improvements and other performance optimizations in its kernel for 5.3-DEVELOPMENT but as well finished tidying up its Spectre mitigation, this weekend I spent some time running some benchmarks on DragonFlyBSD 5.2 and 5.3-DEVELOPMENT to see how the performance has shifted for an Intel Xeon system. Read more

Red Hat News: KVM, OpenStack Platform 13 and More