Coverage From 2018 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM)

  • Is it time to remove ZONE_DMA?

    The DMA zone (ZONE_DMA) is a memory-management holdover from the distant past. Once upon a time, many devices (those on the ISA bus in particular) could only use 24 bits for DMA addresses, and were thus limited to the bottom 16MB of memory. Such devices are hard to find on contemporary computers. Luis Rodriguez scheduled the last memory-management-track session of the 2018 Linux Storage, Filesystem, and Memory-Management Summit to discuss whether the time has come to remove ZONE_DMA altogether.

  • Zone-lock and mmap_sem scalability

    The memory-management subsystem is a central point that handles all of the system's memory, so it is naturally subject to scalability problems as systems grow larger. Two sessions during the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit looked at specific contention points: the zone locks and the mmap_sem semaphore.

  • Hotplugging and poisoning

    Memory hotplugging is one of the least-loved areas of the memory-management subsystem; there are many use cases for it, but nobody has taken ownership of it. A similar situation exists for hardware page poisoning, a somewhat neglected mechanism for dealing with memory errors. At the 2018 Linux Storage, Filesystem, and Memory-Management Summit, Michal Hocko and Mike Kravetz dedicated a pair of brief memory-management track sessions to problems that have been encountered in these subsystems, one of which seems more likely to get the attention it needs than the other.

  • Reworking page-table traversal

    A system's page tables are organized into a tree that is as many as five levels deep. In many ways those levels are all similar, but the kernel treats them all as being different, with the result that page-table manipulations include a fair amount of repetitive code. During the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit, Kirill Shutemov proposed reworking how page tables are maintained. The idea was popular, but the implementation is likely to be tricky.

  • get_user_pages() continued

    At a plenary session held relatively early during the 2018 Linux Storage, Filesystem, and Memory-Management Summit, the developers discussed a number of problems with the kernel's get_user_pages() interface. During the waning hours of LSFMM, a tired (but dedicated) set of developers convened again in the memory-management track to continue the discussion and try to push it toward a real solution.

    Jan Kara and Dan Williams scheduled the session to try to settle on a way to deal with the issues associated with get_user_pages() — in particular, the fact that code that has pinned pages in this way can modify those pages in ways that will surprise other users, such as filesystems. During the first session, Jérôme Glisse had suggested using the MMU notifier mechanism as a way to solve these problems. Rather than pin pages with get_user_pages(), kernel code could leave the pages unpinned and respond to notifications when the status of those pages changes. Kara said he had thought about the idea, and it seemed to make some sense.

  • XFS parent pointers

    At the 2018 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), Allison Henderson led a session to discuss an XFS feature she has been working on: parent pointers. These would be pointers stored in extended attributes (xattrs) that would allow various tools to reconstruct the path for a file from its inode. In XFS repair scenarios, that path will help with reconstruction as well as provide users with better information about where the problems lie.

  • Shared memory mappings for devices

    In a short filesystem-only discussion at the 2018 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), Jérôme Glisse wanted to talk about some (more) changes to support GPUs, FPGAs, and RDMA devices. In other talks at LSFMM, he discussed changes to struct page in support of these kinds of devices, but here he was looking to discuss other changes to support mapping a device's memory into multiple processes. It should be noted that I had a hard time following the discussion in this session, so there may be significant gaps in what follows.

  • A new API for mounting filesystems

    The mount() system call suffers from a number of different shortcomings that have led some to consider a different API. At last year's Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), that someone was Miklos Szeredi, who led a session to discuss his ideas for a new filesystem mounting API. Since then, David Howells has been working with Szeredi and VFS maintainer Al Viro on this API; at the 2018 LSFMM, he presented that work.

    He began by noting some of the downsides of the current mounting API. For one thing, you can pass a data page to the mount() call, but it is limited to a single page; if too many options are needed, or simply options with too many long parameters, they won't fit. The error messages and information on what went wrong could be better. There are also filesystems that have a bug where an invalid option will fail the mount() call but leave the superblock in an inconsistent state due to earlier options having been applied. Several in the audience were quick to note that both ext4 and XFS had fixed the latter bug along the way, though there may still be filesystems that have that behavior.

  • Controlling block-I/O latency

    Chris Mason and Josef Bacik led a brief discussion on the block-I/O controller for control groups (cgroups) in the filesystem track at the 2018 Linux Storage, Filesystem, and Memory-Management Summit. Mostly they were just aiming to get feedback on the approach they have taken. They are trying to address the needs of their employer, Facebook, with regard to the latency of I/O operations.

    Mason said that the goal is to strictly control the latency of block I/O operations, but that the filesystems themselves have priority inversions that make that difficult. For Btrfs and XFS, they have patches to tag the I/O requests, which mostly deals with the problem. They have changes for ext4 as well, but those are not quite working yet.

  • A mapping layer for filesystems

    In a plenary session on the second day of the Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), Dave Chinner described his ideas for a virtual block address-space layer. It would allow "space accounting to be shared and managed at various layers in the storage stack". One of the targets for this work is filesystems on thin-provisioned devices, where the filesystem is larger than the storage devices holding it (and administrators are expected to add storage as needed); in current systems, running out of space causes huge problems for filesystems and users because the filesystem cannot communicate that error in a usable fashion.

    His talk is not about block devices, he said; it is about a layer that provides a managed logical-block address (LBA) space. It will allow user space to make fallocate() calls that truly reserve the space requested. Currently, a filesystem will tell a caller that the space was reserved even though the underlying block device may not actually have that space (or won't when user space goes to use it), as in a thin-provisioned scenario. He also said that he would not be talking about his ideas for a snapshottable subvolume for XFS that was the subject of his talk at linux.conf.au 2018.
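
The fallocate() behavior described in the mapping-layer item can be seen from user space with a minimal sketch. It assumes a Linux system and uses Python's os.posix_fallocate(); the helper name reserve_space() and the file name are hypothetical, chosen only for illustration. The point is what the caller can and cannot learn: a successful return means the filesystem accepted the reservation, which, as noted above, does not by itself prove that a thin-provisioned device underneath has the space.

```python
import os
import tempfile


def reserve_space(path: str, nbytes: int) -> int:
    """Ask the filesystem to reserve nbytes for path (hypothetical helper).

    Returns the file size after the call. An OSError (for example ENOSPC)
    is raised if the filesystem itself refuses the reservation.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Reserve the byte range [0, nbytes). Success reflects only the
        # filesystem's own accounting, not the state of any
        # thin-provisioned storage below it.
        os.posix_fallocate(fd, 0, nbytes)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        target = os.path.join(tmpdir, "data.bin")
        print(reserve_space(target, 1 << 20))
```

Nothing in this sketch can detect the thin-provisioning failure case; that gap between what the filesystem reports and what the device can deliver is the problem the proposed layer is meant to close.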
