Date: 2021-09-30

Kernel (LWN Articles) and Graphics Intel AMX support in 5.16 The x86 instruction set is large, but that doesn't mean it can't get bigger yet. Upcoming Intel processors will feature a new set of instructions under the name of "Advanced Matrix Extensions" (AMX) that can be used to operate on matrix data. After a somewhat bumpy development process, support for AMX has found its way into the upcoming 5.16 kernel. Using it will, naturally, require some changes by application developers. AMX (which is described in this document) is meant to be a general architecture for the acceleration of matrix operations on x86 processors. In its initial form, it implements a set of up to eight "tiles", which are arrays of 16 64-byte rows. Programmers can store matrices in these tiles of any dimension that will fit therein; a matrix of 16x16 32-bit floating-point values would work, but other geometries are supported too. The one supported operation currently will multiply the matrices stored in two tiles, then add the result to a third tile. By chaining these operations, multiplication of matrices of any size can be implemented. Evidently other operations are meant to come in the future. While AMX may seem like a feature aimed at numerical analysis, the real target use case would appear to be machine-learning applications. That would explain why 16-bit floating-point arithmetic is supported, but 64-bit is not. The design of AMX gives the kernel control over whether these features can be used by any given process. There are a couple of reasons for this, one being that AMX instructions, as one might imagine, use a lot of processor resources. A process doing heavy AMX work on a shared computer may adversely affect other processes. But AMX also cannot be supported properly unless both the kernel and the user-space process are ready for it.
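The tile geometry and the "multiply two tiles, add into a third" operation described above can be illustrated with a small emulation. This is only a sketch of the arithmetic in plain Python: the names `tile_mac`, `block`, and `tiled_matmul` are hypothetical, and the code does not model the real AMX instructions, their data types, or their memory layout. It shows how chaining tile-sized multiply-accumulate steps yields a product of matrices larger than a single tile.

```python
# Tile geometry as given in the article: 16 rows of 64 bytes each.
TILE_ROWS = 16
TILE_ROW_BYTES = 64
FP32_BYTES = 4
TILE_COLS = TILE_ROW_BYTES // FP32_BYTES   # 32-bit values that fit in one row
TILE_BYTES = TILE_ROWS * TILE_ROW_BYTES    # bytes held by one tile


def tile_mac(acc, a, b):
    """Emulate the one operation: acc += a * b on blocks that fit in a tile."""
    for i in range(len(a)):
        for j in range(len(b[0])):
            acc[i][j] += sum(a[i][k] * b[k][j] for k in range(len(b)))


def block(m, r0, c0, rows, cols):
    """Copy a sub-block of m; slices clamp at the matrix edges."""
    return [row[c0:c0 + cols] for row in m[r0:r0 + rows]]


def tiled_matmul(a, b, t=TILE_ROWS):
    """Multiply a (n x k) by b (k x m) by chaining tile-sized steps."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, t):
        for j0 in range(0, m, t):
            # One accumulator "tile" per output block, reused across k.
            acc = [[0.0] * min(t, m - j0) for _ in range(min(t, n - i0))]
            for k0 in range(0, k, t):
                tile_mac(acc, block(a, i0, k0, t, t), block(b, k0, j0, t, t))
            for di, row in enumerate(acc):
                c[i0 + di][j0:j0 + len(row)] = row
    return c
```

With these constants, one tile holds 1,024 bytes, so a 16x16 matrix of 32-bit values fills it exactly, and eight tiles hold 8,192 bytes in total. Calling `tiled_matmul(a, b)` on matrices whose dimensions are not multiples of 16 still works, because the edge blocks are simply smaller.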

The balance between features and performance in the block layer Back in September, LWN reported on a series of block-layer optimizations that enabled a suitably equipped system to sustain 3.5 million I/O operations per second (IOPS). That optimization work has continued since then, and those 3.5 million IOPS would be a deeply disappointing result now. A recent disagreement over the addition of a new feature has highlighted the potential cost of a heavily optimized block layer, though; when is a feature deemed important enough to outweigh the drive for maximum performance? Block subsystem maintainer Jens Axboe has continued working to make block I/O operations go faster. A recent round of patches tweaked various fast paths, changed the plugging mechanism to use a singly linked list, and made various other little changes. Each is a small optimization, but the work adds up over time; the claimed level of performance is now 8.2 million IOPS, well over September's rate, which looked good at the time. This work has since found its way into the mainline as part of the block pull request for 5.16. So far, so good; few people will argue with massive performance improvements. But they might argue with changes that threaten to interfere, even in a tiny way, with those improvements. Consider, for example, this patch set from Jane Chu. It adds a new flag (RWF_RECOVERY_DATA) to the preadv2() and pwritev2() system calls that can be used by applications trying to recover from nonvolatile-memory "poisoning". Implementations of nonvolatile memory have different ways of detecting and coping with data corruption; Intel memory, it seems, will poison the affected range, meaning that it cannot be accessed without generating an error (which then turns into a SIGBUS signal). An application can respond to that error by reading or writing the poisoned range with the new flag; a read will replace the poisoned data with zeroes (allowing the application to obtain whatever data is still readable), while a write will overwrite that data and attempt to clear the poisoned status. Either way, the application can attempt to recover from the problem and continue running.
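To see why even tiny additions to the fast path draw objections, it helps to turn the IOPS figures quoted above into a time budget per operation. The sketch below is back-of-the-envelope arithmetic only. It assumes, as a simplification, that every operation passes through one serial path, and the 5 ns of added overhead is a purely hypothetical number chosen for illustration, not a measurement of any patch.

```python
NS_PER_SECOND = 1_000_000_000

SEPTEMBER_IOPS = 3_500_000   # rate reported in September, per the article
CURRENT_IOPS = 8_200_000     # rate claimed after the recent patches


def ns_per_io(iops):
    """Average time available per I/O operation, in nanoseconds."""
    return NS_PER_SECOND / iops


def iops_with_overhead(iops, extra_ns):
    """Rate that remains if each operation takes extra_ns longer."""
    return NS_PER_SECOND / (ns_per_io(iops) + extra_ns)


speedup = CURRENT_IOPS / SEPTEMBER_IOPS
budget_ns = ns_per_io(CURRENT_IOPS)

# Hypothetical: a new check adds 5 ns to every operation.
HYPOTHETICAL_EXTRA_NS = 5
reduced = iops_with_overhead(CURRENT_IOPS, HYPOTHETICAL_EXTRA_NS)

print(f"speedup since September: {speedup:.2f}x")
print(f"time budget per I/O: {budget_ns:.0f} ns")
print(f"IOPS with {HYPOTHETICAL_EXTRA_NS} ns extra per I/O: {reduced:,.0f}")
```

Under that simplification, the budget at 8.2 million IOPS is a little over 120 ns per operation, so a few nanoseconds spent testing for a rarely used flag amounts to a visible percentage of the total.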

5.16 Merge window, part 1 As of this writing, Linus Torvalds has pulled exactly 6,800 non-merge changesets into the mainline repository for the 5.16 kernel release. That is probably a little over half of what will arrive during this merge window, so this is a good time to catch up on what has been pulled so far. There are many significant changes and some large-scale restructuring of internal kernel code, but relatively few ground-breaking new features.

NVIDIA 470.62.12 Vulkan Beta Driver For Linux Updates Video Support - Phoronix NVIDIA today released their latest Vulkan beta drivers for Windows and Linux. With the NVIDIA 470.62.12 beta driver released today there is updated Vulkan Video API support based on the upstream spec as of the newly-released Vulkan 1.2.199. There are some subtle changes to the Vulkan Video capabilities for specification compliance. NVIDIA's Vulkan beta driver remains the leading driver for Vulkan Video API support right now, and NVIDIA was quick to support the provisional extensions after their debut earlier this year. Finally, Vulkan Video is at least seeing movement in the Mesa drivers.

Oracle Linux 8.5 and Red Hat Leftovers Release Notes for Oracle Linux 8.5 Oracle® Linux 8: Release Notes for Oracle Linux 8.5 provides information about the new features and known issues in the Oracle Linux 8.5 release. This document may be updated after it is released.

Getting started with Red Hat Insights and OpenSCAP for compliance reporting Sysadmins trying to ride herd over tens, hundreds, or thousands of systems need tools to help keep systems in compliance with policies and security standards. In this post we'll look at using the Red Hat Insights compliance service to manage compliance at scale. Verifying that your organization's systems satisfy your compliance requirements is something that takes time and effort. Too often it's only done on an ad hoc basis. That approach may work for organizations with a limited number of hosts, but performing this task at scale is problematic with complex environments and limited resources. Fortunately, organizations that use Red Hat Enterprise Linux (RHEL) in a standard operating environment (SOE) can take advantage of Red Hat Insights and its Compliance service to proactively and efficiently manage their regulatory compliance requirements at scale. Combining Red Hat Insights with Red Hat Smart Management and the Red Hat Ansible Automation Platform to create an automated process for compliance configuration, validation, and remediation can lessen the administrative burden of your compliance workload.

Red Hat OpenShift extends High Performance Computing (HPC) infrastructure from edge to exascale Massive amounts of data are racing towards us at an unheard-of velocity. But processing this data quickly, at a centralized location, is no longer possible for most organizations. How might we better act on this data to preserve its relevance? The answer lies in acting on the data as close to the source as possible. This means making data-driven decisions or getting answers to the most pressing questions in real time, across all of your computing environments - from the edge to the exascale. If you’re processing massive amounts of data at scale with multiple tasks running simultaneously, you are likely already using high-performance computing (HPC). Oil & gas exploration, complex financial modeling, and DNA mapping and sequencing are just a few modern workstreams that have massive data requirements and rely on HPC to drive breakthrough discoveries. With HPC, running advanced computational problems and simulations in parallel on highly optimized hardware and super-fast networks can help deliver answers and create outcomes more quickly. Because of HPC’s sheer scale, it would be challenging for traditional datacenter infrastructure to deliver similar results. And because its massive scale "just works," HPC has largely gone unchanged over the past 20 years. Today, however, we are seeing HPC undergo a transformation as it faces increased demand from the applications running on it.