Date: 2020-09-25
NVIDIA GeForce vs. AMD Radeon Vulkan Neural Network Performance With NCNN
With Tencent's NCNN tests now added to the Phoronix Test Suite with Vulkan acceleration, here is a look at the real-world impact of using RealSR-NCNN for upscaling with RealSR. Various NVIDIA GeForce and AMD Radeon graphics cards were tested for this initial NCNN / RealSR-NCNN Vulkan comparison.
This is our first time looking at how well Vulkan performs in this area with the current state of the Linux drivers. The GeForce hardware was tested with the latest 450 series proprietary driver, while the Radeon side used Linux 5.9 and Mesa 20.3-devel with the RADV Vulkan driver. One of the Tencent developers working on NCNN has also commented that using RADV's ACO offers a big performance boost; fortunately, ACO is the current default for the RADV Vulkan driver.
Kernel Space: Trenchboot, RAID10, Spelling Mistakes and Initcalls
-
For a while now Oracle engineers and others have been working on Trenchboot as a means of secure launch/boot support when paired with the likes of Intel TXT and AMD SKINIT for trusted execution and configuring each piece of the software boot chain for trusted/secure handling. The latest kernel patches have been sent out for review for secure launching of the kernel.
Earlier this year Oracle engineers sent out Linux kernel patches for Trenchboot while on Thursday the newest work surfaced.
-
Queued today into the block subsystem's "-next" area ahead of the Linux 5.10 cycle kicking off next month are some MD RAID enhancements.
In particular, thanks to Red Hat's Xiao Ni, there is improved RAID10 discard request handling. With a set of five SSDs in a RAID10 array on a test system, the change dropped the mkfs.xfs time for creating an XFS file-system from 4 minutes 39 seconds to less than 1 second... Quite a noticeable difference in that scenario.
-
The Linux 5.9-rc6 kernel source contains over 300,000 literal strings used in kernel messages of various sorts (errors, warnings, etc) and it is no surprise that typos and spelling mistakes slip into these messages from time to time.
To catch spelling mistakes I run a daily automated job that fetches the tip from linux-next and runs a fast spelling checker tool that finds all spelling mistakes and then diffs these against the results from the previous day. The diff is emailed to me and I put my kernel janitor hat on, fix these up and send these to the upstream developers and maintainers.
The spelling checker tool is a fast-and-dirty C parser that finds literal strings and also variable names and checks these against a US English dictionary containing over 100,000 words. As a fun weekend side project I hand optimized the checker to be able to parse and spell check several million lines of kernel C code per second.
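The workflow described above (pull string literals out of C source, check the words against a dictionary, then report only what is new since the previous day) can be sketched roughly as follows. This is a minimal illustration in Python, not the author's tool, which is a hand-optimized C parser. The function names `misspellings` and `new_since_yesterday` and the tiny dictionary in the usage note are hypothetical. The sketch only looks at string literals (not variable names) and makes no attempt to skip comments.

```python
import re

# Matches a C string literal, allowing escaped characters such as \" and \n.
STRING_LITERAL = re.compile(r'"((?:[^"\\\n]|\\.)*)"')
# Escape sequences (\n, \t, ...) and printf-style conversions (%s, %lu, ...).
NOISE = re.compile(r"\\.|%[-+#0-9.]*[hlzjt]*[a-zA-Z]")
WORD = re.compile(r"[A-Za-z]+")


def misspellings(c_source, dictionary):
    """Return the set of lower-cased words in string literals that are
    not present in `dictionary` (a set of lower-case words)."""
    found = set()
    for match in STRING_LITERAL.finditer(c_source):
        # Blank out escapes and format conversions before splitting into words.
        text = NOISE.sub(" ", match.group(1))
        for word in WORD.findall(text):
            # Very short tokens are mostly abbreviations; skip them.
            if len(word) > 2 and word.lower() not in dictionary:
                found.add(word.lower())
    return found


def new_since_yesterday(today, yesterday):
    """Report only the misspellings that were absent from the previous run."""
    return sorted(today - yesterday)
```

For example, with a dictionary containing "failed", "allocate", "memory" and "for", calling `misspellings` on `pr_err("failed to allocate memroy for %s\n", name);` flags only `memroy`, and `new_since_yesterday` keeps it only if the previous day's result set did not already contain it.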
-
In the first part of this blog post series on Linux kernel initcalls, we looked at their purpose, their usage, and ways to debug them (using initcall_debug or FTrace). In this second part, we'll go deeper into the implementation of initcalls, with a look at the colorful __device_initcall() macro, the rootfs initcall, and how modules can be executed.
Graphics: AMD, KWinFT and Zink
-
As a nice Friday afternoon patch series, there are 275k lines of code for wiring up next-generation AMD Van Gogh APU support under Linux.
Earlier this week there were the Mesa patches for AMD Dimgrey Cavefish and Van Gogh while today the kernel-side portion for Van Gogh was sent out for the AMDGPU kernel driver.
-
AMD submitted the 45 Linux kernel patches, which weigh in at 275,000 lines of code, to enable Linux support for the coming APUs. The patches also reveal that Van Gogh comes with Video Core Next 3.0, which supports AV1 decode. In the past, Phoronix has found patches indicating VCN 3.0 (video encode) is native to the Navi 2 graphics engine.
Pairing the Navi 2 / RDNA 2 graphics engine with DDR5/LPDDR5 could unlock quite a bit of graphical horsepower, as integrated graphics engines tend to respond well to increased memory throughput. Van Gogh is also predicted to come with Zen 2 cores, and it will certainly be interesting to see what kind of impact the improved memory throughput has on the Zen 2 architecture.
-
Today new beta versions for all KWinFT projects – namely KWinFT, Wrapland, Disman and KDisplay – were released. With that we are on target for the full release which is aligned with Plasma 5.20 on October 13.
Big changes will unquestionably come to Disman, a previously stifled library for display management, which now learns to stand on its own feet providing universal means for the configuration of displays with different windowing systems and Wayland compositors.
But also for the compositor KWinFT a very specific yet important feature got implemented and a multitude of stability fixes and code refactors were accomplished.
In the following we will do a deep dive into the reasons for and results of these recent efforts.
-
Briefly, zink copies the framebuffer state, and there’s a number of conditions under which a new pipeline object is needed, which all result in ctx->gfx_pipeline_state.hash = 0;. Other than this, there’s a sample count check for sample changes so that the shader can be modified if necessary, and then there’s the setup for creating the Vulkan framebuffer object as well as the renderpass object in get_framebuffer().
Eagle-eyed readers will immediately spot the problem here, which is, aside from the fact that there’s not actually any reason to be setting up the framebuffer or renderpass here, how zink is also flushing the current batch if a renderpass is active.
The change I made here was to remove everything related to Vulkan from here, and move it to zink_begin_render_pass(), which is the function that the driver uses to begin a renderpass for a given batch.
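The idea behind that change, recording state cheaply when it is set and deferring the expensive object creation until a renderpass actually begins, can be modeled with a small sketch. This is an illustrative Python model only, not zink's C code. The `Context` class, its fields, and the `objects_created` counter are hypothetical stand-ins; only the idea of zeroing the pipeline hash and creating the framebuffer at renderpass start comes from the text above.

```python
class Context:
    """Toy model of a driver context with deferred framebuffer creation."""

    def __init__(self):
        self.fb_state = None       # last framebuffer state handed to the driver
        self.pipeline_hash = 0     # 0 means "the pipeline object must be rebuilt"
        self.framebuffer = None    # stand-in for the Vulkan framebuffer object
        self.objects_created = 0   # counts simulated Vulkan object creations

    def set_framebuffer_state(self, state):
        # Only copy the state and invalidate cached objects: nothing
        # Vulkan-related is created here and no batch is flushed.
        self.fb_state = dict(state)
        self.pipeline_hash = 0
        self.framebuffer = None

    def begin_render_pass(self):
        # Create the framebuffer lazily, the first time it is actually needed.
        if self.framebuffer is None:
            self.framebuffer = ("framebuffer", tuple(sorted(self.fb_state.items())))
            self.objects_created += 1
        return self.framebuffer
```

In this model, calling `set_framebuffer_state()` several times in a row before any drawing creates no objects at all; a single object is created on the first `begin_render_pass()` and reused by later calls until the state changes again.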
Mozilla: Firefox for Android Nightly and Surveillance ('Telemetry')
-
As we mentioned recently, we’re adding Recommended extensions to Firefox for Android Nightly as a broader set of APIs become available to accommodate more add-on functionality. We just updated the collection with some new Recommended extensions, including…
Mobile favorites Video Background Play Fix (keeps videos playing in the background even when you switch tabs) and Google Search Fixer (mimics the Google search experience on Chrome) are now in the fold.
Privacy-related extensions FoxyProxy (proxy management tool with advanced URL pattern matching) and Bitwarden (password manager) join popular ad blockers Ghostery and AdGuard.
Dig deeper into web content with Image Search Options (customizable reverse image search tool) and Web Archives (view archived web pages from an array of search engines). And if you end up wasting too much time exploring images and cached pages you can get your productivity back on track with Tomato Clock (timed work intervals) and LeechBlock NG (block time-wasting websites).
-
In this session, you won’t learn about joins or windows or timers or any other advanced features of Beam. Instead, we will focus on the real-world complexity that comes from simply moving data from one system to another safely. How do we model data as it passes from one transform to another? How do we handle errors? How do we test the system? How do we organize the code to make the pipeline configurable for different source and destination systems?
We will explore how each of these questions is addressed in Mozilla’s open source codebase for ingesting telemetry data from Firefox clients. By the end of the session, you’ll be equipped to explore the codebase and documentation on your own to see how these concepts are composed together.
-
On the Glean team we make an effort to move as much of the logic as possible to glean-core, so that we don’t have too much code duplication on the language bindings and guarantee standardized behaviour throughout all platforms.
Since that is the case, it was counterintuitive for me that, when we set out to build a version of Glean for the web, we wouldn’t rely on the same glean-core as all our other language bindings. The hypothesis was: let’s make JavaScript just another language binding, by making our Rust core compile to a target that runs on the browser.
Rust is notorious for making an effort to have a great Rust to Wasm experience, and the Rust and WebAssembly working group has built awesome tools that make boilerplate for such projects much leaner.
-
Mozilla’s history is steeped in openness and transparency – it’s simply core to what we do and how we see ourselves in the world. We are always looking for ways to bring our mission to life in ways that help create a healthy internet and support the Mozilla Manifesto. One of our commitments says “We are committed to an internet that elevates critical thinking, reasoned argument, shared knowledge, and verifiable facts”.
To this end, we have spent a good amount of time considering how we can publicly share our Mozilla telemetry data sets – it is one of the simplest and most effective ways we can enable collaboration and share knowledge. But only if it can be done safely and in a privacy-protecting, principled way. We believe we’ve designed a way to do this and we are excited to outline our approach here.
Making data public not only allows us to be transparent about our data practices, but directly demonstrates how our work contributes to our mission. Having a publicly available methodology for vetting and sharing our data demonstrates our values as a company. It will also enable other research opportunities with trusted scientists, analysts, journalists, and policymakers in a way that furthers our efforts to shape an internet that benefits everyone.
