Supercomputing Articles

  • Exascale meets hyperscale: How high-performance computing is transitioning to cloud-like environments

    Twice a year the high-performance computing (HPC) community anxiously awaits the announcement of the latest edition of the Top500 list, cataloging the most powerful computers on the planet. The excitement of a supercomputer breaking the coveted exascale barrier and moving into the top position typically overshadows the question of which country will hold the record. As it turned out, the top 10 systems on the November 2019 Top500 list are unchanged from the previous revision, with Summit and Sierra still holding the #1 and #2 positions, respectively. Despite the natural uncertainty around the composition of the Top500 list, there is little doubt about the software technologies that are helping to reshape the HPC landscape. Starting at the International Supercomputing Conference earlier this year, one of the technologies leading this charge is containerization, lending further credence to how traditional enterprise technologies are influencing the next generation of supercomputing applications.

    Containers are born out of Linux, the operating system underpinning Top500 systems. Because of that, the adoption of container technologies has gained momentum, and many supercomputing sites already have some portion of their workflows containerized. As more supercomputers are being used to run artificial intelligence (AI) and machine learning (ML) applications to solve complex problems in science, including disciplines like astrophysics, materials science, systems biology, weather modeling and cancer research, the focus of the research is transitioning from using purely computational methods to AI-accelerated approaches. This often requires repackaging applications and restaging the data for easier consumption, where containerized deployments are becoming more and more important.

  • Exploring AMD’s Ambitious ROCm Initiative

    Three years ago, AMD released the innovative ROCm hardware-accelerated, parallel-computing environment [1] [2]. Since then, the company has continued to refine its bold vision for an open source, multiplatform, high-performance computing (HPC) environment. Over the past three years, ROCm developers have contributed many new features and components to the ROCm open software platform.

    ROCm is a universal platform for GPU-accelerated computing. A modular design lets any hardware vendor build drivers that support the ROCm stack [3]. ROCm also integrates multiple programming languages and makes it easy to add support for other languages. ROCm even provides tools for porting vendor-specific CUDA code into a vendor-neutral ROCm format, which makes the massive body of source code written for CUDA available to AMD hardware and other hardware environments.

  • High-Performance Python – GPUs

    When GPUs became available, C code via CUDA, a parallel computing platform and programming model developed by Nvidia for GPUs, was the logical language of choice. Since then, Python has become the tool of choice for machine learning, deep learning, and, to some degree, scientific code in general.

    Not long after the release of CUDA, the Python world created tools for use with GPUs. As with any new technology, a plethora of tools emerged to integrate Python with GPUs. For some time, the tools and libraries were adequate, but soon they started to show their age. The biggest problem was incompatibility.

    If you used a tool to write code for the GPU, no other tools could read or use the data on the GPU. After making computations on the GPU with one tool, the data had to be copied back to the CPU. Then a second tool had to copy the data from the CPU to the GPU before commencing its computations. This data movement between the CPU and the GPU significantly degraded overall performance. However, these tools and libraries allowed people to write functions that worked with Python.

    In this article, I discuss the Python GPU tools that are being actively developed and, more importantly, are likely to interoperate. Some tools don’t require you to know CUDA to write GPU code, whereas others do require CUDA knowledge for writing custom Python kernels.

  • Porting CUDA to HIP

    You’ve invested money and time in writing GPU-optimized software with CUDA, and you’re wondering if your efforts will have a life beyond the narrow, proprietary hardware environment supported by the CUDA language.

    Welcome to the world of HIP, the HPC-ready universal language at the core of AMD’s all-open ROCm platform [1]. You can use HIP to write code once and compile it for either the Nvidia or AMD hardware environment. HIP is the native format for AMD’s ROCm platform, and you can compile it seamlessly using the open source HIP/Clang compiler. Just add CUDA header files, and you can also build the program with CUDA and the NVCC compiler stack (Figure 1).

  • OpenMP – Coding Habits and GPUs

    When first using a new programming tool or programming language, it’s always worth developing good general habits. Everyone who codes with OpenMP directives develops their own habits – some good and some perhaps not so good. As this three-part OpenMP series finishes, I highlight best practices from the previous articles that can lead to good habits.

    Enamored with new things, especially those that drive performance and scalability, I can’t resist throwing a couple more new directives and clauses into the mix. After covering these new directives and clauses, I will briefly discuss OpenMP and GPUs. This pairing is fairly recent, and compilers are still catching up to the newer OpenMP standards, but it is important for you to understand that you can run OpenMP code on targeted offload devices (e.g., GPUs).

  • News and views on the GPU revolution in HPC and Big Data:

    Exploring AMD's Ambitious ROCm Initiative
    Porting CUDA to HIP
    Python with GPUs
    OpenMP – Coding Habits and GPUs

AMD and NVIDIA at SC19

  • AMD Announces Radeon Open Compute ROCm 3.0

    AMD just sent out their press release for SuperComputing 19 week in Denver. It turns out that the latest major iteration of Radeon Open Compute, ROCm 3.0, is being released for SC19.

    AMD's press release mentions ROCm 3.0 being released, though as of writing it has yet to appear in the ROCm repositories on GitHub. Once the actual drop happens, I'll certainly be writing about it and digging deeper into the other changes in full.

  • NVIDIA Releasing Reference Design For Stuffing Their GPUs Into Arm Servers

    NVIDIA CEO Jensen Huang announced from SC19 today in Denver that they are releasing a "reference design" of hardware and software to help in deployments of their graphics processors within Arm-based servers focused on HPC and AI.

    This isn't too surprising considering NVIDIA's past forays into ARM-based servers for HPC/AI, and it was just a few months ago that NVIDIA said they would be supporting CUDA on ARM Linux for HPC servers. NVIDIA has also been supporting their software on ARM-based SoCs for years, given their Tegra platform and Linux for Tegra (L4T).

More in Tux Machines

Screencasts/Audiocasts: Zorin OS 15.1 Run Through, Linux Action News and Open Source Security Podcast

XFS - 2019 Development Retrospective

We frequently hear two complaints lodged against XFS -- memory reclamation runs very slowly because XFS inode reclamation sometimes has to flush dirty inodes to disk; and deletions are slow because we charge all costs of freeing all the file's resources to the process deleting files. Dave Chinner and I have been collaborating this year and last on making those problems go away. Dave has been working on replacing the current inode memory reclaim code with a simpler LRU list and reorganizing the dirty inode flushing code so that inodes aren't handed to memory reclaim until the metadata log has finished flushing the inodes to disk. This should eliminate the complaints that slow IO gets in the way of reclaiming memory in other parts of the system. Meanwhile, I have been working on the deletion side of the equation by adding new states to the inode lifecycle. When a file is deleted, we can tag it as needing to have its resources freed, and move on. A background thread can free all those resources in bulk. Even better, on systems with a lot of IOPs available, these bulk frees can be done on a per-AG basis with multiple threads. Read more Also: Oracle Talks Up Recent Features For XFS + Some File-System Improvements On The Horizon

Review: OpenIndiana 2019.10 Hipster

For me, the conclusion after battling with OpenIndiana for a few weeks is quite simple: the operating system's aim is to "ensure the continued availability of an openly developed distribution based on OpenSolaris" and it clearly achieves that goal. However, it does very little beyond that modest aim, and the lack of documentation makes it difficult to use OpenIndiana for people unfamiliar with OpenSolaris and/or Solaris. My advice for Linux users like me is to take plenty of time to get familiar with the operating system. At times I found using OpenIndiana hugely frustrating but that was largely because things weren't working as I expected. I am fairly confident that I would have solved most of the issues I encountered if I had spent more time with OpenIndiana. Some issues may be show-stoppers, including OpenIndiana's struggle with connecting to wireless networks and the limited number of applications that are available. Many of these issues can be solved though. One of the main struggles I faced was finding documentation. The best place to look for information appears to be Oracle's Solaris documentation. Unfortunately, OpenIndiana's Hipster Handbook is not much use. It is mostly populated with content placeholders and the section on web servers counts exactly two words: "Apache" and "nginx". Even new features, such as the "native and metadata encryption" for ZFS and an option to disable hyper-threading, are not mentioned in the handbook. At times OpenIndiana felt like an operating system that belongs in a museum. The set-up is quite old-school, the theme looks very dated and everything felt sluggish; the system is slow to boot and launching applications always took just a little too long for my liking. OpenIndiana's stand-out features are also nothing new - they are what made OpenSolaris a powerful operating system a decade ago. Yet, in the Linux world there aren't many distros - if any - that have something like the Time Slider. openSUSE comes close but, in my humble opinion, OpenIndiana's Time Slider is more advanced and easier to use than openSUSE's Snapper. I am hoping Linux will catch up when it comes to OpenIndiana's ZFS goodness. Ubuntu is working on integrating ZFS, and I for one hope that in time there will be a Time Slider in file managers such as GNOME Files and Dolphin. Read more

OpenWiFi Open-Source Linux-compatible WiFi Stack Runs on FPGA Hardware

WiFi is omnipresent on most connected hardware, and when it works it’s great, but when there are issues, they often cannot be solved because the firmware is a closed-source binary. Read more