What is an HPC cluster?
According to a NetApp post, an HPC cluster “consists of hundreds or thousands of computing servers that are networked together.”
Each server in an HPC cluster is called a node. “The nodes in each cluster work in parallel with each other, increasing processing speed to deliver high-performance computing,” the post notes.
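A minimal sketch of that parallelism, assuming a cluster with MPI and the Python mpi4py package available (the job layout and numbers here are illustrative, not from NetApp’s post):

    # Each copy of this script runs as one MPI process (a "rank"),
    # typically spread across many of the cluster's nodes.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()  # this process's ID within the job
    size = comm.Get_size()  # total number of parallel processes

    # Each rank sums its own slice of a large range of numbers ...
    partial = sum(range(rank * 1_000_000, (rank + 1) * 1_000_000))

    # ... and the partial results are combined on rank 0, so the job
    # finishes roughly size-times faster than one machine working alone.
    total = comm.reduce(partial, op=MPI.SUM, root=0)
    if rank == 0:
        print(f"{size} processes computed total {total}")

Launched with something like mpirun -n 64 python sum.py, the same script runs 64 times at once, which is the “working in parallel” that NetApp describes.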
Cameron Chehreh, CTO and vice president of pre-sales engineering at Dell EMC Federal, explains to FedTech that these nodes can span everything from CPU and GPU processing power in servers to tools such as NVIDIA and Intel software development kits, frameworks including TensorFlow, MXNet and Caffe, and platforms such as Kubernetes and Pivotal Cloud Foundry.
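For instance, before handing work to one of those frameworks, a researcher might check what accelerators a node actually exposes. This TensorFlow snippet is illustrative, assuming the framework is installed on the node:

    # List the GPU accelerators TensorFlow can see on this node
    # (assumes TensorFlow is installed in the node's environment).
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    print(f"GPUs visible on this node: {len(gpus)}")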
As a guide from Iowa State University points out, a cluster can contain different types of nodes for different types of tasks. These may include a head node or login node, where users log in to the HPC system; specialized data transfer nodes; regular compute nodes; so-called “fat” nodes that have at least one terabyte of memory; graphics processing (GPU) nodes; and more.
“All nodes in the cluster have the same components as a laptop or desktop computer: CPU cores, memory, and disk space,” the guide states. “The difference between a personal computer and a cluster node lies in the quantity, quality, and power of the components.”
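That difference in quantity and power is easy to see in practice. A short script like the one below, which uses the third-party psutil package (an assumption, not something the guide prescribes), reports the very components the guide mentions and would flag a “fat” node’s terabyte-plus of memory:

    # Report a node's CPU cores, memory and disk space.
    # Uses the third-party psutil package for memory and disk figures.
    import os
    import psutil

    cores = os.cpu_count()
    mem_tb = psutil.virtual_memory().total / 1e12
    disk_tb = psutil.disk_usage("/").total / 1e12

    print(f"CPU cores: {cores}")
    print(f"Memory:    {mem_tb:.2f} TB")
    print(f"Disk:      {disk_tb:.2f} TB")

    # By the Iowa State guide's definition, a "fat" node has at least
    # one terabyte of memory.
    if mem_tb >= 1.0:
        print("This looks like a 'fat' node.")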
RELATED: How can high-performance computing advance medical research?
HPC applications in government
In addition to enabling critical research such as the search for COVID-19 treatments, government HPC systems support a wide range of cutting-edge research that could not be achieved with conventional computing power.
The Department of Energy’s National Renewable Energy Laboratory, for example, operates its high-performance computing facility for scientists and engineers “working on solving complex data analysis and computing problems related to energy efficiency and renewable energy technologies,” NREL says.
“The work done on NREL’s HPC systems leads to greater efficiency and reduced costs of these technologies, including wind and solar energy, energy storage and large-scale integration of renewable energy into the power grid,” the lab notes.
HPCs also enable research collaborations between government and the private sector on other types of energy innovation and advanced manufacturing techniques.
In November, Lawrence Livermore National Laboratory (LLNL) announced a partnership with Oak Ridge National Laboratory and Rolls-Royce to use HPC to “study a key modeling component in heat treatment processes for gas turbine parts.” LLNL also announced a partnership with Toyota Motor Engineering & Manufacturing North America to “improve understanding of the relationship between properties in specific solid electrolytes for lithium-ion batteries.”
LLNL also announced in November the launch of a new HPC cluster, called Ruby, built on Intel Xeon Platinum processors. Ruby is used for unclassified programmatic work in support of the National Nuclear Security Administration’s mission to maintain the country’s nuclear weapons stockpile. Ruby is also used, according to LLNL, for research into “asteroid detection, moon formation, high-fidelity fission and other basic science discoveries.”
DIVE DEEPER: How do agencies use cutting-edge computing in the field?
HPC storage for government agencies
As NetApp also points out, the HPC cluster is networked to the system’s data storage to capture the output. Storage is a critical element of an HPC architecture.
“To run at peak performance, each component must keep pace with the others,” NetApp notes. “For example, the storage component must be able to feed and ingest data to and from the compute servers as quickly as it is processed.”
Similarly, HPC network components “must be able to support high-speed data transfer between computing servers and data storage.”
If one component, including storage, can’t keep up with the rest, “the performance of the entire HPC infrastructure suffers,” NetApp notes.
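As a rough illustration of how such a bottleneck is spotted, the sketch below times a large sequential read from shared storage and reports the rate; the file path and chunk size are placeholders, not details from NetApp:

    # Rough storage-throughput probe: time a large sequential read
    # and report megabytes per second.
    import time

    PATH = "/scratch/sample.dat"  # placeholder: a big file on shared storage
    CHUNK = 64 * 1024 * 1024      # read in 64 MB chunks

    start = time.perf_counter()
    total = 0
    with open(PATH, "rb") as f:
        while chunk := f.read(CHUNK):
            total += len(chunk)
    elapsed = time.perf_counter() - start

    # If this rate lags far behind what the compute nodes can consume,
    # storage is the component holding back the whole HPC pipeline.
    print(f"Read {total / 1e6:.0f} MB in {elapsed:.1f} s "
          f"({total / 1e6 / elapsed:.0f} MB/s)")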