
0 Introduction

This document presents a step-by-step guide towards the deployment of a virtual HPC cluster using the community-driven open-source HPC software suite - OpenHPC.

Here, we present a supporting guide to the official OpenHPC 2.x Install Recipe, with an emphasis on the learning experience for newcomers to HPC system deployments.

We will be focusing on a specific implementation of the OpenHPC software stack on a virtual cluster, within a virtual lab environment intended primarily for learning the process of deploying and managing an OpenHPC (Warewulf/Slurm/RockyLinux8) cluster.

Keywords

  • OpenHPC
  • HPC Ecosystems
  • Introduction
  • HPC
  • Virtual Lab
  • Hands-on
  • Virtual Cluster
  • System Administrator

Whereas the official recipe guide assumes access to physical hardware to facilitate the OpenHPC deployment, the virtual lab presented in this guide serves to expand access to the OpenHPC experience to everyone by deploying OpenHPC to virtual machines hosted on your local machine.

Background

This guide was produced by the Advanced Computer Engineering (ACE) Lab at the Centre for High Performance Computing (CHPC), for the HPC Ecosystems Project community.

Acknowledgements

This guide owes its existence to a team of contributors and dedicated reviewers:

  • Lara Timm
  • Bryan Johnston
  • Eugene de Beste
  • Mmabatho Hashatsi

  • Rammolenyane Lethaha (reviewer)
  • Anton Limbo (reviewer)
  • Mopeli Khama (reviewer)
  • John Poole (reviewer)

0.0 JUMPING RIGHT IN

If you are someone who prefers to save hours of planning through months of debugging, you can jump straight into the deployment steps in Chapter 2. We would prefer it if you stayed and read over the preparatory notes first, though.

0.1 Objectives and Outcomes


The purpose of this document is to inform and guide the reader in deploying a basic operational virtual cluster using OpenHPC. The material is presented as a step-by-step walkthrough of the software stack deployment.

While the virtual lab does not strictly focus on teaching, it is hoped that some understanding and knowledge will be gained through completing the hands-on virtual deployment, to serve either as a precursor for the deployment of a physical High Performance Computing (HPC) system, or as a learning environment for further exploration into HPC and cluster environments.

0.2 Target Audience


The original iteration of this virtual lab was conceived to target site-designated HPC System Administrators of the HPC Ecosystems community.

The content, however, is universal and will be relevant to:

  • anyone who wishes to learn or practice deploying an OpenHPC 2.x cluster in a virtual lab environment;
  • experienced HPC System Administrators exploring OpenHPC;
  • newcomers to HPC System Administration;
  • parallel computing educators;
  • HPC-curious people seeking to explore the world of HPC.

0.3 Requirements


To successfully complete this virtual lab, the following is required:

  • Computing resources capable of hosting the virtual lab (see 0.5 Prerequisites).
  • Basic Linux Bash / shell experience
  • Basic understanding of HPC and parallel computing

0.4 Assumptions


The official OpenHPC install recipe is targeted at experienced Linux system administrators for HPC environments.

This virtual lab seeks to complement the OpenHPC install recipe by bridging the gap for those who are new to HPC environments and have basic Linux system administrator experience. If you are able to answer YES to the following questions, then you should experience no difficulty in completing this virtual lab.

Readiness Check

  • Do you know how to navigate the file system using the Linux shell?
  • Have you installed packages via the Linux shell using a package manager such as yum or apt, or by building them from source with a compiler?
  • Do you know how to stop and start services in a Linux shell?
  • Do you understand basic principles of computer networking such as IPv4, PXE, DNS?
  • Do you know what High Performance Computing is and why you are learning to install a system that supports it?

If you answered NO to any of the previous questions, it may be worth revisiting the source of the question and ensuring you are comfortable with the details before going any further.

Tip

This guide is not going away, so take your time and be sure that you are comfortable and ready to continue!

0.5 Prerequisites

0.5.1 Resources


Virtual Lab Deployment

After the initial deployment of your virtual lab, you will have an OpenHPC-ready VirtualBox Virtual Machine (VM) provisioned by Vagrant. This VM will act as your management node and will manage your virtual cluster. Although we do not detail the use of hypervisors other than VirtualBox, an appropriately configured Vagrant definition file should allow you to achieve the same results on another hypervisor of your choice. We provide further detail about the tools used in this virtual lab in Chapter 2.

Virtual Cluster Deployment

To demonstrate the full configuration and features of a cluster, two additional low-performance VMs (compute nodes) will be spawned.

Tuned Performance for Virtual Lab

In a standard physical HPC environment, your compute nodes are not low-performance. The compute nodes in this lab are intentionally low performance to constrain the overall resource consumption on the computer that you are using to complete the virtual lab.

In order to support this three-node virtual cluster, we recommend that your host machine (the one you are performing this lab on) should have at least:

  • 10GB available storage (20GB+ is preferred)
  • 4GB available RAM (8GB+ is preferred)
    • The smshost VM uses 1GB RAM (1024MB).
    • The compute host VMs each use 3GB RAM (3072MB).

In exceptional situations, if your local computer (the host machine) is unable to manage the three-node workload, it may be necessary to reduce the virtual cluster from three nodes to two nodes: the smshost and one compute node. This will still afford you most of the intended learning experience, but please note that it is not a full experience of deploying or using an HPC environment.
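The RAM figures above can be tallied for both cluster sizes. The following is a rough sketch only: it assumes each VM holds its full allocation and ignores whatever the host operating system itself needs. The function name `cluster_ram_mb` is ours, purely for illustration.

```shell
#!/usr/bin/env bash
# RAM allocations quoted in the list above, in MB.
SMSHOST_MB=1024
COMPUTE_MB=3072

# Total guest RAM for a cluster with the given number of compute nodes.
# (cluster_ram_mb is a hypothetical helper, not part of the lab files.)
cluster_ram_mb() {
    local compute_nodes=$1
    echo $(( SMSHOST_MB + compute_nodes * COMPUTE_MB ))
}

echo "Three-node cluster (smshost + 2 compute): $(cluster_ram_mb 2) MB"
echo "Two-node cluster   (smshost + 1 compute): $(cluster_ram_mb 1) MB"
```

On these figures, the two-node fallback comes to 4096MB, which lines up with the 4GB minimum, while the full three-node cluster comes to 7168MB, which is presumably why 8GB+ is preferred.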

Your Mileage May Vary

This virtual lab has been thoroughly tested with VirtualBox and Vagrant following the steps outlined in the sections that follow. If you follow the same software stack implementation, you should expect to enjoy similar results.

The following files and packages will be required to get you started. We will provide the specific version files to download in Chapter 2:

Resource | Link
Oracle VirtualBox (~100MB) | https://www.virtualbox.org/
HashiCorp Vagrant (~250MB) | https://www.vagrantup.com/
Ecosystems OpenHPC2.x GitLab folder | Detailed in Section 2.3
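Once the tools are installed, a quick check along the following lines can confirm that they are reachable from your shell. This is a minimal sketch, assuming a Unix-like shell on the host: `VBoxManage` and `vagrant` are the command-line tools that ship with VirtualBox and Vagrant, while `git` is our assumption for cloning the lab files (the guide itself only says they are cloned from GitLab). The helper name `check_tool` is hypothetical.

```shell
#!/usr/bin/env bash
# Report whether a command-line tool can be found on the PATH.
# (check_tool is a hypothetical helper, not part of the lab files.)
check_tool() {
    local tool=$1
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
        return 1
    fi
}

# VBoxManage ships with VirtualBox, vagrant with Vagrant.
# git is an assumption: any means of cloning the lab files will do.
missing=0
for tool in VBoxManage vagrant git; do
    check_tool "$tool" || missing=$(( missing + 1 ))
done
echo "$missing tool(s) missing"
```

A tool reported as missing is not necessarily absent; on some hosts the installer does not add it to the PATH.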

0.5.2 Workload and Time


Once you have downloaded the initial resources, installing the various starting components should take approximately 10-15 minutes. This process involves VirtualBox installation, Vagrant installation and cloning the required lab files from the Ecosystems GitLab. This, however, is merely the beginning.

The actual time it will take to complete this virtual lab and deploy a successful virtual cluster will depend on a number of factors. These include:

  • The speed of your internet connection
  • Your level of familiarity with the Linux command line and the syntax/commands used in the guide
  • How much time you invest into understanding the steps that you complete along the way
  • Your willingness to read the guide thoroughly before executing each step (HIGHLY RECOMMENDED)
  • Your familiarity with the HPC design being implemented in the guide
  • Your willingness to plan before executing

Preliminary runs of the guide indicate that a standard user experience will take 15 to 20 hours of hands-on time.

Tip - Read the instructions carefully!

Make sure that you understand the instructions before executing them. You should know what it is that you are doing so that you can fix things if something does not work as expected.

Note - the OpenHPC recipe guide's input.local is equivalent to input.local.lab in this lab

When following the official OpenHPC install recipe, you are required to configure the file input.local according to your system.

Since this is a virtual lab (and to make things easier for the user) input.local is preconfigured and replaced with a simplified version - input.local.lab.

Please ensure that you read through and understand input.local.lab so that you can identify where things may have gone wrong, if they do, later in the virtual lab!

0.6 Why this Virtual Lab?


The main purpose of this training material is to provide a robust, easy-to-follow, standalone guide to be used as a virtual training activity to prepare for the deployment of HPC systems.

A great deal of time, research and peer review went into developing what we believe to be the best possible virtual OpenHPC 2.x training material for the HPC Ecosystems Community. At the time of this content development, the only other public on-demand OpenHPC virtual training material was our own OpenHPC 1.3.x virtual guide (to which this guide is the sequel!).

0.7 Materials Used


This OpenHPC 2.x virtual lab integrates the following materials:

0.8 Conventions


The examples in this guide follow a number of conventions:

  1. Input boxes are displayed as code boxes, where:

    • The text to the left of the # or $ symbol is the current working directory or path
    • The text to the right of the # or $ symbol is the input (the command and its parameters)

    For example, consider:

    [~/openhpc-2.x-virtual-lab/]#
    vagrant ssh
    
    - The current working directory is ~/openhpc-2.x-virtual-lab/ and
    - The input parameters are vagrant ssh.

  2. Variable substitution is indicated in two ways:

    • Within arrow brackets -- <your_variable_here>
    • Using environment variables defined in input.local.lab -- ${environment_variable}
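To illustrate the second form of substitution, the sketch below loads variables from a settings file and lets the shell expand them. The file contents and the variable names `example_hostname` and `example_node_count` are hypothetical stand-ins; the real input.local.lab defines its own variables and values. Placeholders in arrow brackets, by contrast, are not expanded by the shell and must be replaced by hand before running a command.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for input.local.lab, written to a temporary file.
# The real file defines its own variable names and values.
settings_file=$(mktemp)
cat > "$settings_file" <<'EOF'
example_hostname="smshost"
example_node_count=2
EOF

# Load the variables into the current shell session.
. "$settings_file"

# The shell replaces each ${...} with the value of that variable.
echo "Management node: ${example_hostname}"
echo "Compute nodes:   ${example_node_count}"

# Clean up the temporary file.
rm -f "$settings_file"
```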

0.9 Key


Additional information, tips, things to note, milestones and recaps are captured in this guide using the following callout box styles:

Note

Notes are intended as additional background information or observations to address obvious queries / questions / anomalies.

They will be presented in grey boxes.

Tip

Tips are intended to provide additional functionality and user tips to improve the user learning experience and deployment process.

They will be presented in blue boxes.

Important

Important warnings are included to advise of common pitfalls or mis-steps that may have a significant impact on the deployment process.

They will be presented in red boxes.

Recap

Recaps are intended to provide a summary of the important concepts and topics that have been covered at the conclusion of the current section.

They will be presented in purple boxes.

Feedback

Feedback will be presented as dropdown menus at the end of each chapter where you can provide us with valuable feedback by rating the chapter's content, presentation, etc. and by offering comments on how we might improve the product.

Extra Information

Drop-downs exist throughout the virtual lab to offer additional information to enhance the learning experience. Click and learn!

Congratulations

Milestones will be presented to mark significant achievements in the deployment process. They are also useful as progress markers when you seek feedback or assistance, or when you report on your progress.

They will be presented in green boxes.

And with that, congratulations - you're done with the Introduction chapter! On to Chapter 1...

