Following an excellent two-hour run-through that involved practical work and covered enough ground to give a general idea of how grid computing works, I was given access to the HPC Wales network for the next year. Some of it I was already vaguely familiar with, having studied parallel computing during my own failed attempt to build a working (Quantian OS) Linux cluster many years ago.
HPC Wales has been in the works since at least 2010. Now operational with roughly 17,000 processor cores from supercomputing clusters at several universities across Wales, it’s the largest unclassified distributed computing network in the United Kingdom.
The bulk of it, I understand, is based at Swansea and Cardiff universities. We have a couple of rackmount systems at the University of Glamorgan (now part of the University of South Wales) hooked up to it, which I estimate adds roughly 900 processor cores to the network. I was hoping the excellent computing department at Newport University would be included when I first read about this, but that hasn’t happened (yet).
Because of the steps taken to minimise overhead, the process of submitting jobs is very specific and isn’t straightforward, hence the need for computer scientists with a UNIX/Linux background to deliver the training and support. Everything was done purely through the command line, although there is a workaround for anyone who desperately wants a GUI: Xming, an X server that runs on the local Windows machine and renders the interfaces of graphical applications running on the remote system.
Even the programs must (or really should) be compiled and optimised specifically for the platform, using the Intel compilers rather than gcc. That means jobs are typically submitted as archived source files. Of course, it’s up to the scientists from whatever field to provide the initial FORTRAN code, or at least the specifics of what needs processing.
Hello World (Supercomputing Version)
I did mention there was some hands-on work in the session. As with all programming tutorials there’s a ‘Hello World’ example, which we did in FORTRAN to demonstrate the overall process. The program itself wasn’t parallelised, but it called functions from the MPI libraries. It was submitted with a job script (run.lsf) that defined the number of processor cores to run it on, and in this case I used 24 cores (and later 96 cores) on the University of Glamorgan cluster.
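I don’t have the course files to hand, but a minimal MPI ‘Hello World’ in FORTRAN looks something like this (my own sketch, not the session’s exact code; the program and variable names are illustrative):

```fortran
! Minimal MPI 'Hello World' sketch. Each process reports its rank
! (its ID within the job) and the total number of processes.
program hello
    use mpi
    implicit none
    integer :: ierr, rank, nprocs

    call MPI_Init(ierr)                              ! start the MPI environment
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)   ! which process am I?
    call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr) ! how many processes in total?
    print '(A,I0,A,I0)', 'Hello World from rank ', rank, ' of ', nprocs
    call MPI_Finalize(ierr)                          ! shut the environment down
end program hello
```

Run on 24 cores, each core prints its own line, which is how the log file ends up with one ‘Hello World’ per processor.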
The key line in run.lsf looks rather like a Bash command. Here ‘$NPROC‘ is the number of processor cores. The file ‘hello.exe‘ wasn’t actually a Windows executable; it was given that extension simply to mark it as a compiled program when viewed in a PuTTY session. After execution the results are piped to a log file named after the job ID:
mpirun -r ssh -n $NPROC ./hello.exe >& log.HELLO.$LSB_JOBID
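For context, a complete run.lsf wrapping that line might look something like the following. This is a sketch only: the ‘#BSUB’ directives are standard LSF scheduler options, but the exact set used on the course (and how ‘$NPROC’ gets set) will depend on the site configuration.

```shell
#!/bin/bash
#BSUB -J HELLO          # job name
#BSUB -n 24             # number of processor cores requested
#BSUB -o stdout.%J      # scheduler output file (%J expands to the job ID)
#BSUB -e stderr.%J      # scheduler error file

mpirun -r ssh -n $NPROC ./hello.exe >& log.HELLO.$LSB_JOBID
```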
After submitting the compiled program and then quickly running the ‘bjobs‘ command (there’s nothing funny about ‘bjobs’!), the output showed which cores were being assigned to the task. I couldn’t remember the command option that lists the assignments in real-time.
After roughly 30 seconds, ‘bjobs‘ was run again to see whether the job was still queued. It wasn’t, so chances are the log file would show that all 24 processors had executed the ‘Hello World’ program.
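The submit-and-monitor cycle, roughly as performed in the session, goes something like this (a sketch: ‘bsub’, ‘bjobs’ and ‘bhist’ are standard LSF commands, though the site setup may differ, and the job ID shown is made up):

```shell
bsub < run.lsf        # submit the job script to the LSF scheduler
bjobs                 # list your jobs: state, queue, and assigned hosts
bjobs -l 12345        # long-format details for one job ID (12345 is illustrative)
bhist -l 12345        # history of that job's state changes
cat log.HELLO.12345   # read the program output once the job has finished
```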
Applications in Real-World Projects
The program described above sent identical instructions to multiple processors. With a lot of work and FORTRAN skill, it’s possible to break a large job into smaller sections that are distributed across however many processor cores are available. For something like weather prediction or a simulated world, the environment being modelled would typically be divided into small areas, with the variables for each handled by a separate processor. Values would be passed between them and the results later aggregated to provide the overall picture: the more processors are used, the more granular and accurate the simulation.
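That passing of values between neighbouring areas is usually called a ‘halo exchange’. As a rough illustration (my own sketch, not code from the course), here is a one-dimensional version in FORTRAN with MPI, where each process owns a strip of the model and swaps its edge values with its neighbours before each update:

```fortran
! 1-D domain decomposition sketch: each rank owns n cells plus two
! 'halo' cells (u(0) and u(n+1)) holding copies of its neighbours' edges.
program halo_sketch
    use mpi
    implicit none
    integer, parameter :: n = 8          ! cells owned by each rank
    real :: u(0:n+1)                     ! local strip plus halo cells
    integer :: ierr, rank, nprocs, left, right, step
    integer :: status(MPI_STATUS_SIZE)

    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
    ! ranks at the ends of the line have no neighbour on one side
    left  = merge(MPI_PROC_NULL, rank - 1, rank == 0)
    right = merge(MPI_PROC_NULL, rank + 1, rank == nprocs - 1)

    u = real(rank)                       ! toy initial state
    do step = 1, 10
        ! send my right edge to the right neighbour, receive its left edge
        call MPI_Sendrecv(u(n),   1, MPI_REAL, right, 0, &
                          u(0),   1, MPI_REAL, left,  0, &
                          MPI_COMM_WORLD, status, ierr)
        ! send my left edge to the left neighbour, receive its right edge
        call MPI_Sendrecv(u(1),   1, MPI_REAL, left,  1, &
                          u(n+1), 1, MPI_REAL, right, 1, &
                          MPI_COMM_WORLD, status, ierr)
        ! simple smoothing update using the freshly exchanged neighbour values
        u(1:n) = 0.5 * u(1:n) + 0.25 * (u(0:n-1) + u(2:n+1))
    end do
    call MPI_Finalize(ierr)
end program halo_sketch
```

A real weather model does the same thing in two or three dimensions with far more variables per cell, but the pattern of exchange-then-update is the same.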