PIConGPU in 5 Minutes on Hemera
A guide to run, but not understand PIConGPU. It is aimed at users of the high performance computing (HPC) cluster “Hemera” at the HZDR, but should be applicable to other HPC clusters with slight adjustments.
This guide needs shell access (probably via ssh) and git. Consider getting familiar with the shell (command line, usually bash) and git. Please also read the tutorial for your local HPC cluster.
See also
- resources for the command line (bash)
- resources for git: the official tutorial (also available as the man page gittutorial(7)), the w3schools tutorial, a brief introduction, and the GitHub cheatsheet
- Hemera at HZDR: official website, presentation, internal links (wiki, storage layout)
We will use the following directories:
- ~/src/picongpu: source files from GitHub
- ~/fwkt_v100_picongpu.profile: loads the dependencies for your local environment
- ~/picongpu-projects: scenarios to simulate
- /bigdata/hplsim/external/alice: result data of the simulation runs (scratch storage)
Please replace them whenever appropriate.
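All of these directories will be created over the course of this guide; if you want to set them up right away, a minimal sketch (replace alice with your username):
mkdir -p ~/src ~/picongpu-projects
mkdir -p /bigdata/hplsim/external/alice   # scratch storage, see the Setup section below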
Get the Source
Use git to obtain the current dev branch of the source code and put it into ~/src/picongpu:
mkdir -p ~/src
git clone https://github.com/ComputationalRadiationPhysics/picongpu ~/src/picongpu
Note
If you get the error git: command not found, load git by invoking module load git and try again.
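In that case, the following should get you going again:
module load git      # load git via the environment modules system
git --version        # verify that git is now available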
Attention: this example uses the dev branch instead of the latest stable release, because the module configuration of the last release might be outdated due to driver changes on Hemera.
Setup
PIConGPU has a lot of dependencies.
Luckily, other people have already done the work and prepared a profile that you can use. Copy it to your home directory:
cp ~/src/picongpu/etc/picongpu/hemera-hzdr/fwkt_v100_picongpu.profile.example ~/fwkt_v100_picongpu.profile
This profile determines which part of the HPC cluster (partition, also: queue) – and thereby the compute device(s) (type of CPUs/GPUs) – you will use. This particular profile will use NVIDIA Volta V100 GPUs.
You can view the full list of available profiles on github (look for NAME.profile.example).
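If you prefer the shell over the browser, you can also list the Hemera profile examples directly in your source checkout (assuming the location from the previous step):
ls ~/src/picongpu/etc/picongpu/hemera-hzdr/*.profile.example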
For this guide we will add our scratch directory location to this profile.
Edit the profile file using your favorite editor.
If unsure, use nano: nano ~/fwkt_v100_picongpu.profile (save with Control-o, exit with Control-x).
Go to the end of the file and add a new line:
export SCRATCH=/bigdata/hplsim/external/alice
(Please replace alice with your username.)
Note
This is the location where runtime data and all results will be stored. If you're not on Hemera, make sure you select the correct directory: consult the documentation of your HPC cluster to find out where to save your data. On HPC clusters this is probably not your home directory.
In the profile file you can also supply additional settings, like your email address and notification settings.
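For example, the profile examples typically contain commented-out variables for mail notification; the exact variable names may differ between versions, so treat this as a sketch and check the comments in your copy of the profile:
export MY_MAILNOTIFY="ALL"                # when to be notified: NONE, BEGIN, END, FAIL, ALL
export MY_MAIL="alice@example.com"        # your email address
export MY_NAME="$(whoami) <$MY_MAIL>"     # name used for the notification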
Now activate your profile:
source ~/fwkt_v100_picongpu.profile
Warning
You will have to repeat this command every time you want to use PIConGPU on a new shell, i.e. after logging in.
Now test your new profile:
echo $SCRATCH
That should print your data directory. If that works make sure that this directory actually exists by executing:
mkdir -p $SCRATCH
ls -lah $SCRATCH
If you see output similar to this, everything worked and you can carry on:
total 0
drwxr-xr-x 2 alice fwt 40 Nov 12 10:09 .
drwxrwxrwt 17 root root 400 Nov 12 10:09 ..
Create a Scenario
As an example we will use the predefined LaserWakefield example. Create a directory and copy it:
mkdir -p ~/picongpu-projects/tinkering
pic-create $PIC_EXAMPLES/LaserWakefield ~/picongpu-projects/tinkering/try01
cd ~/picongpu-projects/tinkering/try01
Usually you would now adjust the files in the newly created directory ~/picongpu-projects/tinkering/try01; for this introduction we will use the parameters as provided.
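For orientation, the compile-time parameters of a scenario usually live under include/picongpu/param/ and the runtime configurations under etc/picongpu/ inside the project directory; a quick look (assuming this standard layout):
ls ~/picongpu-projects/tinkering/try01/include/picongpu/param/   # compile-time parameters (*.param)
ls ~/picongpu-projects/tinkering/try01/etc/picongpu/             # runtime configurations (*.cfg)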
Note
The command pic-create and the variable $PIC_EXAMPLES have been provided because you loaded the file ~/fwkt_v100_picongpu.profile in the previous step.
If this fails (printing pic-create: command not found), make sure you load the PIConGPU profile by executing source ~/fwkt_v100_picongpu.profile.
Compile and Run
Now use a compute node. Your profile provides a helper command for that:
getDevice
(You can now run hostname to see which node you are using.)
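To double-check that you actually got a GPU node, you can query the GPUs (assuming the NVIDIA driver tools are available on the node):
nvidia-smi    # should list the node's NVIDIA V100 GPUs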
Now build the scenario:
# switch to the scenario directory if you haven't already
cd ~/picongpu-projects/tinkering/try01
pic-build
This will take a while; go grab a coffee. If it fails, read the manual or ask a colleague.
After a successful build, run (still on the compute node, still inside your scenario directory):
tbg -s bash -t $PICSRC/etc/picongpu/bash/mpiexec.tpl -c etc/picongpu/1.cfg $SCRATCH/tinkering/try01/run01
- tbg: tool provided by PIConGPU
- bash: the “submit system”, e.g. use sbatch for SLURM
- $PICSRC: the path to your PIConGPU source code, automatically set when sourcing fwkt_v100_picongpu.profile
- $PICSRC/etc/picongpu/bash/mpiexec.tpl: options for the chosen submit system
- etc/picongpu/1.cfg: runtime options (number of GPUs, etc.; see the quick look below)
- $SCRATCH/tinkering/try01/run01: not-yet-existing destination for your result files
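To get a feeling for what the .cfg file controls, you can peek at its TBG_ variables; the exact names depend on the example, but typically include the number of devices per dimension and the number of time steps:
grep "^TBG_" etc/picongpu/1.cfg    # e.g. TBG_devices_x/y/z, TBG_steps, ...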
Note
Usually you would use the workload manager (SLURM on Hemera) to submit your jobs instead of running them interactively like we just did. You can try that with:
# go back to the login node
exit
hostname
# ...should now display hemera4.cluster or hemera5.cluster
# resubmit your simulation with a new directory:
tbg -s sbatch -c etc/picongpu/1.cfg -t etc/picongpu/hemera-hzdr/fwkt_v100.tpl $SCRATCH/tinkering/try01/run02
This will print a confirmation message (e.g. Submitted batch job 3769365), but no output of PIConGPU itself will be printed.
Using squeue -u $USER you can view the current status of your job.
Note that we not only used a different “submit system” sbatch, but also changed the template file to etc/picongpu/hemera-hzdr/fwkt_v100.tpl.
(This template file is located directly in your project directory.)
Both profile and template file are built for the same compute device, the NVIDIA Volta “V100” GPU.
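Monitoring and cancelling batch jobs works with the usual SLURM commands (a sketch; use the job ID printed at submission):
squeue -u $USER    # list your queued and running jobs
scancel 3769365    # cancel a job, using the ID from "Submitted batch job ..."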
Examine the Results
Results are located at $SCRATCH/tinkering/try01/run01.
To view pretty pictures from a Linux workstation you can use the following process (execute on your workstation, not on the HPC cluster):
# Create a “mount point” (empty directory)
mkdir -p ~/mnt/scratch
# Mount the data directory using sshfs
sshfs -o default_permissions -o idmap=user -o uid=$(id -u) -o gid=$(id -g) hemera5:DATADIR ~/mnt/scratch/
Substitute DATADIR with the full path to your data (scratch) directory, e.g. /bigdata/hplsim/external/alice.
Browse the directory using a file browser/image viewer.
Check out ~/mnt/scratch/tinkering/try01/run01/simOutput/pngElectronsYX/ for image files.
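Once you are done, you can unmount the data directory again (standard FUSE tooling on Linux):
fusermount -u ~/mnt/scratch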
Further Reading
You now know the process of using PIConGPU. Carry on reading the documentation to understand it.