Software and Modules
Because HPC platforms serve large numbers of users and a broad spectrum of scientific and computational disciplines, they must support an astronomical (pun acknowledged…) range of software. Decades of experience managing shared HPC have taught administrators two important lessons:
- There is no one, perfect (or even functional) configuration.
- It is impractical for a sys admin, or even a team of admins, to attempt to dictate the best research software solutions.
Consequently, HPC administration has adopted tools and practices to provide central repositories of core packages, while permitting users to compile, install, mix, and match the software they need to do research. The general principles and tools employed include:
- Users will not have root access and cannot install software into the (typically) default system locations (e.g., /usr/local). Note this means that package managers such as apt, apt-get, yum, and brew will not be available to ordinary users.
- Users can still install software to various locations, including $HOME, $GROUP_HOME, $SCRATCH, $GROUP_SCRATCH, or any other space where the user has write and execute privileges.
- NOTE: The specific paths available for user installs (the previous item) will vary from one HPC platform to another. Also, installing to these paths may require setting some environment variables (e.g., $PATH, $LD_LIBRARY_PATH) to make the software available.
- Software is available as “modules”, using GNU modules, LMOD, or some other modules platform. These systems make software available, enabling users to configure their software environment, primarily by setting environment and path variables.
- Modules are not just for admins! Users can develop module scripts for their own custom installed software.
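As a quick illustration of the install locations listed above, the following sketch checks which candidate directories exist and are writable for the current user. The helper name can_install_to is hypothetical, and the variables $GROUP_HOME, $SCRATCH, and $GROUP_SCRATCH are assumed to be defined by the platform (they are on Sherlock, but may not be elsewhere).

```shell
# can_install_to DIR (hypothetical helper): succeed if DIR is non-empty,
# exists, and is both writable and searchable by the current user.
can_install_to() {
    [ -n "$1" ] && [ -d "$1" ] && [ -w "$1" ] && [ -x "$1" ]
}

# Assumption: these variables are set by the platform; unset ones are skipped.
for dir in "${HOME:-}" "${GROUP_HOME:-}" "${SCRATCH:-}" "${GROUP_SCRATCH:-}"; do
    if can_install_to "$dir"; then
        echo "ok:      $dir"
    else
        echo "skipped: ${dir:-<unset>}"
    fi
done
```

Any directory reported as ok is a plausible install target, subject to quota and purge policies on that file system.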
Modules
In most HPC environments, Sherlock included, the module system should be the first place to look for software.
Sherlock uses the popular LMOD module system. Module scripts are written for individual packages, or sometimes to make a group of packages available together.
To see available modules,
module avail
To look for a specific module, for example one with a name “like” NetCDF, use module spider:
module spider netcdf
Sherlock may include some hierarchical module stacks, and sets of modules that take advantage of LMOD’s built-in package and version interpretation.
Typically, LMOD/Lua modules are stored and interpreted like {package_name}/{version}.lua, and LMOD is smart enough to interpret some incomplete entries. For example, if my_sw is available in versions 2.1, 2.0.0, and 1.9, module load my_sw/ will load v2.1, unless a different default is defined; module load my_sw/2.0 and module load my_sw/1. will load versions 2.0.0 and 1.9, respectively.
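The version matching described above can be pictured with a small sketch. This is not LMOD's actual algorithm, only an illustration of "pick the highest version that starts with what the user typed"; the function name resolve_version is hypothetical, and sort -V (version sort) is assumed to be available, as it is with GNU coreutils.

```shell
# resolve_version PREFIX VERSION... (hypothetical helper)
# Print the highest VERSION that starts with PREFIX; an empty PREFIX matches all.
resolve_version() {
    prefix=$1
    shift
    for v in "$@"; do
        case "$v" in
            "$prefix"*) printf '%s\n' "$v" ;;   # keep versions matching the prefix
        esac
    done | sort -V | tail -n 1                  # highest version sorts last
}

resolve_version ""    2.1 2.0.0 1.9   # like: module load my_sw/
resolve_version "2.0" 2.1 2.0.0 1.9   # like: module load my_sw/2.0
resolve_version "1."  2.1 2.0.0 1.9   # like: module load my_sw/1.
```

A plain prefix match is deliberately naive (a prefix of 1 would also match 10.0), which is one reason to spell out the full version in scripts that need to be reproducible.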
In addition to the standard SW modules available on Sherlock, some SDSS-specific software, including CMG, dune, seissol, and Schlumberger applications, has been compiled (or installed) and can be loaded as modules. These modules can be activated with the module use command:
module use /home/groups/sh_s-dss/share/sdss/modules/modulefiles
General purpose, comprehensive SW stacks may also be available, but they are currently still very much in beta. If standard Sherlock software modules are not sufficient to compile and maintain your software, please request support.
Note that software and modules may have fairly strict compatibility requirements for their dependencies and supporting packages. In particular, some packages depend on a specific version of a compiler or MPI.
Custom modules
Custom modules can be maintained to facilitate access to lab- or personal-specific software or to load common bundles of software. Module files can be saved anywhere, but should follow certain standard operating procedures (SOPs), in order to be more easily discovered and interpreted, and to take advantage of LMOD’s directory-based hierarchical system.
Common places to keep user or group modules might be ${HOME}/.local/modulefiles, ${HOME}/local/modulefiles, $GROUP_HOME/.local/modulefiles, etc. These modules would then be activated (made available to the module system) by executing a module use command, e.g.:
module use ${HOME}/.local/modulefiles
The best way to learn how to write module scripts is to 1) search the internet for “LMOD Lua” syntax documentation, and 2) copy from existing module scripts on Sherlock. Many elements of most module scripts can be filled in automatically, so there is value in spending some time to write a good module script template. A quick way to find the path to a module script is to start with module show, e.g.:
module show gcc/12.4.0
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
/share/software/modules/devel/gcc/12.4.0.lua:
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
prepend_path("PATH","/share/software/user/open/gcc/12.4.0/bin")
prepend_path("LIBRARY_PATH","/share/software/user/open/gcc/12.4.0/lib")
prepend_path("LIBRARY_PATH","/share/software/user/open/gcc/12.4.0/lib64")
prepend_path("LD_LIBRARY_PATH","/share/software/user/open/gcc/12.4.0/lib")
prepend_path("LD_LIBRARY_PATH","/share/software/user/open/gcc/12.4.0/lib/gcc/x86_64-pc-linux-gnu")
prepend_path("LD_LIBRARY_PATH","/share/software/user/open/gcc/12.4.0/lib64")
prepend_path("CPATH","/share/software/user/open/gcc/12.4.0/include")
prepend_path("MANPATH","/share/software/user/open/gcc/12.4.0/share/man")
pushenv("CC","gcc")
pushenv("FC","gfortran")
pushenv("F90","gfortran")
pushenv("F77","gfortran")
pushenv("CPP","cpp")
pushenv("CXX","c++")
whatis("Name: gcc")
whatis("Version: 12.4.0")
whatis("Category: devel, compiler")
whatis("URL: http://gcc.gnu.org")
whatis("Description: The GNU Compiler Collection includes front ends for C, C++, Fortran, Java, and Go, as well as libraries for these languages (libstdc++, libgcj,...).")
Note that the first line of the output (between the separator lines) shows where to find the module script; the following lines show what the script renders.
Example: A py-sherlock/ module
As an example of a bundling module, consider Sherlock’s modular implementation of Python. Executing module load python/3.9 will load python@3.9 only. By design, this excludes numpy, matplotlib, and other very common Python packages. Python packages can then be built by the user, either manually or using pip/pip3, or in some cases they can be loaded as modules; for example, numpy can be provided for python@3.9 via module load py-numpy/1.24.2_py39.
Rather than repeatedly loading a long list of py-* Python extension modules, which can become cumbersome and lead to mistakes, it might be preferable to bundle common Python configurations into module scripts. For example, this py-sherlock/ module would load Python and several common numerical and plotting libraries:
-- lua
--
-- NOTE: not all of these LMOD dependencies stack up properly, but they will probably be fine...
--
depends_on('devel', 'viz', 'math', 'python/3.9')
depends_on('py-numpy/1.24.2_py39')
depends_on('py-matplotlib/3.7.1_py39')
--depends_on('py-scipy/1.6.3_py39')
depends_on('py-scipy/1.10.1_py39')
--depends_on('py-numpy/1.20.3_py39')
depends_on('py-pandas/2.0.1_py39')
--depends_on('py-numba/0.54.1_py39')
--depends_on('py-h5py/3.7.0_py39')
--
-- spoof numba module:
--depends_on("py-numpy/1.20.3_py39")
prepend_path("PATH","/share/software/user/open/py-numba/0.54.1_py39/bin")
prepend_path("PYTHONPATH","/share/software/user/open/py-numba/0.54.1_py39/lib/python3.9/site-packages")
--
depends_on('py-jupyter/1.0.0_py39', 'py-ipython/8.3.0_py39')
-- depends_on('py-scikit-learn/1.0.2_py39')
--
-- and spoof py-scikit-learn/
--depends_on("py-numpy/1.20.3_py39")
--depends_on("py-scipy/1.6.3_py39")
prepend_path("PYTHONPATH","/share/software/user/open/py-scikit-learn/1.0.2_py39/lib/python3.9/site-packages")
--
pushenv("PYTHON", "/share/software/user/open/python/3.9.0/bin/python3")
-- this looks like a good idea, but ultimately does not work well for many applications. But this is how you set an alias...
--set_alias("python", "python3")
In order to use this script:
- Save the script to a standard location, following the Lua/LMOD convention, for example ${HOME}/local/modulefiles/py-sherlock/3.9.0.lua
- Activate those modules (tell LMOD to look in that path for module scripts): module use ${HOME}/local/modulefiles
- module show py-sherlock/ will now show the module
- Load the module: module load py-sherlock/
Compile and Install Software
Pre-built packages, such as *.rpm and *.deb files, will typically not work properly without sudo access. Users will not be granted sudo access on the HPC, so this approach will not be an option.
That said, most software can be installed and/or compiled to a user-defined location. This is typically accomplished via something like:
- ./configure --prefix=$HOME/my_sw
- cmake -DCMAKE_INSTALL_PREFIX:PATH=$HOME/my_sw
- pip install --user my_sw_package
- conda install my_sw_package
It may then be necessary to amend some path variables. Assuming that we install binaries (executables) and libraries to $HOME/my_sw/bin and $HOME/my_sw/lib, respectively, add the following on the command line (for temporary scope), in your ~/.bashrc, or, even better, in a module file:
export PATH=$HOME/my_sw/bin:$PATH
export LD_LIBRARY_PATH=$HOME/my_sw/lib:$LD_LIBRARY_PATH
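When path variables are set in ~/.bashrc or in scripts that are sourced repeatedly, the same directory can be added many times, and an empty variable leaves a stray colon (an empty entry, which is treated as the current directory). The following sketch shows one way to avoid both; the helper name prepend_unique is hypothetical and not part of any standard tool.

```shell
# prepend_unique DIR CURRENT_VALUE (hypothetical helper)
# Print CURRENT_VALUE with DIR prepended, unless DIR is already in the list.
prepend_unique() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;      # already listed: leave unchanged
        "::")     printf '%s\n' "$1" ;;      # empty list: avoid a trailing colon
        *)        printf '%s\n' "$1:$2" ;;   # normal case: put DIR first
    esac
}

# Same effect as the exports above, but safe to run more than once.
export PATH="$(prepend_unique "$HOME/my_sw/bin" "$PATH")"
export LD_LIBRARY_PATH="$(prepend_unique "$HOME/my_sw/lib" "${LD_LIBRARY_PATH:-}")"
```

A module file achieves the same thing more cleanly, since the module system also removes the entries again when the module is unloaded.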
Configure and Compile
Generally speaking, the configure-and-compile process includes the following steps:
- Obtain the source code
- Configure the build
- Compile the code or “build” the project
- Install the compiled binaries to appropriate locations.
All of these steps are achieved at various levels of automation, using a handful of common tools and custom hack-jobs.
Obtain Source Code
Source code might be downloaded from a project or company website. Even better, and increasingly common, source code can often be downloaded or “cloned” directly from a project’s GitHub (or similar) code repository. Automated software build systems like spack will pull source code directly from GitHub repositories.
Configure
For more complex software in particular, meaning software with multiple standard dependencies like mpi, netcdf, hdf5, thread-based parallelization, or numerical libraries like blas, lapack, or mkl, ./configure or cmake can be used to find dependencies and configure a Makefile to use them. If your source code includes a ./configure script or CMakeLists.txt file, use it. If it has both, it might be necessary to determine whether ./configure or cmake is the better choice.
Very simple (or poorly packaged…) codes may use neither ./configure nor cmake, and it may be necessary to modify a Makefile or set some environment variables to compile on one system or another.
Build
The build step compiles and links the source code into executable binaries and libraries. For most software, this is the longest and most compute-intensive step. For large, complex codes, consider requesting a relatively large resource allocation and running the build on multiple cores. Note that this is almost always a single-task (thread-based parallel) process, and most compile jobs do not require a lot of memory, e.g.:
salloc --ntasks=1 --cpus-per-task=12 --mem-per-cpu=2g
make build
Note also that for most Makefiles, running make with no target builds the default target, which is usually equivalent to make build. An explicit build target is usually not necessary, unless it is.
Install
The install process copies, or “installs,” binaries and libraries to their intended location. On personal machines, this might be, by default, something like /usr/bin or /usr/lib. In an HPC environment, you will want to choose a smart location, for example, $GROUP_HOME/$USER/local/{sw_name}/{sw_version}. The “correct” place to install software is highly contextual, depending on, among other factors, your workflow habits, the type of SW, and whether the SW will be used with more than one other SW package.
As above, note that with make, the install step is sometimes implied. In some projects, the command make will 1) find the default Makefile, 2) build all components, then 3) install product binaries to a designated target directory. Conversely, there may be no install instructions written in the Makefile, in which case the install step is performed manually by copying executables and libraries to a desired location.
Similarly, make install often runs the build step first if it has not already been done. In the end, all of these steps are defined by human developers (or maybe AI bots emulating human developers), so the specific implementation will vary.
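To make the per-package, per-version install location concrete, here is a minimal sketch that builds such a prefix. The function name install_prefix is hypothetical, and the fallback to $HOME/local when $GROUP_HOME is not defined is an assumption made for illustration, not a site convention.

```shell
# install_prefix NAME VERSION (hypothetical helper)
# Print an install prefix of the form $GROUP_HOME/$USER/local/NAME/VERSION.
install_prefix() {
    if [ -n "${GROUP_HOME:-}" ]; then
        printf '%s/%s/local/%s/%s\n' "$GROUP_HOME" "${USER:-$(id -un)}" "$1" "$2"
    else
        # Assumption: fall back to a personal location when there is no group space.
        printf '%s/local/%s/%s\n' "$HOME" "$1" "$2"
    fi
}

prefix=$(install_prefix my_sw 1.0)
echo "would install to: $prefix"
# The prefix would then be passed to the build system, for example:
#   ./configure --prefix="$prefix"
#   cmake -DCMAKE_INSTALL_PREFIX="$prefix" ..
```

Keeping the version in the path makes it possible to install several versions side by side and to select among them with module files.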
Example: Abseil
Typical instructions for compiling software, as discussed above, read something like,
You’ll need a compiler and cmake, then do the standard:
module load gcc/12.4 cmake
cd ${SW_PATH}
mkdir build
cd build
cmake ../
make
make install
Rarely does this “just work,” but here is an example where it might. This example covers the three main phases of building SW in an HPC environment:
- Download the source code
- Compile and install the SW
- Write and enable a module script to “load” SW.
Downloading the Source Code
Even relatively simple compile jobs should be performed on a compute node, not a login node. Request resources using some variation of,
sh_dev
or
salloc -p serc --cpus-per-task=4 --mem-per-cpu=4g --ntasks=1
Source code can be downloaded from project websites, often as a *.zip or some sort of .tar file, including compressed versions such as *.tar.gz and *.tar.bz2. Just as often, if not more so, modern SW is best downloaded from a GitHub (or similar) repository. In this case,
git clone https://github.com/abseil/abseil-cpp.git
It is worth a quick review of the provider’s GitHub repository, https://github.com/abseil/abseil-cpp, to better understand your options with respect to what branches, tags, or other release versions are available. If a specific version of the SW is preferred, selecting that branch or tag, then downloading the corresponding zip file is often a good option – see figure below.

To download this specific tag, right-click Download Zip, select Copy Link, and use wget to download the file directly to your Sherlock location,
wget https://github.com/abseil/abseil-cpp/archive/refs/tags/20250814.1.zip
Configure and Compile
Either way, after cloning the repository or unpacking the zip file, you will end up with a subdirectory abseil-cpp, or, if you chose the 20250814.1 tag, that subdirectory might be abseil-cpp-20250814.1. Navigate to that directory and set up a cmake build:
module load cmake gcc/12.4.0
cd abseil-cpp
mkdir build
cd build
Compiling in a separate build directory is optional, but it is an excellent way to isolate your build environment and tools, so that you can ensure a clean start if you need one. The “build” directory can be named anything. Similarly, in this example, we choose gcc/12.4.0 as a compiler, but other compilers might be a fine choice.
Now, configure and compile. Note that this example instructs cmake to install the software into $GROUP_HOME/$USER/local/abseil, in other words, under your “user” subdirectory in your $GROUP_HOME space. If this directory does not exist, the install step (below) will create it. The cmake output for this package is mercifully short, so we have included it here:
$ cmake ../ -DCMAKE_INSTALL_PREFIX=${GROUP_HOME}/${USER}/local/abseil
-- The CXX compiler identification is GNU 12.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /share/software/user/open/gcc/12.4.0/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test ABSL_INTERNAL_AT_LEAST_CXX17
-- Performing Test ABSL_INTERNAL_AT_LEAST_CXX17 - Success
-- Performing Test ABSL_INTERNAL_AT_LEAST_CXX20
-- Performing Test ABSL_INTERNAL_AT_LEAST_CXX20 - Failed
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Configuring done (3.4s)
-- Generating done (5.4s)
-- Build files have been written to: /scratch/users/myoder96/R_dev/abseil/abseil-cpp-20250814.1/build
Some key elements to this output include:
- cmake found a c++ (CXX) compiler, GNU 12.4.0, and it appears to work.
- cmake also found a pthreads (threading) library.
For more complex builds, it can be helpful to review the cmake output to confirm that the correct mpi, netcdf, or other dependencies are found and being used. This can be especially true for conda users, since conda might install versions of some dependencies that conflict with a successful build. Once satisfied with the configuration, compile and install:
make -j $SLURM_CPUS_ON_NODE
Note that the variable $SLURM_CPUS_ON_NODE is set by the HPC scheduler (SLURM) to the number of processors available to this resource allocation on the node. If you are not running under a scheduler, or your HPC uses a different scheduler, this variable will not be populated, in which case make -j with no number places no limit on the number of parallel jobs. Omitting the -j option runs make on a single processor; substitutions like -j 4 are allowed. The number of “jobs” should typically equal the number of CPUs allocated on the node (for a single-task job, the number of CPUs allocated to the job).
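Because $SLURM_CPUS_ON_NODE only exists inside a SLURM allocation, a build script that should also work elsewhere needs a fallback. The sketch below is one way to choose a job count; the function name build_jobs is hypothetical, and nproc and getconf are assumed to be available, as they are on most Linux systems.

```shell
# build_jobs (hypothetical helper): print a sensible value for make -j.
build_jobs() {
    if [ -n "${SLURM_CPUS_ON_NODE:-}" ]; then
        printf '%s\n' "$SLURM_CPUS_ON_NODE"          # inside a SLURM allocation
    elif command -v nproc >/dev/null 2>&1; then
        nproc                                        # processors available here
    else
        getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
    fi
}

echo "make would run with -j $(build_jobs)"
# In a build directory, the actual call would be:
#   make -j "$(build_jobs)"
```

On a shared login node the fallback can report many more processors than you should use, which is one more reason to build inside an allocation.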
Expect output like,
[ 1%] Building CXX object absl/time/CMakeFiles/time_zone.dir/internal/cctz/src/time_zone_fixed.cc.o
[ 1%] Building CXX object absl/debugging/CMakeFiles/utf8_for_code_point.dir/internal/utf8_for_code_point.cc.o
[ 1%] Building CXX object absl/debugging/CMakeFiles/leak_check.dir/leak_check.cc.o
[ 1%] Building CXX object absl/time/CMakeFiles/civil_time.dir/internal/cctz/src/civil_time_detail.cc.o
[ 1%] Building CXX object absl/base/CMakeFiles/strerror.dir/internal/strerror.cc.o
[ 2%] Building CXX object absl/base/CMakeFiles/log_severity.dir/log_severity.cc.o
[ 2%] Building CXX object absl/base/CMakeFiles/spinlock_wait.dir/internal/spinlock_wait.cc.o
[ 2%] Building CXX object absl/profiling/CMakeFiles/exponential_biased.dir/internal/exponential_biased.cc.o
[ 2%] Linking CXX static library libabsl_leak_check.a
[ 3%] Linking CXX static library libabsl_utf8_for_code_point.a
[ 3%] Built target utf8_for_code_point
...
[ 99%] Building CXX object absl/flags/CMakeFiles/flags_parse.dir/parse.cc.o
[100%] Linking CXX static library libabsl_hashtable_profiler.a
[100%] Built target hashtable_profiler
[100%] Linking CXX static library libabsl_flags_parse.a
[100%] Built target flags_parse
If the job completes with errors, find the first error (subsequent errors probably cascade from the first one), resolve it, then repeat. It may be necessary, from time to time, to start the compile “clean,”
make clean
make -j $SLURM_CPUS_ON_NODE
or, in some cases, a very clean re-build:
cd ..
rm -rf build
mkdir build
cd build
cmake ../ -DCMAKE_INSTALL_PREFIX=${GROUP_HOME}/${USER}/local/abseil
make -j $SLURM_CPUS_ON_NODE
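The advice to resolve the first error first is easier to follow when the build output is saved to a log file. The following sketch searches such a log for the first line that looks like a compiler or make error. The function name first_error and the file name build.log are hypothetical, the search patterns are a guess at typical gcc and make messages, and grep -m is assumed to be supported (it is in GNU grep).

```shell
# first_error LOGFILE (hypothetical helper)
# Print the first line, with its line number, that looks like a compiler
# error ("error:") or a make failure ("***"). Fails if nothing matches.
first_error() {
    grep -n -m 1 -E 'error:|\*\*\*' "$1"
}

# Typical use inside a build directory (not run here):
#   make -j "$SLURM_CPUS_ON_NODE" 2>&1 | tee build.log
#   first_error build.log
```

With a parallel build, output from different jobs is interleaved, so re-running with a single job (make with no -j option) can make the first failure easier to read.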
When the build process completes successfully, install:
make install
In this case, make repeats the build step, so we probably could have just skipped to make install. We see output like,
[ 1%] Built target log_severity
[ 2%] Built target raw_logging_internal
[ 3%] Built target spinlock_wait
[ 5%] Built target base
...
Here, make verifies that the components have been compiled, then it installs:
[100%] Built target cordz_sample_token
Install the project...
-- Install configuration: ""
-- Installing: /home/groups/**/&&&&&&/local/abseil/lib64/pkgconfig/absl_atomic_hook.pc
-- Installing: /home/groups/**/&&&&&&/local/abseil/lib64/pkgconfig/absl_errno_saver.pc
-- Installing: /home/groups/**/&&&&&&/local/abseil/lib64/pkgconfig/absl_log_severity.pc
-- Installing: /home/groups/**/&&&&&&/local/abseil/lib64/libabsl_log_severity.a
...
-- Installing: /home/groups/**/&&&&&&/local/abseil/include/absl/debugging/internal/bounded_utf8_length_sequence.h
-- Installing: /home/groups/**/&&&&&&/local/abseil/include/absl/debugging/internal/stacktrace_aarch64-inl.inc
-- Installing: /home/groups/**/&&&&&&/local/abseil/include/absl/debugging/symbolize_elf.inc
-- Installing: /home/groups/**/&&&&&&/local/abseil/include/absl/base/options.h
A quick check of the target directory further suggests success:
$ ls -lh /home/groups/**/&&&&&&/local/abseil/
total 64K
drwxr-sr-x 3 &&&&&& ** 22 Nov 17 10:52 include
drwxr-sr-x 4 &&&&&& ** 4.0K Nov 17 10:52 lib64
Writing a module
In order to use this SW, we need to tell the OS where to find the binaries and libraries of interest. On any system, but especially in HPC, we do this by modifying various environment variables. In particular, the following variables each define a list of directories to be searched automatically:
- PATH: Executable binaries
- LD_LIBRARY_PATH: Shared libraries loaded at run time
- LIBRARY_PATH: Libraries searched at link (compile) time
- CPATH: Compile-time “include” files
We can set these manually. For example, in a script that runs code that uses abseil, we might add the following (note that this build installed its libraries under lib64, not lib):
export ABSEIL_PATH=/home/groups/**/&&&&&&/local/abseil
export LD_LIBRARY_PATH=${ABSEIL_PATH}/lib64:${LD_LIBRARY_PATH}
export LIBRARY_PATH=${ABSEIL_PATH}/lib64:${LIBRARY_PATH}
export CPATH=${ABSEIL_PATH}/include:${CPATH}
This might be inconvenient if we have multiple scripts that use this code, if we move the compiled code, or if we compile an upgraded version of abseil. To simplify this process, we can write our own module script that uses the HPC’s software module system; for example, Sherlock uses LMOD. Module scripts can be fairly advanced, but in their most basic form, the idea is to set some environment variables and prepend to the various PATH variables to make a piece of SW more visible to the operating system. Module systems typically define software names and versions from the directory structure and filename of the module script,
{MODULE_PATH_ROOT}/{SW_Name}/{SW_VERSION}
For example, on Sherlock the CMG software module includes several versions:
$ tree /home/groups/sh_s-dss/share/sdss/modules/modulefiles/CMG
/home/groups/sh_s-dss/share/sdss/modules/modulefiles/CMG
├── 2020.109.lua
├── 2022.101.lua
├── 2023.101.lua
├── 2023.108.lua
├── 2023.113.lua
└── 2025.104.lua
The preferred scripting language for LMOD is Lua, and there is a special set of Lua functions for writing LMOD scripts. Fortunately, LMOD uses the directory structure to define many of a module’s attributes, and LMOD-Lua includes functionality that capitalizes on this structure to streamline module templates. For example, we start with a general purpose template, to which we add a few modifications specific to this SW. First, create a modulefiles directory (if one does not already exist), copy the template into it, and make the necessary modifications. In this case, we expect that our SW will require gcc/12.4.0,
mkdir -p $GROUP_HOME/$USER/modules/modulefiles/abseil
cp /home/groups/sh_s-dss/share/sdss/modules/modulefiles/lua_template_user.txt $GROUP_HOME/$USER/modules/modulefiles/abseil/20250814.1.lua
vim $GROUP_HOME/$USER/modules/modulefiles/abseil/20250814.1.lua
The template itself looks like this:
$ cat /home/groups/sh_s-dss/share/sdss/modules/modulefiles/lua_template_user.txt
-- -*- lua -*-
--
-- vim:ft=lua:et:ts=4
-- TODO: save/copy as {package_name}/version.lua; add any custom tidbits.
--
local pkg = {}
local app = {}

-- get module name/version and build paths
pkg.name = myModuleName()
pkg.version = myModuleVersion()
pkg.id = pathJoin(pkg.name, pkg.version)

-- open or restricted software
pkg.lic = "open"

-- app paths
-- note: we usually build with gcc/12.4...
app.root = pathJoin(os.getenv("GROUP_HOME"), "local/software/no_arch/gcc/12.4.0/", pkg.name, pkg.version)
--
-- Some of these, eg. lib64, might not be relevant. Same goes for "-- set paths"
app.bin = pathJoin(app.root, "bin")
app.lib = pathJoin(app.root, "lib")
app.lib64 = pathJoin(app.root, "lib64")
app.incl = pathJoin(app.root, "include")
app.man = pathJoin(app.root, "share/man")

-- Dependencies:
depends_on('gcc/12.4.0')

-- set paths
prepend_path("PATH", app.bin)
prepend_path("LIBRARY_PATH", app.lib)
prepend_path("LIBRARY_PATH", app.lib64)
prepend_path("LD_LIBRARY_PATH", app.lib)
prepend_path("LD_LIBRARY_PATH", app.lib64)
prepend_path("CPATH", app.incl)
prepend_path("MANPATH", app.man)
prepend_path("PKG_CONFIG_PATH", pathJoin(app.lib, "pkgconfig"))

-- set env
pushenv(string.upper(pkg.name) .. "_ROOT", app.root)

-- module info
whatis("Name: " .. pkg.name)
whatis("Version: " .. pkg.version)
-- whatis("Category: " .. "devel, compiler")
whatis("URL: " .. "")
whatis("Description: " .. "Module for " .. pkg.name .. "@" .. pkg.version .. ".")
--
For this example, the key modification is app.root, which must point to the install prefix used above ($GROUP_HOME/$USER/local/abseil). Because this build placed its pkgconfig files under lib64, the PKG_CONFIG_PATH entry should use app.lib64 as well.
To use that module (and the SW), enable that subset of modules by pointing module use at the modulefiles directory (not at the .lua file itself), then load the module. Note that the --ignore_cache option is only needed after a module has changed:
module use $GROUP_HOME/$USER/modules/modulefiles
module --ignore_cache load abseil/
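As a complement to copying and editing a template by hand, a very small module file can also be generated from the shell. The sketch below writes a minimal Lua module file for a given install prefix. The function name write_modulefile is hypothetical, the choice of lib64 mirrors the abseil install shown above and is an assumption for other packages, and the Lua functions used (prepend_path, pathJoin, whatis) are the same ones that appear in the module scripts in this document.

```shell
# write_modulefile MODULE_ROOT NAME VERSION PREFIX (hypothetical helper)
# Create MODULE_ROOT/NAME/VERSION.lua pointing at the install PREFIX.
write_modulefile() {
    mkdir -p "$1/$2" || return 1
    cat > "$1/$2/$3.lua" <<EOF
-- Minimal LMOD module file for $2/$3 (generated sketch)
local root = "$4"
prepend_path("PATH",            pathJoin(root, "bin"))
prepend_path("LIBRARY_PATH",    pathJoin(root, "lib64"))
prepend_path("LD_LIBRARY_PATH", pathJoin(root, "lib64"))
prepend_path("CPATH",           pathJoin(root, "include"))
whatis("Name: $2")
whatis("Version: $3")
EOF
}

# Example call for the build above (not run here):
#   write_modulefile "$GROUP_HOME/$USER/modules/modulefiles" abseil 20250814.1 \
#       "$GROUP_HOME/$USER/local/abseil"
```

The generated file hard-codes the prefix, so it trades the flexibility of the template (which derives the name and version from the file path) for simplicity.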