SDSS-CC Resources: Overview
Sherlock
The Sherlock HPC system is the University's compute cluster, purchased and supported with seed funding from the Provost and available for use by all Stanford faculty and their research teams. Sherlock offers free compute cycles to Stanford researchers and also allows PIs to purchase dedicated resources. Sherlock is maintained by the Stanford Research Computing Center (SRCC); more information can be found at https://www.sherlock.stanford.edu. Sherlock's collaborative, shared-resource approach provides scales of computing, varieties of available software, and levels of support that are not easily achieved by individual research groups or schools.
See also this onboarding slide deck for more information: SDSS-CfC_onboarding_20230419.pdf
Sherlock SERC Partition and Oak Storage:
In addition to general access to the public Sherlock compute resources (the normal, gpu, dev, bigmem, and owners partitions), SDSS users may also submit jobs to the serc partition on Sherlock, and storage is available on SRCC's oak platform. More information on how to access Sherlock and the serc partition can be found in the Sherlock and Oak documentation.
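As a sketch of how a job would be submitted to the serc partition, the following SLURM batch script uses standard sbatch directives; the job name, resource requests, and program name are illustrative placeholders, not SDSS-specific values:

```shell
#!/bin/bash
# Example SLURM batch script for the serc partition.
# All resource values below are illustrative; adjust for your workload.
#SBATCH --job-name=serc-test
#SBATCH --partition=serc
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=01:00:00
#SBATCH --output=%x-%j.out

# Report which node the job landed on, then run the workload.
echo "Running on $(hostname)"
srun ./my_analysis   # placeholder: replace with your actual program
```

Submit with `sbatch job.sh` and monitor with `squeue -u $USER`.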
The Sherlock cluster includes a broad, capable variety of computing tools. It is difficult to say exactly how large the cluster is and which specific resources are available at any given time, because the system is constantly in flux: users subscribe, nodes are added, and old nodes are swapped out for new ones. SDSS-CC Sherlock resources include:
- Traditional HPC “batch” computing, managed by SLURM
- Interactive sessions, including multi-core instances
- serc partition:
  - 128 x SH4_CBASE: 24 CPUs (1 x AMD Epyc 8224P), 192 GB RAM
  - 16 x SH4_CPERF: 64 CPUs (2 x AMD Epyc 9384X), 384 GB RAM
  - 4 x SH4_CSCALE: 256 CPUs (2 x AMD Epyc 9754), 1.5 TB RAM
  - 1 x SH4_G8TF64: 8 NVIDIA H100 GPUs, 64 CPUs
  - 200 x SH3_CBASE: 32 cores (AMD Epyc 7502), 256 GB RAM
  - 8 x SH3_CPERF: 128 cores (AMD Epyc 7742), 1024 GB RAM
  - 24 x 24 cores (Intel Skylake), 192/384 GB RAM (to be decommissioned some time in 2025)
  - 10 x 8 NVIDIA Tesla A100 GPUs, 128 CPU cores (AMD Epyc 7662), 1024 GB RAM
  - 2 x 4 NVIDIA Tesla A100 GPUs, 64 CPU cores (AMD Epyc), 512 GB RAM
  - 2 x 4 NVIDIA Tesla V100 GPUs, 24 CPU cores (Intel Skylake), 192 GB RAM (to be decommissioned some time in 2025)
- Sherlock owners partition: access to idle resources owned by other PI groups.
- Public partitions: normal, gpu, bigmem, dev
- Oak 1.35 PB storage: /oak/stanford/schools/ees/{PI SUNetID}
- ssh (requires 2-factor auth):
$ ssh sherlock.stanford.edu
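For the interactive sessions mentioned above, one standard SLURM approach is to request a shell on a compute node with srun; the partition name serc comes from this document, while the core, memory, and time values are placeholders:

```shell
# Request an interactive shell on a serc compute node.
# 4 cores, 8 GB RAM, and 1 hour are illustrative values.
srun --partition=serc --cpus-per-task=4 --mem=8G --time=01:00:00 --pty bash

# For GPU work, add a GPU request (available GPU types vary by node):
srun --partition=serc --gpus=1 --pty bash
```

The session ends when you exit the shell or the time limit expires.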
Google Cloud Platform (GCP)
For jobs, projects, and storage not well suited to shared HPC resources like Sherlock and Oak, Cloud resources may be available. Google Cloud Platform (GCP) is SDSS's principal Cloud computing provider. GCP allocations are made on a case-by-case basis and are typically made to accommodate:
- Websites and data portals
- Specialized compute or hardware requirements
- Lower-performance GPUs, for development and other less compute-intensive applications
If you or your group has a project that is not well served by shared HPC, please contact SDSS-CC staff.
More Information:
- Sherlock homepage: https://www.sherlock.stanford.edu
- Sherlock support docs: https://www.sherlock.stanford.edu/docs/overview/introduction/
- To view partition information:
sinfo --Node --long --partition=serc
- SDSS-CC website: https://sdss-compute.stanford.edu