NICADD Compute Cluster

As of 2018, the NICADD cluster provides 700 processor slots (1.8-2.6 GHz) under the HTCondor batch system and 200 TB of shared disk space, with instant access to CERN, Fermilab, and OSG software libraries. The system also serves NICADD's data acquisition servers and desktops, which collect and analyze test data for Mu2e, DUNE, and beam experiments. Detailed hardware description.

Alma Linux 8 upgrade (2023)

We are preparing the NICADD cluster to run Alma Linux 8, to stay in sync with the NIU Metis system, which is expected soon. The upgrade is scheduled for the last week of May 2023. Currently, two cluster nodes, pcms5 and pcms6, have been upgraded to allow tests of critical applications.

Access to the upgraded nodes

  • ssh -Y
  • ssh pcms6 (this node is equipped with an NVIDIA P4 GPU card)

Both pcms5 and pcms6 run Alma Linux 8.7 and CVMFS (and thus the CVMFS-based tools of ATLAS, CMS, and DUNE), and provide local ROOT and custom compiler/linker tools via modules:

  • ROOT: module load root/root-6.26.06-gcc-11.3.0-cuda-11.8-pythia8307-el8
  • GCC+OPENMPI+CUDA: module load openmpi/openmpi-4.1.1-gcc-11.3.0-cuda-11.8-el8
  • INTEL-ONEAPI: module load intel/intel-oneapi-2022.1.2-el8
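As an illustration of the module commands above, the following sketch loads the ROOT module and prints the ROOT version. It assumes a bash session on pcms5 or pcms6 where the `module` command is available; the function name `load_root` is hypothetical, and `root-config --version` is the standard ROOT helper for reporting the installed version.

```shell
#!/bin/bash
# Module name copied from the list above.
ROOT_MODULE="root/root-6.26.06-gcc-11.3.0-cuda-11.8-pythia8307-el8"

# Hypothetical helper: load the ROOT module and report the ROOT version.
# Returns non-zero if the module system or ROOT is not available.
load_root() {
    if type module >/dev/null 2>&1; then
        module load "$ROOT_MODULE" && root-config --version
    else
        echo "module command not found; run this on pcms5 or pcms6" >&2
        return 1
    fi
}

if load_root; then
    echo "ROOT environment ready"
else
    echo "ROOT module not loaded"
fi
```

The same pattern should apply to the other two modules by changing the module name.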

Both nodes are included in a mini-cluster under the HTCondor 10.0.2 batch system; test jobs are welcome.
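A test job could look roughly like the sketch below, which writes a minimal HTCondor submit description and submits it only if the HTCondor client tools are installed. The file names, the `CONDOR_TEST_DIR` variable, and the 1 GB memory request are illustrative assumptions, not site settings; the job simply runs /bin/hostname to show which node executed it.

```shell
#!/bin/bash
# Working directory for the test job (CONDOR_TEST_DIR is a hypothetical override).
workdir="${CONDOR_TEST_DIR:-$(mktemp -d)}"
mkdir -p "$workdir"
submit_file="$workdir/test_job.sub"

# Minimal submit description: one vanilla-universe job that prints the host name.
# The quoted 'EOF' keeps $(Cluster) and $(Process) literal for HTCondor to expand.
cat > "$submit_file" <<'EOF'
universe       = vanilla
executable     = /bin/hostname
output         = test_job.$(Cluster).$(Process).out
error          = test_job.$(Cluster).$(Process).err
log            = test_job.$(Cluster).log
request_memory = 1GB
queue 1
EOF

echo "Wrote $submit_file"

# Submit only where the HTCondor client tools are present.
if command -v condor_submit >/dev/null 2>&1; then
    (cd "$workdir" && condor_submit "$submit_file" && condor_q)
else
    echo "condor_submit not found; skipping submission"
fi
```

After the job finishes, the `.out` file in the working directory should contain the name of the node that ran it.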

Node configuration

  • Interactive login nodes (1 Gbit/s public uplink, accessible via ssh)
    • (SL7) - ATLAS T3 login node
    • (SL7) - Beam group login node
    • (SL7) - CMS login/CUDA developer node

Note: for wireless access within NIU, please use the "NIUwireless" network (the "NIUguest" network blocks ssh ports).

  • HTCondor batch system (1-4 Gbit/s internal network)
    • Scientific Linux 7.X, 1.5-2.5 GB of RAM per batch slot, up to 128 GB for special tasks
    • Nodes: pt3wrk0-pt3wrk4 (T3 ATLAS), phpc0-phpc2 (beam), pcms0-pcms6 (CMS), pnc0-pnc3 (shared)
  • InfiniBand subcluster (SL7, ibhpc0 - ibhpc2)

A closer look at NICADD computing (slides).

Disk space configuration

  • /home/$USER - user's home area, 5GB initial quota
  • /bdata, /xdata - shared data partitions (80 TB each), for code development and analysis projects
  • /nfs/work/$nodename - node-local scratch areas (1-4 TB), for use by batch jobs
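A batch job might select its working directory along the lines of the sketch below. It assumes that $nodename is the short host name of the node and that users can create directories directly under /nfs/work/$nodename; both are assumptions, and the function name `pick_scratch` is hypothetical. When the scratch area is missing or not writable, the sketch falls back to the system temporary directory.

```shell
#!/bin/bash
# Hypothetical helper: print a fresh job directory on the node-local scratch
# area if it exists and is writable, otherwise under the system temp directory.
pick_scratch() {
    local node scratch
    # Assumption: $nodename corresponds to the short host name.
    node="$(hostname -s 2>/dev/null || uname -n)"
    scratch="/nfs/work/$node"
    if [ -d "$scratch" ] && [ -w "$scratch" ]; then
        mktemp -d "$scratch/${USER:-job}.XXXXXX"
    else
        mktemp -d
    fi
}

jobdir="$(pick_scratch)"
echo "Job directory: $jobdir"
# Run the job inside "$jobdir", then copy results to /bdata or /xdata
# and remove the directory when done.
```

Keeping intermediate files on the node-local scratch area and copying only final results to the shared partitions reduces load on the shared disks.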

Backup policy.

We provide only the previous day's backup of the /home area. We strongly encourage users to use Git for all code development and to back up essential data to remote locations frequently.

Page last modified on March 22, 2023, at 01:54 PM EST

Created: Sergey A. Uzunyan