A*STAR

HPC System Engineer, System, NSCC

National Supercomputing Centre

Location

Singapore

Department

National Supercomputing Centre

Posted

2 months ago

About This Role

The A*STAR HPC System Engineer role requires expertise in designing, optimizing, and maintaining NSCC's supercomputing infrastructure, including its compute, interconnect, and storage components.

Responsibilities

  • Evaluate HPC system architecture for compute, interconnects, and storage components.
  • Collaborate with administrators to ensure system reliability and performance.
  • Assist in performance tuning and root-cause analysis for complex issues.
  • Develop utility tools for system diagnostics and performance profiling.
  • Configure job schedulers (Slurm, PBS Pro) to maximize resource utilization and throughput.
  • Define security policies in collaboration with administrators.

Requirements

  • Degree in Computer Science, Engineering, IT, or another relevant field.
  • At least 3 years of experience in managing HPC systems.
  • Proficiency in UNIX/Linux environments and the command-line interface (CLI).
  • Experience with cluster management software (xCAT, BCM, PHPC, HPCM).
  • Experience with job scheduling and workload management software (Slurm or PBS Pro).
  • Strong knowledge of HPC storage principles and parallel file systems (Lustre, GPFS, BeeGFS).
  • Understanding of RDMA-based interconnects (InfiniBand, RoCE).
  • Basic knowledge of network protocols such as DHCP, DNS, and TFTP.