OCBC

Cloud SRE Engineer - Linux (VP/AVP)

Banking | OCBC Singapore | Onsite | Posted 2 weeks ago

About the role

This senior-level Cloud SRE position focuses on managing and optimizing high-performance Linux environments and cloud infrastructure. The role requires a strong background in automation, Infrastructure as Code, and reliability engineering to support critical manufacturing and design software platforms.

Key Responsibilities

  • Design, implement, and maintain scalable cloud infrastructure on AWS and Azure platforms
  • Develop and optimize automated deployment pipelines using CI/CD tools and Infrastructure as Code
  • Lead incident management and perform detailed root cause analysis to improve system availability
  • Monitor system performance and resource utilization using Prometheus and Grafana for proactive scaling
  • Manage Linux-based environments including kernel tuning, package management, and security hardening
  • Collaborate with software engineering teams to define and implement Service Level Objectives (SLOs)
  • Automate manual operational tasks through high-quality Python or Go scripting
  • Drive architecture reviews and provide technical leadership for cloud-native migrations
  • Implement and manage container orchestration platforms, primarily Kubernetes

Requirements

  • Minimum of 8 years of professional experience in Site Reliability Engineering or Systems Administration
  • Deep expertise in Linux operating system internals and command-line utilities
  • Proven experience with Infrastructure as Code (IaC) using Terraform or CloudFormation
  • Strong proficiency in automation scripting using Python, Bash, or Go
  • Hands-on experience managing production workloads in AWS, GCP, or Azure environments
  • Solid understanding of containerization and orchestration technologies such as Docker and Kubernetes
  • Experience with configuration management tools such as Ansible, SaltStack, or Chef
  • Familiarity with networking protocols including TCP/IP, DNS, and load balancing configurations
  • Knowledge of modern monitoring, logging, and alerting systems
  • Strong problem-solving skills and the ability to work in a fast-paced environment
  • Bachelor's degree in Computer Science or a related technical field
  • Ability to participate in an on-call rotation for production support