About the role
This senior-level Cloud SRE position focuses on managing and optimizing high-performance Linux environments and cloud infrastructure. The role requires a strong background in automation, Infrastructure as Code, and reliability engineering to support critical manufacturing and design software platforms.
Banking, Onsite
Key Responsibilities
- Design, implement, and maintain scalable cloud infrastructure on AWS and Azure platforms
- Develop and optimize automated deployment pipelines using CI/CD tools and Infrastructure as Code
- Lead incident management and perform detailed root cause analysis to improve system availability
- Monitor system performance and resource utilization with Prometheus and Grafana to enable proactive scaling
- Manage Linux-based environments including kernel tuning, package management, and security hardening
- Collaborate with software engineering teams to define and implement Service Level Objectives (SLOs)
- Automate manual operational tasks through high-quality Python or Go scripting
- Drive architecture reviews and provide technical leadership for cloud-native migrations
- Implement and manage container orchestration platforms, primarily Kubernetes
Requirements
- Minimum of 8 years of professional experience in Site Reliability Engineering or Systems Administration
- Deep expertise in Linux operating system internals and command-line utilities
- Proven experience with Infrastructure as Code (IaC) using Terraform or CloudFormation
- Strong proficiency in automation scripting using Python, Bash, or Go
- Hands-on experience managing production workloads in AWS, GCP, or Azure environments
- Solid understanding of containerization and orchestration technologies such as Docker and Kubernetes
- Experience with configuration management tools such as Ansible, SaltStack, or Chef
- Familiarity with networking fundamentals, including TCP/IP, DNS, and load balancing
- Knowledge of modern monitoring, logging, and alerting systems
- Strong problem-solving skills and the ability to work in a fast-paced environment
- Bachelor's degree in Computer Science or a related technical field
- Ability to participate in an on-call rotation for production support