About the role
The Platform Operations Engineer is responsible for maintaining and supporting critical on-premises infrastructure, virtualisation platforms, and storage systems. The role focuses on implementing modern operational practices, infrastructure automation, and Infrastructure as Code (IaC) to ensure platform stability and reliability. Key activities include monitoring, performance optimisation, and L2/L3 technical support for complex IT environments.
Consulting, Onsite
Key Responsibilities
- Maintain critical infrastructure platforms, including compute, storage, virtualisation, and supporting systems, across development, staging, and production environments
- Follow and implement platform standards, executing infrastructure automation and modern operational practices to improve efficiency and reliability
- Support platform enhancement initiatives and implementation of new infrastructure solutions, ensuring alignment with enterprise architecture standards
- Manage virtualisation platforms (e.g., VMware, Hyper-V), including capacity monitoring, performance optimisation, and lifecycle management
- Implement and maintain robust monitoring and observability solutions for all platform components using modern tooling (e.g., Prometheus, Grafana, ELK stack)
- Execute platform patching strategies, leveraging automation to maintain security and stability while minimising service disruption
- Provide L2/L3 technical support for platform-related incidents, including problem determination and resolution
- Implement Infrastructure as Code (IaC) practices to automate platform provisioning and configuration management
- Maintain backup, disaster recovery (DR), and high-availability solutions for critical platform components
- Implement and adhere to security controls, including access management, security hardening, and compliance monitoring
Requirements
- Strong experience with enterprise virtualisation platforms (VMware vSphere, Hyper-V)
- Experience with storage systems (SAN, NAS) and enterprise backup solutions
- Proficiency in Linux and Windows Server administration
- Experience with infrastructure automation tools (Ansible, Puppet, Chef)
- Knowledge of container technologies (Docker, Kubernetes)
- Familiarity with monitoring and observability platforms
- Experience with Infrastructure as Code practices
- Understanding of networking concepts and technologies
- Scripting abilities (Python, PowerShell, Bash)
- Experience with high-availability and disaster recovery solutions
- Bachelor's degree in Computer Science, Information Technology, or a related field
- Experience in infrastructure operations and engineering
- Excellent problem-solving and analytical skills
- Effective communication skills with both technical and non-technical stakeholders