About the role
Manage a large team of Production Support Personnel across 3 geographical locations, implement SRE principles, ensure SLA adherence, reduce MTTR, automate manual tasks, and drive stability and continuous improvement in banking applications.
Banking | Remote
Key Responsibilities
- Manage a large team of Production Support Personnel across 3 geographical locations
- Proactively manage SLAs on alerts and incidents, and reduce Mean Time To Recover (MTTR) by 20%
- Ensure strict adherence to Standard Operating Procedure for recovery
- Deliver a playbook for onboarding new tasks and activities into Production Support
- Identify opportunities to automate Production Support activities and reduce manual work
- Drive application improvements, from performance and operational enhancements to identifying and remediating system issues and automating toil
Requirements
- 10-12 years of strong experience in the Banking industry, including at least 5 years in a Run-the-Bank (RTB) lead role, with a proven track record in a Banking environment
- Implement Site Reliability Engineering (SRE) principles for performance, reliability, monitoring, alerting, and maintenance in the Production environment
- Proactive capacity monitoring and observability of Production infrastructure, including automated alerting, performance monitoring, and reporting tools
- Automation of manual tasks in Production Support
- Build and maintain Production monitoring and automation solutions
- Build and implement service improvements. Identify, measure, and periodically report performance trends (SLIs, SLOs, SLAs), and improve system performance and the associated KPIs