About the role
Manage a large team of Production Support personnel across 3 locations: ensure SLAs are met, reduce MTTR, enforce adherence to SOPs, manage attrition, deliver an onboarding playbook, identify automation opportunities, implement SRE principles, and drive incident and problem management.
Banking, Remote
Key Responsibilities
- Manage a large team of Production Support Personnel (> 200 personnel) across 3 geographical locations
- Ensure SLAs on alerts and incidents are proactively managed, and reduce Mean Time To Recover (MTTR) by 20%
- Ensure strict adherence to Standard Operating Procedure for recovery
- Manage attrition within 10%
- Deliver a playbook for onboarding new tasks / activities into Production Support
- Identify opportunities to automate Production Support activities and reduce manual work
Requirements
- 12-15 years of strong experience in the Banking industry, including at least 7 years in a Run-the-Bank (RTB) lead role, with a proven track record of working in a Banking environment
- Implement Site Reliability Engineering principles with regard to performance, reliability, monitoring, alerting and maintenance in the Production environment, including proactive capacity monitoring and observability of Production infrastructure, automated alerting, and performance monitoring and reporting tools
- Automate manual tasks in Production Support
- Build and maintain Production monitoring and automation solutions
- Build and implement service improvements. Periodically identify, measure and report performance trends (SLIs / SLOs / SLAs), and improve system performance and the associated KPIs
- Sound understanding of RDBMS / Unix / Cloud / large banking applications