Singapore Airlines

Information Technology - Lead Software Engineer (AI Ops and Resilience)

Aviation | Onsite | Posted 1 month ago

About the role

The Lead Software Engineer for AI Ops and Resilience will drive the evolution of IT operations through intelligent automation and hands-on engineering. This role focuses on reimagining ITSM practices and developing predictive, self-healing IT capabilities using AI/ML frameworks and modern automation tools. Key duties include leading automation deployments, mentoring engineering teams, and overseeing IT Command Centre operations to ensure high service reliability.

Key Responsibilities

  • Reimagine and enhance core ITSM practices (Incident, Problem, Change, and Knowledge Management) using modern development frameworks and automation tools.
  • Design, prototype, and implement AI-driven operational tools, including predictive incident detection, automated remediation workflows, and LLM-based knowledge agents.
  • Lead the development and deployment of custom automation solutions to improve IT service reliability and reduce manual workload across ITSM domains.
  • Collaborate with platform teams, enterprise architects, and developers to conceptualize and build next-generation IT operational capabilities.
  • Provide mentorship and guidance to ITSM IPC Engineers, ensuring effective execution and governance of processes aligned with ITIL best practices.
  • Act as the primary liaison between internal stakeholders and external service providers for the IT Command Centre and Helpdesk.
  • Monitor and manage the performance of vendor-managed services to ensure SLA and KPI compliance.
  • Participate in service reviews, audits, and performance assessments while supporting escalation management and root cause analysis efforts.

Requirements

  • Bachelor's Degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 5+ years of experience in IT operations or substantial exposure to ITSM processes and tooling.
  • Strong understanding of the ITIL framework and ITSM best practices; ITIL v3/v4 certification is preferred.
  • Hands-on experience with automation tools, scripting, and AI/ML technologies relevant to IT operations.
  • Proficient with ITSM platforms such as ServiceNow, BMC Remedy, or similar tools.
  • Demonstrated ability to mentor technical teams and lead cross-functional collaboration.
  • Excellent problem-solving, communication, and stakeholder management skills.
  • Hands-on software development or scripting experience in Python, JavaScript (Node.js), or similar languages.
  • Experience with monitoring and observability platforms like Splunk, Grafana, ScienceLogic, or equivalent is advantageous.
  • Familiarity with CI/CD pipelines, GitOps practices, cloud platforms (AWS, Azure, GCP), and Infrastructure-as-Code (IaC) tools.
  • Proficiency with AI/ML frameworks and tools such as TensorFlow, scikit-learn, LangChain, and OpenAI APIs is a strong advantage.