10 Anson Road, Central Region, Singapore
Job Description
Responsibilities
Design, develop, and implement advanced SRE tooling and automation solutions using Python and Java to improve system reliability and operational efficiency across the infrastructure lifecycle
Lead the adoption and continuous improvement of CI/CD pipelines by applying DevOps best practices and tools to enable rapid, secure, and reliable software delivery and infrastructure provisioning
Proactively monitor, troubleshoot, and optimize system performance and reliability by resolving complex incidents, performing root cause analysis, and implementing preventive automation
Collaborate with development and operations teams to embed SRE principles, define and maintain Service Level Objectives (SLOs), and ensure scalable, resilient system architecture and disaster recovery readiness
Provide secondary support and technical expertise for messaging and middleware platforms, including Kafka and MQ, by assisting with administration, configuration, performance tuning, and incident resolution
Conduct architectural reviews to identify infrastructure gaps, remediate network vulnerabilities, and advise application teams on operational excellence, security, and compliance best practices
Required competencies and certifications
Demonstrated expertise in Python and Java for automation, scripting, and SRE tool development
Hands-on experience in Linux system administration
Proven experience with CI/CD practices and DevOps toolchains such as Jenkins and GitLab CI
Proficiency in version control systems (Git) and agile project management tools (Jira)
Strong understanding and application of SRE principles including SLOs, error budgets, monitoring, alerting, and incident management
Foundational to intermediate knowledge of, and practical experience with, Kafka and MQ messaging and middleware technologies, including administration and troubleshooting
Exceptional written and verbal communication skills, and a solid understanding of ITIL processes
3 to 6 years of progressive experience in DevOps, SRE, or technical infrastructure roles, preferably supporting mission-critical banking systems