We are seeking a Senior Software Engineer / SRE to support platform reliability, monitoring, and modernization initiatives. This role combines software engineering and site reliability engineering, with a strong emphasis on Kubernetes, observability, and cloud infrastructure. You will help improve system reliability, automate operational workflows, and support the organization’s Datadog-based observability environment.
Key Responsibilities
Support platform reliability, monitoring, and modernization initiatives
Design, consume, and integrate APIs across internal systems and services
Work in Kubernetes environments across deployment, operations, and monitoring
Build and improve observability capabilities across dashboards, alerts, tracing, metrics, and logging
Monitor containerized and microservices-based architectures
Integrate observability tooling into AWS environments
Support observability integrations in CI/CD pipelines
Automate monitoring and operational tasks through scripting, preferably in Python
Help own and operate internal engineering platform capabilities, with particular emphasis on observability platforms
Drive proactive maintenance efforts and platform improvements focused on reliability, scalability, and performance
Install and configure Datadog agents and integrations
Manage API keys and secure configuration practices for observability tooling
Manage user roles and access controls within observability platforms
Provide operational and training support related to Datadog
Qualifications
Must-Have Skills
Strong proficiency in at least one of the following: Python, JavaScript (Node.js), or Java
Hands-on experience with API integrations
Strong experience working in Kubernetes environments
Experience with Datadog or similar observability tools such as Prometheus or Grafana
Ability to configure dashboards, alerts, and APM
Experience with tracing, metrics, and logging
Experience monitoring containerized or microservices-based architectures
Hands-on experience with AWS
Experience integrating observability tools into cloud environments
Experience with CI/CD integrations for observability
Ability to automate monitoring and operational tasks using scripting
Demonstrated ownership of reliability, scalability, and performance improvements
Experience installing and configuring observability agents and integrations
Experience managing secure configurations, API keys, and user access within observability platforms
Enterprise experience strongly preferred
Nice-to-Have Skills
Familiarity with Go (Golang)
Experience with New Relic
Experience with Dynatrace
Experience with Elastic
Experience with Splunk Observability
Required Tools & Platforms
Kubernetes
Datadog
AWS
Prometheus
Grafana
Additional Information
Location, Time & Engagement
Contract role
Full-time, 40 hours per week
Remote within APAC
Must be able to work hours overlapping U.S. Pacific Time