We are seeking an experienced AWS Monitoring & Observability Engineer to design, implement, and standardize monitoring, alerting, logging, and dashboarding solutions across AWS environments. The successful candidate will play a key role in establishing operational visibility, improving incident response, and delivering reusable observability frameworks using AWS-native services and Infrastructure-as-Code (IaC) practices.
This role is ideal for professionals with strong expertise in Amazon CloudWatch, Amazon Managed Grafana, Terraform, and cloud monitoring best practices.
Key Responsibilities
Monitoring & Observability
- Design and implement end-to-end monitoring solutions across AWS environments.
- Configure and manage Amazon CloudWatch Metrics, Logs, Dashboards, and Alarms.
- Develop monitoring standards, naming conventions, and reusable templates.
- Implement centralized logging and log retention strategies.
- Establish monitoring and alerting best practices across cloud platforms.
Dashboard Development
- Develop and maintain reusable dashboards using Amazon Managed Grafana.
- Create operational, infrastructure, application, security, and executive dashboards.
- Build dashboards covering AWS services including EC2, ECS/Fargate, EKS, RDS, Lambda, API Gateway, and load balancers.
- Implement dashboard templating and multi-environment support.
Infrastructure as Code
- Build and maintain reusable Terraform modules for monitoring solutions.
- Automate deployment of CloudWatch alarms, SNS notifications, and Grafana dashboards.
- Ensure monitoring configurations are scalable, repeatable, and version controlled.
- Collaborate with DevOps and engineering teams to integrate observability into deployment pipelines.
Security Monitoring
- Integrate AWS Security Hub and Amazon GuardDuty into monitoring frameworks.
- Build security-focused dashboards and alerting mechanisms.
- Support operational visibility and reporting of security findings.
Alerting & Incident Management
- Define monitoring thresholds and alerting strategies.
- Configure SNS-based notifications and escalation workflows.
- Improve signal-to-noise ratio through effective alert tuning and optimization.
- Support incident investigation and root-cause analysis activities.
Documentation & Knowledge Sharing
- Create implementation guides, runbooks, and operational procedures.
- Maintain monitoring architecture and configuration documentation.
- Conduct knowledge transfer sessions for operations and support teams.
Required Skills & Experience
AWS Services
Hands-on experience with:
- Amazon EC2
- ECS / AWS Fargate
- Amazon EKS
- AWS Lambda
- Amazon RDS
- Amazon DynamoDB
- Amazon API Gateway
- Elastic Load Balancing (ALB/NLB)
- IAM
Monitoring & Observability
- Amazon CloudWatch (Metrics, Logs, Alarms, Logs Insights)
- Amazon Managed Grafana
- Amazon SNS
- Monitoring and observability best practices
- AWS X-Ray (preferred)
Infrastructure as Code
- Terraform (mandatory)
- Git
- CI/CD integration
- CloudFormation (preferred)
Security
- AWS Security Hub
- Amazon GuardDuty
- AWS Well-Architected Framework
- AWS security best practices
Preferred Qualifications
- AWS Certified Solutions Architect (Associate or Professional)
- HashiCorp Certified: Terraform Associate
- Experience supporting enterprise-scale AWS environments
- Experience managing multi-account AWS deployments
Pay: Up to $7,000.00 per month
Work Location: In person