Senior Data Engineer (Real-Time Streaming)
About Us
Asgard is a multiple award-winning local recruitment company specialising in connecting top talent with leading companies across the Tech; Banking, Financial Services and Insurance (BFSI); Oil & Gas; and Fast-Moving Consumer Goods (FMCG) sectors.
Role Overview
We are hiring on behalf of our client — a technology company building AI-powered industrial automation platforms — for a Senior Data Engineer to own the design and reliability of a real-time, high-throughput data platform processing large volumes of sensor and IoT data. This role goes beyond pipeline building; you'll be expected to make architectural decisions on streaming design, fault tolerance, and system trade-offs.
Key Responsibilities
- Design and operate real-time data pipelines for ingesting and processing high-volume sensor/IoT data
- Architect streaming solutions using Kafka (or equivalent: Oracle GoldenGate (OGG), JDBC-based CDC) for ingestion and buffering
- Implement Spark Structured Streaming or Flink for real-time processing, with proper use of watermarking and windowing for late data and aggregations
- Ensure exactly-once processing semantics across the pipeline — including idempotent/transactional producers and offset management
- Build fault-tolerant systems using checkpointing and recovery strategies
- Output processed data to Delta Lake or equivalent cloud storage (S3, ADLS), with clear rationale for storage choices
- Apply medallion architecture (bronze/silver/gold layering) for data organization and quality
- Evaluate and justify architectural trade-offs between Kappa and Lambda architecture patterns
- Optimize Spark jobs — partitioning, shuffling, debugging data skew and stragglers
- Implement structured logging, trace-based observability, and metrics-driven alerting
- Manage CI/CD pipelines across dev/staging/production environments
What We're Looking For
- 4+ years of experience in data engineering with strong exposure to real-time/streaming systems
- Deep working knowledge of Kafka — delivery semantics, offset mechanics, and handling out-of-order data
- Hands-on experience with Spark Structured Streaming or Apache Flink in production
- Strong understanding of Spark internals (logical/physical planning, query execution) — able to explain trade-offs, not just use the tool
- Practical experience with watermarking, windowing, and stateful stream processing
- Familiarity with Delta Lake or similar lakehouse storage formats
- Ability to clearly articulate Kappa vs. Lambda architecture decisions
- Understanding of medallion (bronze/silver/gold) data architecture
- Experience with structured logging and observability tooling (e.g. trace IDs, metrics-based alerting)
- Comfort with CI/CD and managing multi-environment deployments
Nice to Have
- Experience with time-series databases (e.g. InfluxDB) at scale
- On-prem to cloud data sync / edge computing exposure
- Exposure to SIEM/log correlation tools (e.g. Datadog, ELK)
What's On Offer
- Ownership of architectural decisions on a real-time data platform — not just maintenance work
- Work directly with IoT/sensor data powering live, mission-critical decisions
- Collaborative, fast-moving environment with genuine technical depth
- Competitive salary and flexible working arrangements
Pay: RM14,000.00 - RM16,000.00 per month
Application Question(s):
1. Have you personally designed or built a real-time streaming pipeline using Kafka (or a similar tool) in a production environment? (Yes / No)
2. Have you implemented exactly-once or at-least-once processing semantics in a Kafka-based pipeline before? (Yes / No)
3. Which stream processing framework have you used in production? (Spark Structured Streaming / Apache Flink / Both / Neither)
4. Have you designed a data pipeline using the Lambda or Kappa architecture pattern before? (Yes / No)
5. Do you have hands-on experience tuning Spark jobs for performance, such as fixing data skew, partitioning, or memory/core configuration? (Yes / No)
Work Location: In person