About the team & the platform you’ll own
Data Engineering is part of the Data Team. The team builds, maintains, and continuously improves the company’s data infrastructure, covering the Data Lake, CDC (Change Data Capture), compute clusters, and our workflow orchestrator (Apache Airflow), along with other analytics workloads that support decision-making across the business.
Today, we ingest and process tens of terabytes of data, with an aggregate workload of 1,000+ runtime hours per day across our pipelines and jobs. This role is for someone who gets excited about operating data systems at scale with reliability, performance, cost efficiency, and clean engineering practices.
About the role
We’re looking for a Senior Data Engineer to design and build robust, scalable data systems with a reliability-first mindset and strong software engineering fundamentals. You’ll lead architecture and technical decisions for pipelines (batch + streaming where applicable), mentor other engineers, and raise engineering standards across the team.
This is a hands-on role: you’ll ship, operate, and improve production systems, not just design them.
What you’ll do
1. Build and operate scalable data systems
- Design, build, and maintain scalable data pipelines (batch and streaming) that are reliable, observable, and cost-effective.
- Own end-to-end pipeline architecture: ingestion → processing → storage → serving, including data modeling and performance considerations.
- Improve and extend our core infrastructure: data lake, CDC pipelines, compute cluster workloads, and Airflow orchestration.
- Work deeply with distributed processing and data lake concepts, including performance tuning and stability at scale.
2. Engineering excellence & production readiness
- Develop in Python or at least one other language used in big-data systems (e.g., Scala or Go), writing clean, modular, testable code.
- Apply strong software engineering practices: design patterns, trade-offs, DRY principles, dependency management, code reviews, and CI/CD.
- Raise the bar on documentation: architecture diagrams, data contracts, operational playbooks, runbooks, and decision records.
3. Reliability, observability, and incident ownership
- Define and operate system observability:
  - establish metrics/dashboards (latency, throughput, failure rate, resource usage, SLA/SLO adherence)
  - implement alerting + runbooks
- Lead root-cause analysis for complex incidents and recurring failures; implement permanent fixes (not just patches).
- Partner cross-functionally with analytics, product, platform, and DevOps teams to align data solutions with business needs.
4. Leadership & mentoring
- Mentor and level up other engineers through pairing, reviews, technical guidance, and best-practice evangelism.
- Lead technical discussions, drive alignment, and make pragmatic decisions with clear trade-offs.
The “extra mile” mindset we value
We value engineers who don’t stop at “it works.” You’ll thrive here if you naturally:
- Stay with hard problems until the real root cause is found (not just symptoms).
- Use a “detective” approach: form hypotheses, validate with evidence, and iterate quickly.
- Go beyond your immediate area to unblock solutions, including:
  - reading internal tooling or framework code when needed (and occasionally digging into upstream open-source code to understand behavior)
  - collaborating across teams to trace system boundaries and ownership
  - building reproducible test cases, simulations, or load tests to validate fixes and performance changes
  - creating small tools/scripts to diagnose production issues or prevent regressions