Job Description
Core Mandatory Skills
- Strong Python programming (modular design, error handling, logging)
- Advanced SQL (joins, window functions, optimization)
- Hands-on experience with Pandas and Kafka for data processing
- Experience with orchestration tools: Apache Airflow, Prefect (or equivalent)
- Experience with package and dependency management (pip, virtual environments)
Key Competencies Expected
- Pipeline Development & Orchestration
- Design, build, and maintain data pipelines using Python, SQL, and orchestration tools
- Develop and manage Directed Acyclic Graphs (DAGs) / flows using orchestration tools such as Apache Airflow and Prefect
- Ensure pipelines are idempotent, scalable, and fault-tolerant
- Implement logging, monitoring, and alerting for pipeline observability
- Package & Dependency Management
- Install, upgrade, and manage Python packages in controlled environments
- Maintain e.g. ************* / dependency manifests with version pinning
- Resolve dependency conflicts and ensure compatibility across environments (dev, UAT, prod)
- Support deployments in restricted or air-gapped environments where required
- Security Remediation & Library Fixes
- Analyse vulnerability reports from security scanning tools (e.g., CVE findings)
- Upgrade or replace vulnerable libraries while maintaining pipeline stability
- Fix broken imports, deprecated APIs, and compatibility issues arising from library updates
- Collaborate with security teams to ensure compliance with organisational standards
- Code Refactoring & Optimization
- Refactor legacy code across:
- Data ingestion APIs
- Data transformation (Pandas/SQL)
- Model training and inference pipelines
- Orchestration workflows
- Improve code modularity, readability, and performance
- Ensure backward compatibility and minimal disruption to production systems
- Data Processing & Integration
- Perform data transformation and validation using Pandas and SQL
- Integrate streaming data pipelines using Kafka (producers/consumers)
- Ensure schema consistency and data quality across pipeline stages
- Testing, Deployment & Support
- Implement unit and integration tests for pipelines
- Support deployment workflows for data pipelines
- Troubleshoot pipeline failures and perform root cause analysis
- Provide production support and continuous improvement of data workflows
- Streaming and Integration Skills
- Working knowledge of Kafka (topics, partitions, consumers, producers)
- Experience handling schema evolution and message serialization/deserialization
- Platform Awareness Skills
- Working knowledge of Kafka (topics, partitions, consumers, producers)
Pay: $9,000.00 - $9,700.00 per month
Work Location: In person