Your key responsibilities
- Design, build, and maintain efficient, reusable, and scalable ETL/ELT pipelines for structured and unstructured data.
- Develop and optimize data workflows to support advanced analytics, machine learning models, and reporting tools.
- Collaborate with data scientists, analysts, and business stakeholders to gather requirements and ensure data quality and availability.
- Work with cloud and on-prem data platforms (e.g., AWS, Azure, GCP, Hadoop, or on-prem SQL/NoSQL systems).
- Ensure data integrity, governance, and security best practices across pipelines and data lakes/warehouses.
- Troubleshoot and resolve data issues and performance bottlenecks in real-time and batch pipelines.
- Monitor job performance and implement automation and alerting for data operations.
- Contribute to documentation, code reviews, and development best practices.
Skills and attributes for success
- 2–6 years of experience in data engineering or related roles.
- Proficient in SQL, Python, or Scala for data processing.
- Experience with data orchestration tools like Apache Airflow, dbt, or Luigi.
- Familiarity with cloud platforms (AWS Glue, Azure Data Factory, GCP Dataflow, etc.).
- Hands-on experience with data warehouses such as Snowflake, BigQuery, or Redshift.
- Understanding of data modeling, normalization, and data warehousing concepts.
- Exposure to CI/CD, version control (Git), and agile development practices.
To qualify for the role, you must have
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Knowledge of streaming data frameworks (Kafka, Spark Streaming, Flink) is a plus.
- Experience working in consulting or client-facing environments (for Senior Associate roles).
- Certifications in AWS/Azure/GCP or specific data tools.
What we look for
- Highly motivated individuals with excellent problem-solving skills and the ability to prioritize shifting workloads in a rapidly changing industry.
- Effective communicators and confident leaders with strong people management skills and a genuine passion for making things happen in a dynamic organization.