Working Location
Kuala Lumpur, WP Kuala Lumpur, Malaysia
Job Description
Responsibilities
1. Data Pipeline Development & Integration
Design, build, and maintain scalable, reusable ETL/ELT pipelines for batch and real-time data processing
Integrate data from multiple internal and external sources (databases, APIs, streaming platforms, third-party systems)
Ensure efficient data ingestion, transformation, and loading processes
2. Data Modeling & Transformation
Develop and maintain logical and physical data models (e.g., dimensional models, star/snowflake schemas)
Transform raw data into structured, analytics-ready datasets
Optimize data structures for performance, scalability, and usability
3. Data Quality & Reliability
Implement data validation, cleansing, and enrichment processes
Establish data quality frameworks, including rules, checks, and monitoring
Troubleshoot and resolve data inconsistencies, anomalies, and failures
Ensure high availability and reliability of data pipelines
4. Data Platform Utilization & Optimization
Work with cloud data platforms (e.g., Azure, AWS, GCP) and data services (e.g., Synapse, BigQuery, Redshift)
Optimize storage, query performance, and compute efficiency
Monitor pipeline performance and continuously improve throughput and latency
5. Collaboration & Data Enablement
Collaborate with data analysts, data scientists, architects, and business stakeholders to understand data requirements
Provide clean, well-structured datasets for reporting, dashboards, and machine learning use cases
Support self-service analytics by enabling easy data access and documentation
6. Data Governance & Security
Implement and adhere to data governance policies, standards, and best practices
Ensure compliance with data privacy, security, and regulatory requirements
Manage data access controls, lineage, and auditability
7. Automation & DevOps Practices
Automate data workflows using orchestration tools (e.g., Airflow, Azure Data Factory)
Implement CI/CD pipelines for data engineering processes
Apply version control and infrastructure-as-code practices
Promote reusable components and standardized frameworks
8. Monitoring, Logging & Observability
Set up monitoring, alerting, and logging frameworks for pipelines and datasets
Track data lineage, pipeline health, and system performance
Proactively identify and resolve production issues
9. Documentation & Knowledge Sharing
Maintain clear documentation for data pipelines, data models, and architecture
Define and maintain data definitions, metadata, and data catalogs
Share knowledge and best practices across teams
10. Continuous Improvement & Innovation
Stay current with emerging data engineering tools, technologies, and patterns
Recommend and implement improvements in architecture, performance, and cost efficiency
Contribute to evolving enterprise data strategy and standards
Requirements
Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related field
Experience with real-time data streaming and processing
Familiarity with containerization tools such as Docker and Kubernetes
Knowledge of data modeling techniques (dimensional, star schema)
Experience with CI/CD pipelines for data workflows
Understanding of machine learning workflows and data preparation
Strong programming skills in Python, Java, or Scala
Experience with SQL and NoSQL databases
Hands-on experience with data pipeline tools (e.g., Apache Airflow, Talend, Informatica)
Familiarity with big data technologies (e.g., Hadoop, Spark, Kafka)
Experience with cloud platforms such as AWS, Azure, or Google Cloud
Understanding of data warehousing concepts and platforms (e.g., Snowflake, Redshift, BigQuery)
Knowledge of version control systems (e.g., Git)