
Data Engineer (Python & PySpark)

Unison Consulting Pte Ltd

Responsibilities:

Data Architecture and Design:

  • Collaborate with cross-functional teams to understand data requirements and design efficient and scalable data architectures.
  • Develop and maintain data models, schema designs, and data flow diagrams.

ETL Development:

  • Design, develop, and optimize Extract, Transform, Load (ETL) processes using Python and PySpark.
  • Implement robust data pipelines for efficient data extraction, transformation, and loading from various sources to data warehouses.
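The extract–transform–load shape described above can be sketched in plain Python (PySpark's `spark.read` / DataFrame transformations / `df.write` follow the same three stages; the source records and field names here are purely illustrative):

```python
# Minimal ETL sketch in plain Python. In PySpark the same steps would use
# spark.read (extract), DataFrame transformations (transform), and
# df.write (load). The records and fields below are hypothetical.

def extract(rows):
    """Extract: yield raw records from a source (here, an in-memory list)."""
    yield from rows

def transform(records):
    """Transform: normalise fields and drop records with no id."""
    for rec in records:
        if rec.get("id") is None:
            continue  # cleansing step: skip unusable rows
        yield {"id": rec["id"], "name": rec.get("name", "").strip().title()}

def load(records):
    """Load: materialise into a keyed store (stand-in for a warehouse table)."""
    return {rec["id"]: rec for rec in records}

raw = [{"id": 1, "name": "  alice  "},
       {"id": None, "name": "bad row"},
       {"id": 2, "name": "BOB"}]
warehouse = load(transform(extract(raw)))
```

Keeping each stage a separate function makes the pipeline easy to test in isolation, which matters when the same transform must run against multiple sources.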

Data Processing and Transformation:

  • Leverage PySpark for large-scale data processing, ensuring high performance and reliability.
  • Implement data transformations, aggregations, and cleansing procedures to maintain data quality.
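A cleansing-then-aggregation step of this kind looks like the following in plain Python; in PySpark the equivalent would be `df.dropna().groupBy("region").agg(F.sum("amount"))` (the `region`/`amount` fields are assumptions for illustration):

```python
from collections import defaultdict

def aggregate_by_region(rows):
    """Drop rows with missing amounts, then sum amounts per region."""
    totals = defaultdict(float)
    for row in rows:
        if row.get("amount") is None:
            continue  # cleansing: ignore incomplete records
        totals[row["region"]] += row["amount"]
    return dict(totals)

sales = [{"region": "SG", "amount": 10.0},
         {"region": "SG", "amount": None},
         {"region": "MY", "amount": 5.0}]
```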

Data Integration:

  • Integrate data from various sources, including structured and unstructured data, ensuring consistency and accuracy.
  • Work closely with data scientists and analysts to understand their data needs and provide support for data integration into analytical models.

Performance Optimization:

  • Monitor and optimize data processing and ETL jobs for performance, scalability, and efficiency.
  • Troubleshoot and resolve issues related to data pipeline performance.
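Monitoring pipeline stages starts with knowing how long each one takes. Spark exposes this through its web UI and listener API; as an illustrative stand-in, a stage can be instrumented with a simple timing decorator:

```python
import time
from functools import wraps

def timed(fn):
    """Record the wall-clock duration of each call on the wrapped function."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result
    return wrapper

@timed
def transform_stage(rows):
    """A hypothetical stage: drop null rows."""
    return [r for r in rows if r is not None]
```

After a call, `transform_stage.last_elapsed` holds the stage duration, ready to feed into whatever alerting or logging the team uses.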

Data Quality and Governance:

  • Implement data quality checks and validation processes to ensure the accuracy and reliability of the data.
  • Enforce data governance policies and best practices.
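A data-quality check can be as simple as a per-record rule set; the rules below (non-null id, positive amount) are assumed examples, and in PySpark the same checks would typically be DataFrame filters counting violating rows:

```python
def validate(record):
    """Return the list of rule names the record violates (empty = clean).
    Rule names here are hypothetical examples."""
    failures = []
    if record.get("id") is None:
        failures.append("id_not_null")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount <= 0:
        failures.append("amount_positive")
    return failures
```

Returning the failed rule names, rather than a bare boolean, lets the pipeline route bad records to a quarantine table with a reason attached.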

Collaboration and Documentation:

  • Collaborate with cross-functional teams including data scientists, analysts, and business stakeholders to understand and address their data requirements.
  • Document data engineering processes, ETL workflows, and data architectures.

Technology Research and Adoption:

  • Stay abreast of the latest trends and advancements in data engineering and recommend the adoption of new technologies and tools to enhance efficiency.

Requirements:

  • Minimum of 7 years of hands-on experience in data engineering, with a focus on Python and PySpark.
  • Proven experience in designing and implementing scalable and efficient ETL processes.
  • Strong knowledge of data modeling, data warehousing concepts, and database systems.
  • Experience with big data technologies and distributed computing, with a focus on PySpark.
  • Proficiency in working with cloud platforms such as AWS, Azure, or Google Cloud.
  • Strong problem-solving and troubleshooting skills.
(This job post has expired.)
