As a Data Engineer with expertise in Databricks, you will be responsible for designing, building, and maintaining scalable data pipelines and infrastructure in the cloud. You will play a key role in enabling data analytics, machine learning, and data-driven decision-making processes by leveraging the full potential of Databricks and its integration with big data technologies.
Key Responsibilities:
- Data Pipeline Development: Design, develop, and optimize robust ETL pipelines using Databricks and Apache Spark for processing large-scale datasets. Ensure data pipelines are scalable, efficient, and capable of handling both batch and streaming data
- Databricks Platform Expertise: Leverage the Databricks platform to perform data transformations, data integration, and advanced analytics. Use Databricks notebooks for building reusable data pipelines, automating workflows, and ensuring data quality
- Cloud Data Engineering: Build, deploy, and manage data infrastructure on cloud platforms (e.g., Azure, AWS, GCP) using Databricks and other cloud-native tools. Work with Delta Lake and data lakehouse architectures to enable reliable and real-time data analytics
- Data Modeling & Optimization: Develop and maintain data models that optimize query performance and support analytics use cases. Implement data partitioning, indexing, and caching strategies in Databricks to improve performance and reduce costs
- Collaboration with Data Teams: Collaborate with Data Scientists, Data Analysts, and other stakeholders to understand business requirements and translate them into scalable data solutions. Ensure data is easily accessible for analytics and reporting
- Big Data Processing: Implement and maintain big data pipelines using Apache Spark on Databricks, focusing on high-performance processing of large datasets. This includes batch processing, real-time streaming, and machine learning workflows
- Automation & Orchestration: Build automated data workflows using tools like Airflow or cloud-native orchestrators. Ensure workflows are resilient, monitor their performance, and handle failures gracefully
- Data Governance & Security: Ensure all data solutions adhere to organizational data governance and security standards. Implement role-based access controls, data masking, and encryption in Databricks to maintain data privacy and security
- Performance Tuning: Continuously monitor and optimize data pipeline performance. Utilize Databricks features such as auto-scaling, caching, and cluster management to ensure optimal cost-efficiency and performance
- Documentation & Best Practices: Develop and maintain comprehensive documentation for data pipelines, models, and architecture. Promote best practices in data engineering and drive continuous improvements in processes and technology
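As a rough illustration of the kind of batch pipeline the responsibilities above describe, a minimal PySpark/Delta Lake sketch might look like the following (the paths, schema, and quality rule are illustrative assumptions, not requirements of the role):

```python
def valid_amount(amount):
    """Plain-Python data-quality rule: keep only non-null, positive order amounts."""
    return amount is not None and amount > 0

def run_orders_etl(spark, source_path, target_path):
    """Read raw JSON orders, apply basic cleansing, and write a partitioned Delta table.

    `spark` is an existing SparkSession (Databricks notebooks provide one as `spark`).
    The paths are hypothetical mount points, not real locations.
    """
    # Imported inside the function so the pure helper above stays usable without Spark installed.
    from pyspark.sql import functions as F

    raw = spark.read.json(source_path)
    clean = (
        raw
        .filter(F.col("amount") > 0)                      # same rule as valid_amount, pushed down into Spark
        .withColumn("order_date", F.to_date("order_ts"))  # derive the partition column from the event timestamp
    )
    (
        clean.write
        .format("delta")
        .mode("overwrite")
        .partitionBy("order_date")  # date partitioning enables partition pruning for date-bounded queries
        .save(target_path)
    )
```

On Databricks this would typically be invoked from a notebook cell or a Databricks Workflows task, e.g. `run_orders_etl(spark, "/mnt/raw/orders/", "/mnt/curated/orders/")`.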
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
- Proven experience as a Data Engineer, with a focus on building data pipelines and solutions using Databricks
- Strong knowledge of Apache Spark and experience in optimizing Spark jobs
- Proficiency with cloud platforms such as Azure, AWS, or Google Cloud, and familiarity with Delta Lake or data lakehouse architectures
- Experience with data modeling, performance tuning, and SQL
- Hands-on experience with Python or Scala for data processing and workflow automation
- Familiarity with data orchestration tools such as Airflow, Databricks Workflows, or similar
- Knowledge of data governance, security, and compliance best practices in the cloud
- Experience with CI/CD pipelines and DevOps for data engineering
Only shortlisted candidates will be contacted by the KPMG Talent Acquisition team. Personal data collected will be used for recruitment purposes only.
At KPMG in Singapore we are committed to creating a diverse and inclusive workplace. We believe that diversity of thought, background and experience strengthens relationships and delivers meaningful benefits to our people, our clients and communities. As an equal opportunity employer, all qualified applicants will receive consideration for employment regardless of age, race, gender identity or expression, colour, marital status, religion, sexual orientation, disability, or other non-merit factors. We celebrate the different talents that our people bring and support every staff member in their journey to achieve personal and professional growth. One of the ways we do this is through Take Charge: Flexi-work, our flexible working framework which enables agile and innovative teams to help deliver our business goals.