A Databricks Engineer is responsible for migrating the existing on-premises data warehouse to Databricks and must have sound knowledge of Databricks. The role requires a good understanding of data architecture, cloud solutions, and building and maintaining a robust, integrated, and governed data infrastructure. It involves extracting valuable insights from data while ensuring data security, compliance, and high-quality data management.
Roles And Responsibilities:
• Lead the end-to-end data migration project from on-premises environments to Databricks with minimal downtime.
• Work with architects and lead solution design to meet functional and non-functional requirements.
• Apply hands-on Databricks experience to design and implement the solution on AWS.
• Configure Databricks clusters, write PySpark code, and build CI/CD pipelines for deployments.
• Apply Delta Lake optimization techniques such as Z-ordering, auto compaction, and vacuuming (see the first sketch after this list).
• Process near-real-time data through Auto Loader and Delta Live Tables (DLT) pipelines (see the second sketch after this list).
• Must have a strong background in Python and be able to identify, communicate, and mitigate risks and issues.
• Identify and resolve data-related issues and provide support to ensure data availability and integrity.
• Optimize AWS and Databricks resource usage to control costs while meeting performance and scalability requirements.
• Stay up to date with AWS and Databricks services and data engineering best practices in order to recommend and implement new technologies and techniques.
• Proactively implement engineering methodologies, standards, and leading practices.
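The following is a minimal sketch of the Delta Lake maintenance techniques referenced above (Z-ordering and vacuuming); the table name, Z-order column, and retention window are hypothetical assumptions for illustration only, not details of this role's environment.

```python
from pyspark.sql import SparkSession

# Minimal Delta Lake maintenance sketch; table name, Z-order column,
# and retention window below are illustrative assumptions.
spark = SparkSession.builder.appName("delta-maintenance").getOrCreate()

# Compact small files and co-locate rows by a frequently filtered column.
spark.sql("OPTIMIZE sales.transactions ZORDER BY (customer_id)")

# Remove data files no longer referenced by the table, retaining 7 days of
# history so time travel and concurrent readers keep working.
spark.sql("VACUUM sales.transactions RETAIN 168 HOURS")
```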
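And a minimal Auto Loader sketch for near-real-time ingestion into a Delta table; the S3 paths, checkpoint location, and target table name are hypothetical assumptions.

```python
from pyspark.sql import SparkSession

# Minimal Auto Loader ingestion sketch; paths and table name are
# illustrative assumptions.
spark = SparkSession.builder.appName("autoloader-ingest").getOrCreate()

raw = (
    spark.readStream
    .format("cloudFiles")                                  # Auto Loader source
    .option("cloudFiles.format", "json")                   # incoming file format
    .option("cloudFiles.schemaLocation", "s3://bucket/_schemas/events")
    .load("s3://bucket/landing/events/")
)

(
    raw.writeStream
    .option("checkpointLocation", "s3://bucket/_checkpoints/events")
    .trigger(availableNow=True)                            # incremental, batch-style runs
    .toTable("bronze.events")                              # write into a Delta table
)
```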
Requirements / Qualifications
• Bachelor’s or master’s degree in computer science, data engineering, or a related field.
• Minimum 5 years of experience in data engineering, with expertise in AWS or Azure services, Databricks, and/or
Informatica IDMC.
• Proficiency in programming languages such as Python, Java, or Scala for building data pipelines.
• Ability to evaluate potential technical solutions and make recommendations to resolve data issues, especially around performance assessment for complex data transformations and long-running data processes.
• Strong knowledge of SQL and NoSQL databases.
• Familiarity with data modelling and schema design.
• Excellent problem-solving and analytical skills.
• Strong communication and collaboration skills.
• Databricks and Informatica certifications are a plus.
Preferred Skills:
• Experience with big data technologies like Apache Spark and Hadoop on Databricks.
• Experience with AWS services focused on data and architecture.
• Knowledge of containerization and orchestration tools like Docker and Kubernetes.
• Familiarity with data visualization tools like Tableau or Power BI.
• Understanding of DevOps principles for managing and deploying data pipelines.
• Experience with version control systems (e.g., Git) and CI/CD pipelines.
• Knowledge of data governance and data cataloguing tools, especially Informatica IDMC.