Responsibilities
- Migrate the existing data warehouse and data model from MariaDB to an on-premises Cloudera Hadoop big data platform.
- Develop and optimize SQL, Hive SQL, and Spark scripts to ensure efficient data processing.
- Design and implement data models to support business requirements and optimize performance.
- Collaborate with cross-functional teams to understand data requirements and ensure data integrity throughout the migration process.
- Develop and execute test plans to validate data accuracy and system performance.
- Coordinate with internal stakeholders to plan and execute production deployments.
- Schedule and monitor jobs using Autosys to ensure timely execution and minimize downtime.
- Provide technical expertise and guidance to team members throughout the migration project.
- Document processes, procedures, technical specifications, and best practices to facilitate knowledge sharing and ensure scalability.
- Create and maintain unit test case documents to ensure code quality and reliability.
Skills Requirements
- Bachelor’s degree in Computer Science/Information Technology/Engineering, or a related field.
- 7-10 years of experience in a relevant field.
- 5+ years of proven experience in data engineering and big data migration projects (Hortonworks/Cloudera).
- Strong hands-on experience with Cloudera and related ecosystem components.
- Strong experience in implementing ETL (Extract, Transform, Load) processes.
- Highly proficient in SQL, Hive SQL, Spark, and data modeling.
- Strong understanding of the production deployment process.
- Experience with scheduling jobs using Autosys or similar.
- Experience with version control systems such as Git and Bitbucket.
- Ability to troubleshoot and resolve data-related issues efficiently.
- Good communication and interpersonal skills.
- Experience with Dataiku is a plus, but not mandatory.