The ideal candidate should have:
- Strong experience with PySpark, Python, Hive, Hadoop, and the broader Hadoop ecosystem, along with familiarity with data modeling, data wrangling, and data analysis.
- Experience with both SAS and Python, and the ability to work independently and as part of a team. The role involves migrating SAS code to Python as well as developing new Python applications.
Responsibilities:
- Design, develop, and maintain PySpark and Hive applications.
- Work with data scientists and analysts to build data pipelines and data models.
- Optimize PySpark and Hive queries for performance.
- Troubleshoot and debug PySpark and Hive applications.
- Migrate SAS code to Python.
- Develop new Python applications.
- Document the migrated code.
- Work with other engineers to ensure a smooth transition to Python.
Qualifications:
- Bachelor's degree in computer science, data science, or a related field
- Experience with PySpark, Python, SAS, Hive, Hadoop, and the Hadoop ecosystem
- Experience with data modeling, data wrangling, and data analysis
- Experience with cloud computing platforms (e.g., AWS, Azure, GCP)
- Strong problem-solving and analytical skills
- Excellent communication and teamwork skills
- Strong programming skills
- Experience with data migration
- Ability to work independently and as part of a team