- Design, develop, and maintain scalable data pipelines and ETL processes using Hadoop and Spark (a minimal sketch of such a job follows this list).
- Write efficient, well-optimized data processing jobs in Scala to transform and load data.
- Use SQL to query and manipulate data across a range of databases (see the Spark SQL sketch after this list).
- Collaborate with data scientists and analysts to gather requirements and deliver data solutions that meet business needs.
- Monitor and troubleshoot data processing workflows to ensure high performance and reliability.
- Implement best practices for data quality, governance, and security.
- Document data engineering processes, workflows, and architecture for team knowledge sharing.
- Stay updated on industry trends and emerging technologies to continuously improve our data strategies.
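For illustration, here is a minimal sketch of the kind of batch ETL job described above, written against Spark's Scala API. Every concrete name in it (the application name, HDFS paths, the events dataset, column names) is a placeholder for this example, not a reference to any real system:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Hypothetical batch ETL job: read raw events from HDFS, clean and
// aggregate them, and write the result back as partitioned Parquet.
object DailyEventEtl {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-event-etl")
      .getOrCreate()

    // Extract: raw JSON events landed on HDFS (path is an assumption).
    val raw = spark.read.json("hdfs:///data/raw/events/2024-01-01")

    // Transform: drop malformed rows, then aggregate events per user per day.
    val daily = raw
      .filter(col("user_id").isNotNull && col("event_ts").isNotNull)
      .withColumn("event_date", to_date(col("event_ts")))
      .groupBy(col("user_id"), col("event_date"))
      .agg(count("*").as("event_count"))

    // Load: write partitioned output for downstream consumers.
    daily.write
      .mode("overwrite")
      .partitionBy("event_date")
      .parquet("hdfs:///data/warehouse/daily_events")

    spark.stop()
  }
}
```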
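The SQL side of the role often runs through Spark SQL as well. This fragment continues the previous sketch, reusing its `spark` session and `daily` DataFrame, and shows a window function of the kind the role calls for; the table and column names are the same hypotheticals:

```scala
// Continuing from the previous sketch: expose the aggregated frame to SQL
// and rank each user's busiest days with a window function.
daily.createOrReplaceTempView("daily_events")

val topDays = spark.sql("""
  SELECT user_id,
         event_date,
         event_count,
         RANK() OVER (PARTITION BY user_id ORDER BY event_count DESC) AS day_rank
  FROM daily_events
""")

// Keep each user's three busiest days.
topDays.filter(col("day_rank") <= 3).show()
```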
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 3+ years of experience as a Data Engineer or in a similar role.
- Strong proficiency in Hadoop ecosystem components (e.g., HDFS, Hive, Pig).
- Hands-on experience with Apache Spark for big data processing.
- Proficiency in Scala programming for data transformation tasks.
- Advanced SQL skills for querying and managing large datasets.
- Experience with data modeling, data warehousing, and ETL processes (a warehouse-loading sketch follows this list).
- Familiarity with cloud platforms (AWS, GCP, or Azure) is a plus.
- Strong analytical and problem-solving skills.
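As a final illustration of the data modeling and warehousing work, here is a small sketch of loading a date-partitioned fact table so downstream Hive and Spark SQL queries can prune partitions. It assumes a Hive metastore is available, and the database, table, and path names are placeholders:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical warehouse-loading step: persist a fact table partitioned
// by date for efficient downstream querying.
object LoadFactSales {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("load-fact-sales")
      .enableHiveSupport() // assumes a Hive metastore on the classpath
      .getOrCreate()

    // Staged, already-cleaned sales records (hypothetical path).
    val sales = spark.read.parquet("hdfs:///data/staging/sales")

    // Append into a partitioned managed table (hypothetical database.table).
    sales.write
      .mode("append")
      .partitionBy("sale_date")
      .format("parquet")
      .saveAsTable("warehouse.fact_sales")

    spark.stop()
  }
}
```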