- Design, develop, and maintain scalable data pipelines using Spark and Scala to process, transform, and analyze large volumes of data (an illustrative sketch of this kind of pipeline follows the qualifications list below).
- Collaborate with cross-functional teams to understand data requirements and implement solutions that meet business needs.
- Optimize and tune existing data pipelines for performance, reliability, and scalability.
- Ensure data quality and integrity through validation, testing, and troubleshooting of data processes.
- Stay updated with industry best practices and emerging technologies in big data and distributed computing.
- Bachelor's/Master's degree in Computer Science, Engineering, or a related field.
- Proven experience as a Data Engineer with expertise in Spark and Scala.
- Strong proficiency in building and optimizing big data pipelines and workflows.
- Hands-on experience with distributed computing frameworks (Spark, Hadoop, etc.) and related tools.
- Solid understanding of database technologies (SQL, NoSQL) and data warehousing concepts.
- Excellent problem-solving skills with a proactive attitude towards learning and adapting to new technologies.
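
As an illustration of the Spark and Scala pipeline work described above, here is a minimal, hypothetical sketch of a batch job that reads raw events, validates them, aggregates per-user daily counts, and writes the result. The paths, column names, and aggregation are assumptions made for illustration only and are not part of the role description.

```scala
// Minimal, hypothetical Spark/Scala pipeline sketch: read raw events,
// clean and aggregate them, and write the curated result back out.
// Paths and column names are illustrative assumptions.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DailyEventAggregation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-event-aggregation")
      .getOrCreate()

    // Read raw event data (hypothetical Parquet source).
    val events = spark.read.parquet("s3://example-bucket/raw/events/")

    // Basic validation: drop rows missing required fields.
    val cleaned = events
      .filter(col("user_id").isNotNull && col("event_time").isNotNull)

    // Transform: aggregate event counts per user per day.
    val dailyCounts = cleaned
      .withColumn("event_date", to_date(col("event_time")))
      .groupBy("user_id", "event_date")
      .agg(count("*").as("event_count"))

    // Write results partitioned by date (hypothetical sink).
    dailyCounts.write
      .mode("overwrite")
      .partitionBy("event_date")
      .parquet("s3://example-bucket/curated/daily_event_counts/")

    spark.stop()
  }
}
```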