Must-Have Skills:
- Programming: strong Python, plus hands-on Apache Spark.
- Experience with Linux utilities & SQL.
- Experience using PySpark for data transformation.
- Working knowledge of AWS services (Redshift, Glue, CloudFormation, EC2, S3, Lambda).
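As an illustration of the SQL-plus-Python skills listed above, here is a minimal sketch of a curation query using Python's built-in sqlite3 module; the in-memory database stands in for a warehouse such as Redshift, and the table and column names are hypothetical:

```python
import sqlite3

# In-memory SQLite database stands in for a real warehouse (e.g., Redshift).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "east", 10.0), (2, "west", 20.0), (3, "east", 5.0)],
)

# A typical curation step: aggregate raw order rows by region.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 15.0), ('west', 20.0)]
```

In a production pipeline the same aggregation would typically be expressed as a PySpark DataFrame operation or a Redshift query rather than SQLite.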
Good-to-Have Skills:
- ETL tool experience.
- Broader exposure to the AWS ecosystem beyond the services listed above.
Key Responsibilities:
- Develop ETL pipelines using open-source tools to support data ingestion.
- Write programs to extract data from the data lake and curated data layer.
- Collaborate with cross-functional teams on ETL pipeline design.
- Gather and translate business requirements into scalable, operable solutions.
- Participate in the full development lifecycle, including documentation, delivery, and support.
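The responsibilities above describe a classic extract-transform-load loop. A minimal sketch in plain Python, with the record layout and in-memory source standing in for real data-lake objects (all names here are hypothetical):

```python
import csv
import io

def extract(source: str) -> list[dict]:
    """Extract: parse raw CSV records (an in-memory string stands in
    for a data-lake object such as a file on S3)."""
    return list(csv.DictReader(io.StringIO(source)))

def transform(records: list[dict]) -> list[dict]:
    """Transform: normalise fields and drop incomplete rows."""
    out = []
    for r in records:
        if r["user_id"] and r["amount"]:
            out.append({"user_id": r["user_id"].strip(),
                        "amount": round(float(r["amount"]), 2)})
    return out

def load(records: list[dict]) -> str:
    """Load: serialise curated records back to CSV (a real pipeline
    would write to a warehouse or curated data layer)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["user_id", "amount"])
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

raw = "user_id,amount\n u1 ,10.5\n,3.0\nu2,20.5\n"
curated = load(transform(extract(raw)))
print(curated)
```

At scale the same three stages map onto the stack in the must-have list: PySpark for the transform step, with Glue or Lambda orchestrating extraction from S3 and loading into Redshift.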