Roles & Responsibilities:
• Design, build, and operationalize large-scale enterprise data solutions and applications using one or more AWS data and analytics services in combination with third-party tools, such as Spark/Python on Glue, DMS, S3, Athena, RDS-PostgreSQL, Airflow, Lambda, CodeCommit, CodePipeline, and CodeBuild.
• Design and build production ETL data pipelines from ingestion to consumption within a big data architecture, using DMS, DataSync, and Glue (a minimal Glue PySpark sketch appears after this list).
• Understand the existing applications (including the on-premises Cloudera data lake) and infrastructure architecture.
• Analyze, re-architect, and re-platform on-premises data warehouses into data platforms on the AWS cloud using AWS-native or third-party services.
• Design and implement data engineering, ingestion, and curation functions on the AWS cloud using AWS-native services or custom programming.
• Perform detailed assessments of current-state data platforms and create an appropriate transition path to the AWS cloud.
• Collaborate with development, infrastructure, and data center teams to define Continuous Integration and Continuous Delivery processes in accordance with industry standards.
• Work on the hybrid data lake spanning the on-premises Cloudera platform and the AWS cloud.
• Work closely with multiple stakeholders to ensure high standards are maintained.
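As a flavour of the hands-on work, here is a minimal sketch of a Glue PySpark ETL job of the kind described above; the database, table, column, and bucket names (raw_zone, policies, policy_id, s3://curated-zone/...) are illustrative assumptions, not part of this posting.

import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Standard Glue job boilerplate: resolve arguments and initialize contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Ingest: read the raw table registered in the Glue Data Catalog
# (hypothetical database and table names).
source = glue_context.create_dynamic_frame.from_catalog(
    database="raw_zone", table_name="policies"
)

# Curate: drop records missing the key column and normalize a column name.
curated = (
    source.toDF()
    .dropna(subset=["policy_id"])
    .withColumnRenamed("eff_dt", "effective_date")
)

# Consume: publish curated Parquet to S3, queryable via Athena.
curated.write.mode("overwrite").parquet("s3://curated-zone/policies/")

job.commit()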
Mandatory Skill-set:
• Bachelor's Degree in Computer Science, Information Technology, or other relevant fields
• 5+ years of work experience with ETL, data modelling, and data architecture for building data lakes; proficient in optimizing, designing, coding, and tuning big data ETL processes using PySpark.
• 3+ years of extensive experience working on the AWS platform using core services such as Athena, Glue (PySpark), RDS-PostgreSQL, S3, and Airflow for orchestration (see the DAG sketch after this list)
• Fundamentals of the insurance domain
• Functional knowledge of IFRS 17
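For the orchestration piece, a minimal Airflow DAG that triggers the Glue job sketched above might look like the following; it assumes a recent Airflow 2.x with the Amazon provider package installed, and the DAG ID, Glue job name, and region are illustrative.

from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

# Daily schedule; catchup disabled so no backfill runs are created.
with DAG(
    dag_id="policies_curation",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Trigger an existing Glue job and wait for it to finish
    # (job name and region are hypothetical).
    run_curation = GlueJobOperator(
        task_id="run_policies_curation",
        job_name="curate_policies",
        region_name="ap-southeast-1",
        wait_for_completion=True,
    )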