Responsibilities:
- Develop tools to improve data flows between internal/external systems and the data lake/warehouse.
- Work with stakeholders to understand needs for data structure, availability, scalability, and accessibility.
- Build robust and reproducible data ingestion pipelines to collect, clean, harmonize, merge, and consolidate data sources.
- Understand existing data applications and infrastructure architecture.
- Build and support new data feeds for various data management layers and data lakes.
- Support the migration of existing data transformation jobs from Oracle and MS-SQL to Snowflake.
- Lead the migration of existing data transformation jobs from Oracle, Hive, and Impala to Spark and Python on AWS Glue.
- Develop and maintain datasets and improve data quality and efficiency.
- Lead business requirements gathering and deliver accordingly.
- Collaborate with data scientists, architects, and the team on data analytics projects.
- Collaborate with DevOps engineers to improve the system deployment and monitoring process.
Required Qualifications:
- Bachelor’s degree in Computer Science or a related field.
- 5+ years of experience in data warehousing using RDBMS and non-RDBMS databases.
- 3+ years of experience handling support and production issues.
- Professional experience working in an agile, dynamic, and customer-facing environment.
- Must understand distributed systems and cloud technologies (AWS).
- Experience with large-scale datasets and data lake/data warehouse technologies such as Amazon Redshift, Google BigQuery, and Snowflake is a plus.
- Must have experience with ETL (AWS Glue), Amazon S3, Amazon RDS, Amazon Kinesis, AWS Lambda, Apache Airflow, and AWS Step Functions.
- Experience with the full SDLC and Lean or Agile development methodologies.
- Knowledge of CI/CD and Git-based deployments.
Interested applicants, please email your resume to [email protected] (R1441955), stating the position as the subject line of the email. All applications will be handled with strict confidentiality.