Job Role
• Design, develop and deploy data tables, views and marts in data warehouses, operational data stores, data lakes and data virtualisation layers
• Perform data extraction, cleaning, transformation and flow; web scraping may also be part of the data extraction scope
• Design, build, launch and maintain efficient and reliable large-scale batch and real-time data pipelines with data processing frameworks
• Integrate and collate data silos in a manner which is both scalable and compliant
• Collaborate with Project Managers, Data Architects, Business Analysts, Frontend Developers, UX Designers and Data Analysts to build scalable data-driven products
• Be responsible for developing backend APIs and working on databases to support the applications
• Work in an Agile environment that practices Continuous Integration and Delivery
• Work closely with fellow developers through pair programming and code reviews
Experience & Skillset
• Experience in general data cleaning and transformation (e.g. SQL, pandas, R)
• Experience in building ETL pipelines (e.g. SQL Server Integration Services (SSIS), AWS Database Migration Service (DMS), Python, AWS Lambda, ECS container tasks, Amazon EventBridge, AWS Glue, Spring)
• Experience in database design and a range of databases and storage engines (e.g. SQL Server, PostgreSQL/PostGIS, MySQL, SQLite, VoltDB, Cassandra, MongoDB, AWS S3, Athena)
• Experience in cloud technologies such as GPC and GCC (i.e. AWS, Azure, Google Cloud)
• Experience in utilizing AWS tools and services such as Glue, DMS and Redshift
• Experience in and passion for data engineering in big data environments using cloud platforms such as GPC and GCC (i.e. AWS, Azure, Google Cloud)
• Experience in building production-grade data pipelines and ETL/ELT data integration
• Knowledge of system design, data structures and algorithms
• Familiar with data modelling, data access, and data storage infrastructure such as data marts, data lakes, data virtualisation and data warehouses
• Familiar with REST APIs and web requests/protocols in general
• Familiar with big data frameworks and tools (e.g. Hadoop, Spark, Kafka, RabbitMQ)
• Comfortable in at least one scripting or query language (e.g. SQL, Python)
• Comfortable in both Windows and Linux development environments
• Interest in being the bridge between engineering and analytics