Responsibilities:
- Design, develop, and deploy data tables, views, and marts in data warehouses, operational data stores, and data virtualization.
- Perform data extraction, cleaning, transformation, and flow. Web scraping may also be part of the data extraction work scope.
- Design, build, launch, and maintain efficient and reliable large-scale batch and real-time data pipelines with data processing frameworks
- Integrate and collate data silos in a manner that is both scalable and compliant
- Collaborate with Project Managers, Data Architects, Business Analysts, Frontend Developers, UX Designers, and Data Analysts to build scalable data-driven products
- Develop backend APIs and work on databases to support the applications
- Work in an Agile environment that practices Continuous Integration and Delivery
- Work closely with fellow developers through pair programming and code reviews
Requirements:
- Proficient in general data cleaning and transformation (e.g. SQL, pandas, R)
- Proficient in building ETL pipelines (e.g. SQL Server Integration Services (SSIS), AWS Database Migration Service (DMS), Python, AWS Lambda, ECS container tasks, EventBridge, AWS Glue, Spring)
- Proficient in database design and various databases (e.g. SQL, PostgreSQL, AWS S3, Athena, MongoDB, PostGIS, MySQL, SQLite, VoltDB, Cassandra)
- Experience in cloud technologies such as GPC, GCC (i.e. AWS, Azure, Google Cloud)
- Experience in and passion for data engineering in a big data environment using cloud platforms such as GPC, GCC (i.e. AWS, Azure, Google Cloud)
- Experience with building production-grade data pipelines and ETL/ELT data integration
- Knowledge of system design, data structure, and algorithms
- Familiar with data modeling, data access, and data storage infrastructure like Data Mart, Data Lake, and Data Warehouse.
- Familiar with REST APIs and web requests/protocols in general
- Familiar with big data frameworks and tools (e.g. Hadoop, Spark, Kafka, RabbitMQ)
- Familiar with W3C Document Object Model and customized web scraping.
- Comfortable in at least one scripting language (e.g. SQL, Python)
- Comfortable in both Windows and Linux development environments
- Interest in being the bridge between engineering and analytics.