Responsibilities:
- Design, develop, and deploy data tables, views, and marts in data warehouses, operational data stores, data lakes, and data virtualization platforms.
- Extract, clean, and transform data and manage data flows, including web scraping where required.
- Build, launch, and maintain efficient, reliable large-scale batch and real-time data pipelines using data processing frameworks.
- Integrate and consolidate data silos in a scalable and compliant manner.
- Collaborate with cross-functional teams, including Project Managers, Data Architects, and Business Analysts, to develop scalable data-driven products.
- Develop backend APIs and manage databases to support applications.
- Work closely with fellow developers through pair programming and code reviews.
- Lead the design and implementation of data solutions in a big data environment.
Experience and Skills Required:
- Proficiency in data cleaning and transformation tools (e.g., SQL, pandas, R).
- Expertise in building ETL pipelines using tools such as SQL Server Integration Services (SSIS), AWS Database Migration Service (DMS), Python, AWS Lambda, and ECS container tasks.
- Strong database design skills and experience with SQL and NoSQL databases (e.g., PostgreSQL, MySQL, MongoDB).
- Experience with cloud technologies (AWS, Azure, Google Cloud).
- Familiarity with big data frameworks and tools (e.g., Hadoop, Spark, Kafka).
- Knowledge of system design, data structures, and algorithms.
- Proficiency in query and scripting languages (e.g., SQL, Python).
- Leadership experience in data engineering projects.