YOUR ROLE:
As a Senior Data Engineer, you will work on acquiring, storing, governing, and processing large sets of structured and unstructured data. You will bring a broad view of the big data solutions landscape to choose the optimal architecture components for these purposes, along with the data engineering skills to implement enterprise data foundations such as a data lake. Finally, you will work closely with other company experts in Data Intelligence, Research, UX Design, Digital Technology and Agile teams to enable intelligence in the digital mesh of our clients and deliver big impact to them.
YOUR RESPONSIBILITIES:
- Design, implement, and maintain robust and scalable data pipelines to ingest, transform, and process structured and unstructured data from various sources.
- Build and manage data warehouses and data lakes, implementing efficient data storage and retrieval mechanisms. Design data models that support business requirements and analytics use cases.
- Leverage cloud platforms (e.g., AWS, Azure, GCP) to design and deploy scalable data infrastructure, optimizing for performance, cost-effectiveness, and reliability.
- Implement data quality checks, validation processes, and data governance frameworks to ensure data accuracy, integrity, and compliance with security standards.
- Continuously monitor and optimize data pipelines and infrastructure to improve processing speed, reduce latency, and enhance overall system performance.
- Work collaboratively with data scientists, analysts, and other stakeholders to understand their data needs, provide technical expertise, and deliver solutions that meet business objectives.
- Support business users by implementing a data presentation layer, including data visualisation with tools such as Tableau and Power BI.
- Stay up-to-date with emerging technologies and industry trends in the big data and data engineering space, evaluating and implementing innovative solutions to improve data processing capabilities.
- Support the definition and optimization of the underlying data infrastructure.
WHO YOU ARE:
- You hold a Bachelor's, Master's, or Ph.D. degree in IT, Information Management, and/or Computer Science and have at least 6 years of experience.
- Good knowledge of the big data technology landscape and concepts related to distributed storage/computing
- Experience with big data frameworks (e.g. Hadoop, Spark) and distributions (Cloudera, Hortonworks, MapR)
- Experience with batch & ETL jobs to ingest and process data from multiple data sources
- Experience with NoSQL databases (e.g. Cassandra, MongoDB, Neo4j, Elasticsearch)
- Experience with querying tools (e.g. Hive, Spark SQL, Impala)
- Experience with Power BI
- Experience with, or willingness to work in, real-time stream processing using solutions such as Kafka, AWS Kinesis, Flume, and/or Spark Streaming
- Experience with, or willingness to learn about, DevOps and DataOps principles (e.g. Infrastructure as Code, automating different parts of the data pipeline)
- High-level understanding of Data Science concepts and methodologies (how models are built, trained, and deployed)
- You are passionate about technology, and continuous learning comes naturally to you.