Responsibilities
Work across the entire data production chain by implementing data ingestion pipelines (from multiple sources and in different formats), storage, and transformation, then provisioning the results as datamarts, cubes, reports, and datasets that feed scoring models (data science) and APIs, mainly using Dataiku and Google BigQuery
Ensure that integration pipelines are designed consistently with the overall data framework, in collaboration with HQ's data tech lead and in line with best practices and the defined framework
Contribute to a continuous improvement approach by optimizing and reusing existing assets to streamline delivery
Take part in the data integration aspects of data quality controls, monitoring, alerting, and technical documentation, as well as data management (data models and mappings, data documentation, repositories, descriptions of the transformations applied, etc.)
Acquire a good understanding of business challenges, translate needs into concrete technical solutions, and gradually extend the functionality and scope of our data platform built on Google Cloud Platform and Dataiku
Provide reliable workload estimates and planning according to the level of complexity and other activities, to allow good coordination of the team's activities
Perform unit tests during development and support business users in their testing before final validation
Contribute to the design and management of the data model, as well as to architectural orientations for our data platform (repositories, APIs, etc.)
Set up monitoring of the pipelines, the data platform, and the APIs, from both a functional and a data quality point of view (a minimal sketch follows this list)
Analyze incidents, weak points, and support requests related to use of the data platform or APIs
Provide timely and accurate support to business teams when issues arise
Propose improvements to optimize the data platform (optimization of existing processes, data restructuring, refactoring, etc.)
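As an illustration of the monitoring and data quality responsibilities above, here is a minimal sketch using the google-cloud-bigquery Python client. The project, dataset, table, and column names (my-project, sales_mart.orders, customer_id, load_ts) and the 1% null-rate threshold are hypothetical assumptions, not a description of the actual platform.

```python
import logging

from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project ID

# Row count and null rate of a key column for today's load; SAFE_DIVIDE
# returns NULL instead of failing when the table slice is empty.
query = """
    SELECT
      COUNT(*) AS row_count,
      SAFE_DIVIDE(COUNTIF(customer_id IS NULL), COUNT(*)) AS null_rate
    FROM `my-project.sales_mart.orders`
    WHERE DATE(load_ts) = CURRENT_DATE()
"""
row = next(iter(client.query(query).result()))

# Alert (here simply logged) when a threshold is breached; in practice this
# would feed the platform's alerting channel. The `or` short-circuits, so
# null_rate (None on an empty slice) is only compared when rows exist.
if row.row_count == 0 or row.null_rate > 0.01:
    logging.error("Data quality check failed: rows=%s null_rate=%s",
                  row.row_count, row.null_rate)
else:
    logging.info("Data quality check passed: rows=%s", row.row_count)
```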
Skills/Requirements
• Mastery of the data stack components in Google Cloud Platform (certifications appreciated), including but not limited to: Google BigQuery (nested fields, partitioning, MERGE SQL, authorized views, row-level security), Cloud Storage, Cloud Functions, Cloud Composer, Google Firestore, Google Data Catalog, or similar services from other cloud providers (a brief BigQuery sketch follows this list)
• Proficiency with Dataiku (on Google BigQuery): development of Dataiku flows, implementation of scenarios, scheduling, versioning management, production releases, administration, etc.
• Mastery of complex SQL queries
• Good knowledge of Python is a plus
• Development practices with data exchange architectures: web services, APIs, streaming
• Development in an agile team and with CI/CD tooling (Git, Bash, CLI, Azure DevOps, Jira, Confluence)
• Knowledge of Microsoft Power BI, data catalog tools, data quality, and data management
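To give a flavor of the BigQuery skills listed above (partitioning and MERGE SQL), here is a hedged sketch of an incremental upsert into a date-partitioned table via the google-cloud-bigquery Python client; all project, dataset, table, and column names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project ID

# One-time DDL: target table partitioned by order_date.
client.query("""
    CREATE TABLE IF NOT EXISTS `my-project.sales_mart.orders`
    (order_id STRING, customer_id STRING, amount NUMERIC, order_date DATE)
    PARTITION BY order_date
""").result()

# Incremental upsert from a staging table: update matched rows, insert new ones.
client.query("""
    MERGE `my-project.sales_mart.orders` AS t
    USING `my-project.staging.orders_delta` AS s
    ON t.order_id = s.order_id
    WHEN MATCHED THEN
      UPDATE SET customer_id = s.customer_id,
                 amount = s.amount,
                 order_date = s.order_date
    WHEN NOT MATCHED THEN
      INSERT (order_id, customer_id, amount, order_date)
      VALUES (s.order_id, s.customer_id, s.amount, s.order_date)
""").result()
```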
Profile:
• Bachelor’s degree in Management, Computer Science or related field
• Minimum of 3 years' experience in IT development roles, such as data integration on Google Cloud and Dataiku or similar cloud services