Get to know the Role:
As the Senior Data Engineer in the Data Technology team, you will work on all aspects of data, from platform and infrastructure build-out to pipeline engineering and writing tooling and services that augment and front the core platform. You will be responsible for building and maintaining a state-of-the-art data lifecycle management platform, including acquisition, storage, processing and consumption channels. The team works closely with Data Scientists, Product Managers, Finance, Legal, Compliance and business stakeholders across SEA to understand their needs and tailor the offerings to them. As a member of the Data Tech team, you will be an early adopter of and contributor to various open source big data technologies, and you are encouraged to think outside the box and have fun exploring the latest patterns and designs in the fields of Software and Data Engineering.
The day-to-day activities:
- Build and manage the data asset using some of the most scalable and resilient open source big data technologies like Airflow, Spark, Snowflake, Kafka, Kubernetes, ElasticSearch, Superset and more on cloud infrastructure.
- Design and deliver the next-gen data lifecycle management suite of tools and frameworks, including ingestion and consumption on top of the data lake, to support real-time, API-based and serverless use cases, along with batch (mini/micro) as relevant.
- Build and expose a metadata catalog for the data lake to support easy exploration, profiling and lineage requirements.
- Enable Data Science teams to test and productionize various ML models, including propensity, risk and fraud models, to better understand, serve and protect our customers.
- Lead technical discussions across the organization through collaboration, including running RFC and architecture review sessions, tech talks on new technologies, and retrospectives.
- Apply core software engineering and design concepts in creating operational as well as strategic technical roadmaps for business problems that are vague or not fully understood.
- Obsess over security by ensuring that all components, from the platform and frameworks to the applications, are fully secure and compliant with the group’s infosec policies.
The must haves:
- At least 2 years of relevant experience in developing scalable, secure, distributed, fault-tolerant, resilient and mission-critical big data platforms.
- Able to maintain and monitor the ecosystem with 99.99% availability
- Candidates will be aligned appropriately within the organization depending on experience and depth of knowledge.
- Good fundamental, hands-on knowledge of Linux and of building a big data stack on top of AWS using Kubernetes.
- Proficiency in at least one of the programming languages Python, Scala or Java.
- Strong understanding of big data and related technologies such as Spark, Airflow and Kafka.
- Experience with NoSQL databases: key-value, document and graph.
- Able to drive DevOps best practices like CI/CD, containerization, blue-green deployments, 12-factor apps and secrets management in the data ecosystem.
- A good understanding of machine learning models and how to support them efficiently is a plus.