Key Responsibilities
- Implement data transformation, aggregation, and enrichment processes to support various data analytics and machine learning initiatives.
- Collaborate with cross-functional teams to understand data requirements and translate them into effective data engineering solutions.
- Ensure data quality and integrity throughout the data processing lifecycle.
- Design and deploy data engineering solutions on OpenShift Container Platform (OCP) using containerization and orchestration techniques.
- Optimize data engineering workflows for containerized deployment and efficient resource utilization.
- Collaborate with DevOps teams to streamline deployment processes, implement CI/CD pipelines, and ensure platform stability.
- Implement data governance practices, data lineage, and metadata management to ensure data accuracy, traceability, and compliance.
- Monitor and optimize data pipeline performance, troubleshoot issues, and implement necessary enhancements.
- Implement monitoring and logging mechanisms to ensure the health, availability, and performance of the data infrastructure.
- Document data engineering processes, workflows, and infrastructure configurations for knowledge sharing and reference.
- Stay updated with emerging technologies, industry trends, and best practices in data engineering and DevOps.
- Provide technical leadership, mentorship, and guidance to junior team members, fostering a culture of continuous learning and innovation and contributing to the continuous improvement of analytics capabilities within the bank.
Key Requirements
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- Proven experience as a Data Engineer, working with Hadoop, Spark, and data processing technologies in large-scale environments.
- Strong expertise in designing and developing data infrastructure using Hadoop, Spark, and related tools (HDFS, Hive, Pig, etc.).
- Experience with containerization platforms such as OpenShift Container Platform (OCP) and container orchestration using Kubernetes.
- Proficiency in languages and frameworks commonly used in data engineering, such as Spark, Python, Scala, or Java.
- Knowledge of DevOps practices, CI/CD pipelines, and infrastructure automation tools (e.g., Docker, Jenkins, Ansible, Bitbucket).
- Experience with Grafana, Prometheus, or Splunk is an added benefit.
- Strong problem-solving and troubleshooting skills with a proactive approach to resolving technical challenges.
- Excellent collaboration and communication skills to work effectively with cross-functional teams.
- Ability to manage multiple priorities, meet deadlines, and deliver high-quality results in a fast-paced environment.
- Experience with cloud platforms (e.g., AWS, Azure, GCP) and their data services is a plus.