Responsibilities:
1) Implement and maintain highly resilient, highly available data engineering, monitoring and analytics application clusters, and provide production support for the platform
2) Set up the server infrastructure as per design. Ensure the implementation meets both the bank’s and the industry’s security standards
3) Perform continuous improvement for the platform covering areas such as: capacity planning, observability, monitoring, reliability, and resiliency
4) Design and develop data engineering pipelines
5) Automate repetitive tasks, optimize processes and perform thorough testing to ensure quality
6) Create and maintain software documentation for the platform
7) Perform application maintenance, patching and upgrades
8) Develop simple dashboard applications
Deliverables:
1) Ensure on-time delivery of tasks and projects
2) Ensure continuous uptime of applications and services
3) Ensure no security or audit issues
Job Dimensions:
1) Comply with bank standards for tracking and following up on assigned projects
2) Cover all areas in application and infrastructure operations of the platform
Requirements:
1) University graduate (computer science or related field) with good experience working with contemporary technologies and scripting languages
2) Strong communication skills and the ability to explain protocols and processes to the team and management
3) A passion for learning and using new technologies from the open source community
4) A passion for coding
Functional / Technical Competencies:
1) 4 or more years of IT work experience
2) Good knowledge of and hands-on experience with Grafana, the Elastic Stack (Elasticsearch / Logstash / Kibana / Beats) and Kafka, including setup, configuration, upgrades, patching, data ingestion, management, monitoring and analytics features, and dashboard development
3) Good knowledge and experience in Linux Shell and Python scripting
4) Self-driven, committed, and reliable team player who is passionate about learning new technologies
5) Additional knowledge and experience in the following areas is not mandatory but would be a plus:
   a) SRE (Site Reliability Engineering) practices covering monitoring, observability, performance management, automation, and resiliency
   b) Object-oriented programming, web application development, Node.js, Spring Boot and Kafka
   c) Automation tools (e.g. Ansible, Chef, Puppet) and DevOps pipelines