The Bank’s IaaS (Infrastructure as a Service) team, within the Infrastructure Services division, is responsible for engineering, building, deploying and running cutting-edge solutions in a modern, multi-cloud environment. You’ll have the opportunity to contribute to this high-impact environment, working on innovative projects that drive our business forward. Working in Singapore’s leading financial institution, IaaS SRE engineers manage a fast-paced server environment (Linux, Windows) as well as the underlying physical, virtual and cloud platforms hosted both within OCBC premises and outside them, i.e. in the public cloud.
We are seeking a multi-skilled and motivated Windows and Virtualization/Cloud Platform Engineer to join the IaaS team. In this role, you will be responsible for managing and maintaining our Windows-based infrastructure as well as our virtualization/cloud platforms in a multi-cloud environment, ensuring optimal performance, security and availability. You’ll play a key role in engineering and supporting our critical systems, collaborating with cross-functional teams, and driving continuous improvement in our infrastructure services.
Your primary responsibility will span the entire spectrum of designing, building, deploying and managing the Windows Server environment and the underlying virtualization/cloud platform. Together with other team members, you will be responsible for setting up the platform and infrastructure during project phases, building and deploying technology-specific solutions, day-to-day maintenance of physical servers and VMs, configuring servers and platforms, automating processes, implementing and enhancing SRE practices to improve environment stability, and responding to production incidents for both on-premises and cloud-based Windows server fleets and the underlying virtualization/cloud platform.
Roles & Responsibilities
- Engineer, manage and support Windows Server operating system environments (on-premises and off-premises), including installation, configuration and maintenance.
- Manage, support and optimize virtualization/cloud platforms (e.g., VMware, Azure, AWS) to ensure high availability and performance.
- Perform regular system monitoring, tuning, and capacity planning to maintain system health.
- Review system alerts, incidents and repeat issues, then identify and deploy solutions as part of problem management.
- Embrace, advocate and implement SRE practices by enhancing SLIs, SLOs, release processes and redundant design methodologies. Participate actively in enterprise-wide task forces on SRE topics as required.
- Troubleshoot and resolve complex technical issues related to Windows and virtualization environments.
- Implement and manage disaster recovery strategies and backup solutions.
- Collaborate with network, storage and security teams to ensure seamless integration of components and software, and compliance with industry standards.
- Participate in the development of automation scripts and tools to improve efficiency.
- Participate in the change and on-call production escalation rota as required (including timings in application green zones and weekends).
- Liaise with vendors in the data centre (physically in Singapore and remotely for other sites) for new hardware installations, amendments or decommissioning activities.
- Enhance server/platform provisioning and decommissioning processes using modern IaC (Infrastructure as Code).
- Participate in infrastructure audits and risk remediation task forces as required.
- Document and maintain technical procedures, system configurations, and change management records.
Qualifications - External
Required Skills
- 10+ years of technical experience in a relevant field is required.
- Must have proven troubleshooting experience and a strong understanding of the areas listed below.
Must Have Skills -
- Windows Server administration and configuration: Windows Server 2012, 2016, 2019 and 2022.
- Hands-on experience in VMware ESXi configuration, upgrades and troubleshooting.
- Experience with VMware virtualization technologies including vSphere, the vCenter management suite, VMware Aria and VMware Cloud Platform.
- Experience in managing and running multi-cloud platforms, including VMware, Azure and AWS based IaaS solutions.
- Strong knowledge of Active Directory, DNS, DHCP, and other Windows services.
- Experience with PowerShell, Ansible and Terraform is a must for the role.
- Solid understanding of storage technology as well as network protocols, security principles, and firewall configurations.
- In-depth knowledge of and hands-on experience in Microsoft Failover Clustering, Windows internals, performance analysis and troubleshooting, and patch management.
Additional Preferred Skills -
- Experience in managing software-defined networking (SDN) and software-defined storage (SDS) will be valuable.
- Automation experience in other scripting languages and a sound understanding of API-based programming.
- Familiarity with DevOps and SRE principles and their implementation in a banking environment will be helpful.
- Knowledge of HPE/Dell blade and rackmount systems and their configuration will be important.
- Experience in engineering, hardening and testing Windows Server images.
Preferred Certifications
- Microsoft Certified: Windows Server Administrator or higher.
- VMware Certified Professional (VCP) or similar for alternate Hypervisors.
- Microsoft Certified: Azure Administrator (a plus).
- AWS Certified Cloud Practitioner (a plus).
Soft Skills & Abilities
- Excellent problem-solving skills and the ability to work in a fast-paced environment.
- Strong communication skills and the ability to collaborate effectively with team members and stakeholders.
- Fast learner.