The Systems Engineer (Network) reports to the VP, Technology (Infrastructure).
The Senior Systems Engineer (Network) will be responsible for the planning, project management, deployment, administration and operations support of the network, ensuring that systems, services and applications perform optimally to meet customer, user and company expectations.
This technical role covers the following work areas:
(a) Technical design, administration, implementation, troubleshooting and management of Network and Data Centre infrastructure
(b) 24x7 Operations Support including scheduled on-call duties
(c) Vendor management, project management for network-related projects, and support for other Technology projects
(d) Documentation including RFQ/RFP technical specifications/diagrams, operations manuals/guidelines
(e) Backup role for Data Centre Facilities Management services, including audit support and/or additional tasks assigned by the supervisor.
The primary work area is network; the secondary work area is Data Centre Facilities Management.
Planning
- Key Technology roadmap and planning
- Proactive Capacity Management & Planning
o Perform growth projections on network infrastructure resources.
o Proactively monitor capacity and usage, and plan for the required upgrade(s)
o Plan for and perform capacity upgrades by scaling up resources or scaling out with additional components
- Network Security hardening
- Proactive Availability Management and Planning
o Manage and maintain the network infrastructure and the data centres to meet business availability requirements
o Perform availability review on network infrastructure resources
o Identify and recommend the improvements needed to meet business availability requirements
o Implement approved availability enhancements
- Plan and consolidate all change activities that require downtime where possible, in order to reduce the disruption to production service availability
- Plan timely hardware/software/firmware updates, refreshes or replacements to ensure they remain on vendor/manufacturer-supported versions
- Support systems security hardening reviews
- High Availability (HA)/Failover and Backup/Recovery configuration and setup to meet the required Recovery Time Objective (RTO)/Recovery Point Objective (RPO)
- DRP/BCP (Disaster Recovery Plan and Business Continuity Plan) procedure build, update, verification and execution
System and Service Administration
- Monitor the health status of multiple environments 24x7
- Perform access control management
- Identify and document dependencies and interfaces between network, systems, applications and databases so that the associated Service Level Agreements (SLAs) for the applications can be met
- Support periodic audit review exercises by providing the required information and performing the remediation actions needed to close all audit findings
- Track and update inventory, assets and licenses to ensure compliance
- Maintain technical documentation for configurations/setups
- Ensure data integrity, configuration consistency and data/application synchronisation across DEV/UAT/Production/DR or other established environments
- Ensure the various system interfaces, system flows and data exchanges between components and servers are functioning
- Ensure data replication (application/database or other products used in the company) is in sync across the UAT/Production/DR databases
- Prepare and set up test environments for applications (setup/patching/upgrade/enhancements or new projects)
- Ensure proper backup and recovery of systems (based on the backup/recovery schedule)
- Ensure proper logging and archival, and complete a successful tape restoration test at least once every 6 months.
- Perform scheduled periodic BCP/DRP exercises (switch over and switch back and testing)
Operations Support
- Perform daily operational tasks, health checks and provisioning as required.
- Provide 24x7 on-call standby duties (average 1 or 2 weeks per month)
- Perform after-hours (on-call) emergency work as scheduled or required.
- Investigate issues within 30 minutes of an alert/activation for incidents
o Observe daily behaviour trends and take proactive action to remediate issues and incidents
o Perform proactive incident prevention and remediation
- Incident Management - effective and timely participation, management and resolution
- Problem Management
o Participate effectively and in a timely manner to resolve root causes and provide recommendation(s) to solve technical problems
o Perform the code change/enhancement deployment and release tasks required in UAT environments
o Perform post-implementation review of release tasks and update the completion status in the ITSM tool
- Change Management
o Comply with the Change Release Procedure for Production, DR and UAT environments, as stated in the company IT Change Management System Guideline
o Perform configuration changes (Production/DR environments)
- Configuration Management System
o Configuration item updates
- Perform periodic/scheduled maintenance work
- Perform/manage feasibility studies and recommendations for hardware (firmware)/software version upgrades, patch fixes and security updates (inclusive of testing, change management, rollout planning and implementation)
- Operations Enhancement and Efficiency
o Proactively review current operational tasks, processes and procedures (inclusive of BCP and DR). Identify areas for improvement and streamline mundane tasks through simplification and automation to improve operational efficiency.
o Review and propose new or better ways to monitor and perform health checks of the current environment to ensure availability and effective/efficient issue resolution
o Perform documentation review and updates to ensure processes and procedures are maintained and current
Project Management and Vendor Management
- Provide/review solutions and options in terms of systems architecture and configuration
- Lead infrastructure-related projects (e.g. server refresh, new infrastructure services) when required
- Participate as a project member in key application and/or infrastructure projects
- Provide Application Infrastructure support to Application team and various internal teams during solution deployments
- Manage vendors/contractors to ensure implementation is in accordance with stipulated requirements and service/system SLAs are met
- Provide the infrastructure-level requirements (e.g. system licenses, software and versions, etc.)
- Identify and assess risks, provide/review risk mitigating measures associated with the technical solutions and the migration plan
- Plan the infrastructure-level capacity requirements (CPUs, memory, disk), leveraging the existing setup where possible
- Provide a migration plan from the existing setup to the new proposed solution (if required)
- Provide the 24x7 operations support plan once the project implementation transitions into the operational phase
Requirements:
- Minimum of 5 years’ experience in the implementation, technical administration and support of Network and/or Data Centre Facilities Management (Cisco/Alcatel/Juniper internetworking solutions).
- Excellent understanding of common network technologies, including TCP/IP, IPsec, VRRP, HSRP, SNMP, NAT, multicast, subnetting, Ethernet and access lists.
- In-depth working knowledge and understanding of the following routing protocols and technologies: BGP, OSPF, EIGRP, IS-IS, MPLS.
- Familiar with the setup and configuration of network security devices such as firewalls, WAN optimisers, virtual private networks and authentication servers.
- Proven working knowledge, implementation and administration experience of high-availability network security devices (e.g. Check Point, Cisco Firepower, Juniper SRX, Palo Alto)
- Proven knowledge of DNS, time synchronisation, Proxy, F5 Load Balancers, Infoblox
- Proven knowledge in switch and router configuration, including storage switch configurations
- Ability to proactively monitor and measure the performance and availability of network devices, and to implement the corrective actions identified to improve performance, service-level availability and capacity planning.
- Experience with various network management platforms (e.g. MRTG, Cisco ISE, AlgoSec)
- Minimum of 5 years’ experience in managing vendors, including procurement, writing and managing RFQs/RFPs, project management and problem resolution.
- Proven track record of successfully managing network projects.
- Working knowledge of DRP (disaster recovery plan) and BCP (business continuity plan)
- Degree in IT or engineering or equivalent.
- At minimum, possess a technical network certification (CCNP Routing and Switching and Data Centre) or higher