What to Expect:
- Accountable for day-to-day operational activities to ensure optimum system performance and determine the system strategy for business continuity.
- Review, implement and uphold IT policies and operations protocols within the supported platforms.
- Lead an Operations Engineer to work with appointed service providers and stakeholders in delivering operational objectives. This includes data transfers, installation of code and libraries, technical support for users and vetting of models and algorithms.
- Review and ensure that backup solutions for data and systems are in place.
- Responsible for keeping the data pipeline operational and ensuring data integrity and accessibility.
- Manage application/platform and security incidents, working with various internal teams and vendors to resolve issues in a timely manner to meet SLA, and escalating to higher management if necessary. Report incidents, along with short- and long-term incident resolution plans, at appropriate forums.
- Ensure that Standard Operation Processes (SOPs) are properly documented and comply with audit requirements.
As Technical Support Manager, you need to bring to the team:
- Degree in Computer Science/Engineering, Information Technology, or a relevant discipline.
- At least 6 years of working experience in Cloud-based services, IT operations and vendor management.
- Formal AWS certification: AWS Certified SysOps Administrator – Associate.
- Proactive and dedicated individual with strong leadership and multi-tasking capabilities.
- Ability to build and maintain relationships with a wide array of people at both junior and senior levels.
- Experience in data analytics systems will be highly regarded.
- Experience in running incident, problem, and change management processes.
Candidates with work experience in any of the following will be considered favourably:
- Familiarity with and/or related experience in the ITIL framework (incident, problem and change management, service transition).
- Familiarity with security and access control measures to control privileged access to test and production environments.
- Experience in networking technologies such as WAN, LAN, Network Security, Firewall rules, Load Balancers, VPNs and DNS.
- Knowledge of disaster recovery, system backup and restoration.
- Experience with cloud-based services (e.g. AWS, Azure) and project management tools (e.g. Atlassian, JIRA).
- Experience with setting up and/or running operations for research projects and machine learning platforms.
- Prior work experience with a government agency.