- Perform work in shifts to provide 24/7 on-site or on-call support.
- Handle incident and problem management.
- Should have knowledge of SRE best practices and be able to adhere to SRE guidelines in the work.
- Apply root cause analysis techniques to determine the cause of complex system issues and resolve them.
- Perform post-resolution follow-ups to ensure problems have been adequately resolved.
- Communicate application problems and issues to key stakeholders, including management, development teams, end users, and unit leaders.
- Work with onsite and offshore teams across multiple technologies and applications.
- Drive continuous improvement of the system, e.g. removal of toil, job automation, and performance tuning.
- Proactive management of production services by measuring and monitoring availability, latency, throughput, user journeys and overall system health.
Interested candidates, kindly submit your updated CV in Word format to: [email protected]. Only shortlisted candidates will be notified. Thank you.