Responsibilities
Be responsible for the reliability, availability, user experience, capacity planning, toil reduction, process enhancement, and digitalization of cloud-based internet services.
Serve as the SRE for assigned cloud services, owning the KPIs for reliability, issue-to-resolution, service deployment, business continuity management, security policy planning, capacity planning, and toil reduction through automation.
Introduce service governance initiatives based on the latest technologies to consistently improve the reliability and user experience of Huawei mobile services on the cloud, delivering a world-class, highly reliable user experience.
Make effective use of our world-class AIOps and autonomous service governance platform to devise new ways to streamline processes and improve alert accuracy, time-series trend analysis, anomaly detection, and risk identification.
Support platform and service expansions, migrations to new architectures, upgrades, and drill activities across different technology domains.
Incorporate mature chaos engineering for risk identification, IPDRR for security, and comprehensive automation frameworks to reduce operations effort to the lowest possible level, freeing time and space for the team to focus on engineering work.
Requirements and Qualifications
Bachelor's or Master's degree in computer science engineering or a related major.
Knowledge of Linux, networking, databases, containers, container management systems, etc.
Knowledge of at least one programming or scripting language or automation tool, such as Java, Python, Shell, Ansible, or Terraform.
Knowledge of big data analytics.
Experience exploring new technology trends, open-source technologies, and methodologies in the internet services domain.
Next Step
Click “apply” or send your resume to Ryce: [email protected]
EA Licence No. 91C2918 | Personnel Registration No. R23117258