1. Coordinate and manage the operation and maintenance team, covering network and hardware planning, basic operation and maintenance, database operation and maintenance, application operation and maintenance, and the operation and maintenance platform;
2. Establish and improve standardized operation and maintenance systems, processes, management strategies and security strategies to ensure the quality of operation and maintenance;
3. Deploy and maintain the company's operation and maintenance systems on a daily basis, ensuring normal operation and timely emergency response;
4. Ensure the safety of the department's operation and maintenance work, handle operation and maintenance incidents, optimize maintenance workflows, and continuously reduce system risk;
5. Manage the operation and maintenance team and develop team members' professional skills;
6. Very important: strong English communication skills.
1. Bachelor's degree or above in a science or engineering field, more than 5 years of operation and maintenance work experience, and more than 3 years of operation and maintenance management experience at Internet companies;
2. Proficient in Linux system administration and in at least one scripting language such as Shell, Perl, or Python;
3. Familiar with Rundeck, Kubernetes (k8s), Chef, VMware, PagerDuty, and Ansible;
4. Practical experience with technologies including, but not limited to, Serverless, GitOps, Infrastructure as Code, Helm, and Jenkins Pipeline;
5. Familiar with public clouds, especially AWS, Alibaba Cloud, and GCP, and able to manage infrastructure with Terraform;
6. Familiar with the management, optimization, and disaster recovery and backup of MySQL, MongoDB, and Redis;
7. Able to analyze and improve large-scale application system architectures and to quickly troubleshoot system bottlenecks;
8. Familiar with Internet platform network architecture planning and design, content backup, release mechanisms, and network monitoring systems;
9. Familiar with the selection, construction, configuration, monitoring, performance optimization and maintenance of servers and storage systems;
10. Proven track record of implementing operation and maintenance standards for Internet platforms, with strong operation and maintenance team management skills.