Responsibilities
TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo.
At TikTok, our people are humble, intelligent, compassionate and creative. We create to inspire - for you, for us, and for more than 1 billion users on our platform. We lead with curiosity and aim for the highest, never shying away from taking calculated risks and embracing ambiguity as it comes. Here, the opportunities are limitless for those who dare to pursue bold ideas that exist just beyond the boundary of possibility. Join us and make impact happen with a career at TikTok.
This position is with TikTok's Stability Assurance Team. The team is responsible for ensuring that the services provided by TikTok are highly reliable and low-latency. For any massive application system, reliability assurance is a complex, systematic effort. We focus on optimizing the application architecture from end to end, driven by data analysis, and aim for automatic and intelligent failure recovery.
In this role you will:
- Optimize the architecture of TikTok's core functions, designing a highly available, high-performance, and highly maintainable system to ensure their stability.
- Design and implement an end-to-end, full-link stability assurance system for the core functions.
This role will allow you to:
- Optimize the quality and workflow of the entire R&D process, and conduct in-depth research on code quality improvement, automated testing, and automated deployment.
- Optimize TikTok's architecture for better reliability, better scalability, and lower latency, and design automatic disaster recovery solutions.
- Build a massive service governance system that can visualize the entire system architecture and locate and resolve faults automatically, and help develop an automated chaos system.
- Build a stability measurement system that systematically measures the system's ability to prevent, detect, and resolve failures, and provide a one-click solution for resolving stability problems.
- Become an SRE expert: gain insight into hidden risks in the system, establish operations and maintenance standards, work with the R&D team to increase the automation of operations and maintenance, and improve the stability assurance system.
Qualifications:
- Bachelor's degree in Computer Science or a related technical background involving software/system engineering, or equivalent working experience.
- Solid programming experience with high-concurrency systems, complex business systems, or service management.
- Proficient in at least one of the following backend languages: C/C++, Java, Go, Python, Shell, or PHP.
- Positive and optimistic, with a strong sense of responsibility; self-driven and conscientious, with good team communication and collaboration skills.
Preferred Qualifications:
- Minimum of 5 years of relevant work experience at a large-scale internet business.
TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.