Responsibilities
About TikTok
TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices in cities including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul, and Tokyo.
Why Join Us
Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible.
Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day.
To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always.
At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve.
Join us.
Team Introduction
Site Reliability Engineering (SRE) is a fusion of software and systems engineering techniques used to design and operate large-scale, extensively distributed, and resilient systems. Within Infrastructure SRE at TikTok, our primary focus is to ensure that the reliability and uptime of our infrastructure services meet the needs of our users and support rapid iteration. Our software development efforts focus on optimising existing systems, building essential infrastructure, and streamlining operations through automation. In the SRE team, you will have the opportunity to tackle the complex challenges of scale while applying your expertise in coding, algorithms, complexity analysis, and large-scale system design. We embrace a culture of diversity, intellectual curiosity, openness, and problem-solving. We also encourage ownership, self-governance, and independence across a variety of projects, in an environment that provides the support and mentorship needed to learn and grow as an engineer.
The Role
As a Tech Lead, you will be responsible for assembling and guiding a team of software and systems engineers, drawing on your technical leadership skills. You will establish efficient processes for project execution and promote sound engineering practices. You will also maintain regular coordination and communication with other infrastructure teams and our user community.
What you will be doing:
1. Establish and oversee the SRE team, which encompasses tasks such as team recruitment, the training of new talent, system operation and maintenance, coordination efforts, and fostering a cohesive team culture;
2. Oversee the acquisition and development of software systems in organisational units. Establish a comprehensive long-term technical strategy with well-defined implementation steps and milestones to continually enhance the team's competitiveness and technological capabilities;
3. Reliability: Ensure the reliability and efficiency of our core infrastructure, focusing on system capacity and stability; set reliability standards and recovery SOPs;
4. Reliability: Troubleshoot and locate technical issues, analyse bottlenecks, and manage the transformation and upgrading of high-availability system architecture;
5. Efficiency: Build automated operation solutions for large-scale systems; partner with system development teams on system iteration;
6. Efficiency: Design and implement software platforms and monitoring frameworks for efficient, automated, and intelligent service-oriented architecture (SOA) governance;
7. Cost: With millions of CPUs in operation, build delivery standards and monitoring and budgeting systems to optimise the company's costs;
8. Compliance: Design and set up new internet data centers (IDCs); design and implement a data protection plan that meets the required standards.
Minimum Qualifications:
- Solid foundational knowledge of computer software, with an understanding of the Linux operating system, storage, network I/O, and related principles;
- Familiarity with one or more programming languages, such as Python, Go, or Java. Knowledge of design patterns and coding principles is required.
Preferred Qualifications:
- Bachelor's / Master's Degree in Computer Science or related major;
- Minimum of 3 years of experience with NoSQL systems such as Redis, MongoDB, and other KV systems;
- Minimum of 3 years of experience with the following SQL systems: MySQL, PostgreSQL.
TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.