Site Reliability Engineer, Compute Platform

Tiktok Pte. Ltd.

About Us


TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul, and Tokyo.


Why Join Us

Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible. Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day. To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always. At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve. Join us.


About the team

TikTok and its affiliate are developing a next-generation, high-performance analytical database, with a mission to enable efficient, real-time, data-driven decision-making on PB-level data sets. The initial product was forked from ClickHouse and has since undergone a large re-architecture. The product now not only improves on the efficiency of ClickHouse but also fits into elastic cloud-native infrastructure, with better scalability and resource utilization. After years of refinement in internal EB-level scenarios, we are now ready to serve our business partners via various cloud vendors.


What you will be doing:

- Responsible for building the stability system for the real-time business of the ByteDance data platform, and for driving improvements in the stability and service quality of real-time big data products;

- Responsible for ensuring the stability of Flink and data streams, and for following through on problem remediation and governance. At the same time, work closely with the product research and development team to improve the efficiency of fault mitigation (limiting the impact of an incident as quickly as possible).

- Responsible for building the automation tooling of the real-time platform, from codifying standards to turning them into tools, to improve problem detection and rapid fault mitigation.



Minimum Qualifications:

- Full-time bachelor's degree or above in a computer-related major, with 2 or more years of SRE or operations experience in real-time big data;

- Familiar with the architecture, principles, operation and maintenance, and stability engineering of real-time computing components such as Flink, Kafka, Hadoop, and Spark.

- Familiar with at least one programming language, such as Shell, Python, Java, or Golang.

- Good communication and teamwork skills, and the self-motivation to drive cross-team cooperation.


Preferred Qualifications:

- 5 or more years of SRE or operations experience in real-time big data;

- Practical experience troubleshooting and handling big data product issues, with the ability to quickly locate and resolve problems in production.


TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.
