About ByteDance
Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Helo, and Resso, as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.
Why Join Us
Creation is the core of ByteDance's purpose. Our products are built to help imaginations thrive. This is doubly true of the teams that make our innovations possible. Together, we inspire creativity and enrich life - a mission we aim to achieve every day. To us, every challenge, no matter how ambiguous, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always. At ByteDance, we create together and grow together. That's how we drive impact - for ourselves, our company, and the users we serve. Join us.
About the Team
The speech team's mission is to empower content understanding, interaction and creation across TikTok and other products using speech- and audio-related technologies. We focus on cutting-edge R&D in areas like speech & audio, music processing, natural language understanding and multimodal deep learning. We are looking for top talent to work on these exciting technologies, integrate them into TikTok and other products, and ultimately bring joy to our global user base!
We are looking for talented individuals to join us in 2025. As a graduate, you will get unparalleled opportunities to kickstart your career, pursue bold ideas and explore limitless growth opportunities. Co-create a future driven by your inspiration with ByteDance.
Candidates can apply to a maximum of two positions and will be considered for jobs in the order in which they apply. The application limit applies to jobs at ByteDance and its affiliates globally. Applications will be reviewed on a rolling basis - we encourage you to apply early.
Responsibilities
- Conduct cutting-edge research and development in speech/audio foundation models
- Contribute to the advancement of audio understanding, including multilingual speech recognition, speech translation, and multimodal understanding
- Drive the practical application of these technologies in business scenarios, including but not limited to closed captions, voice dubbing, and video understanding
Qualifications
Minimum Qualifications
- Final-year Ph.D. students or recent Ph.D. graduates in Computer Science, Engineering, or a quantitative field
- Experience in one or more areas of machine learning and deep learning, including but not limited to:
  - Automatic Speech Recognition
  - Automatic Speech Translation
  - Speech/audio self-supervised learning and foundation models
Preferred Qualifications
- Publications in top-tier ML/DL venues such as NeurIPS, ICLR, ICML, and AAAI, and speech venues such as ICASSP, ASRU, and Interspeech
- Deep understanding of large language models
- Familiarity with distributed computing and large-scale model training
- Familiarity with deep learning frameworks such as TensorFlow and PyTorch
- Familiarity with engineering principles and best practices
- High competence in algorithms and programming; strong coding skills in C/C++ and Python
- Ability to work collaboratively in a fast-paced, multi-functional environment
ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.