The candidate is required to research and develop an advanced multimodal foundational model, with an initial focus on speech and text. The work involves defining and implementing novel deep learning methodologies to achieve state-of-the-art performance on a range of spoken language tasks. The candidate is also expected to collaborate with other research scientists and engineers to adapt the foundational model to specific use cases. Proficiency in problem-solving, coding, effective communication, and collaborative teamwork is crucial for this role.
Job Requirements:
- PhD in ML, NLP, speech and spoken language processing, or multimodal AI
- Strong programming skills; familiarity with Linux is a must
- Familiarity with deep learning and LLM training frameworks
- Strong analytical and critical thinking skills; a good team player with strong communication and interpersonal skills
- Ability to reproduce state-of-the-art models and results, and then innovate beyond these benchmarks
The above eligibility criteria are not exhaustive. A*STAR may include additional selection criteria based on its prevailing recruitment policies. These policies may be amended from time to time without notice. We regret that only shortlisted candidates will be notified.