Job Responsibilities:
- Design and implement scalable, efficient LLM-based application architectures.
- Leverage RAG, vector databases, and prompt optimization techniques to build low-latency, high-concurrency model services tailored to internet applications.
- Work closely with product managers, NLP engineers, and data science teams to determine the optimal deployment strategies for LLMs in various business scenarios.
- Utilize the ReAct framework to support multi-step reasoning and complex task handling, enhancing the intelligence of user interactions.
- Establish workflows that integrate LLMs with user behavior data and knowledge bases to improve accuracy in recommendations, chat, and assistive functions.
- Lead and manage a technical team, including model engineers, NLP specialists, and data engineers, to ensure projects progress on schedule and meet objectives.
Requirements:
- Bachelor’s degree or higher in computer science, natural language processing, artificial intelligence, or a related field; a master’s or Ph.D. is preferred.
- Minimum of 5 years of experience in AI/NLP application development and production deployment.
- Proven experience building companion mobile apps, with a record of successfully launched applications in the internet industry.
- Familiarity with mainstream LLMs, such as those from OpenAI, with experience in large-scale application deployment and optimization.
- At least 3 years of experience in technical team management, with a track record of successful team building and project management.
- Experience in cloud environments and containerized deployment, with expertise in designing and optimizing application architectures for high concurrency and low latency.
- Proficiency in Python, Go, and other mainstream programming languages, with hands-on experience in CI/CD and LLMOps.
- Strong communication and cross-team collaboration skills to drive effective team coordination.
- Excellent leadership skills, with the ability to motivate teams, set clear objectives, and drive efficient execution.
- Strong creativity and problem-solving skills, with the ability to adapt to the rapid changes and iterations in internet applications.