Responsibilities:
- Drive the implementation and refinement of distributed training strategies across multi-GPU and multi-node environments (see the distributed-training sketch after this list).
- Apply advanced optimization algorithms and their variants (e.g., SGD, Adam, Adagrad) to accelerate training while maintaining model accuracy.
- Lead initiatives in model compression, pruning, and quantization techniques to reduce model footprint and enhance computational efficiency (see the pruning/quantization sketch after this list).
- Innovate in knowledge distillation methodologies to transfer knowledge from larger teacher models to smaller, more efficient student models (see the distillation-loss sketch after this list).
- Optimize fine-tuning strategies, such as prompt-based tuning or parameter-efficient tuning methods, to minimize resource requirements (see the parameter-efficient tuning sketch after this list).
- Explore and implement mixed precision training and hardware-specific optimizations (e.g., CUDA, Tensor Cores) to fully leverage hardware acceleration (see the mixed-precision sketch after this list).
- Manage hyperparameter tuning processes using automated tools and algorithms to achieve optimal model configurations (see the hyperparameter-search sketch after this list).
- Collaborate with researchers and engineers to integrate state-of-the-art research into production-ready systems.
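The sketch below illustrates the kind of multi-GPU data-parallel training loop referenced above, using PyTorch's DistributedDataParallel launched via torchrun. The linear model, toy dataset, and hyperparameters are placeholders, not part of this posting.

```python
# Minimal single-node multi-GPU training sketch with DistributedDataParallel.
# Model, dataset, and hyperparameters are illustrative placeholders.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

def main():
    local_rank = int(os.environ["LOCAL_RANK"])      # set by torchrun per process
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(128, 10).cuda(local_rank)   # placeholder model
    model = DDP(model, device_ids=[local_rank])

    dataset = TensorDataset(torch.randn(1024, 128), torch.randint(0, 10, (1024,)))
    sampler = DistributedSampler(dataset)                # shards data across ranks
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.CrossEntropyLoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)                         # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()                              # DDP all-reduces gradients here
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()  # launch with: torchrun --nproc_per_node=<num_gpus> this_script.py
```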
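A minimal sketch of the compression techniques mentioned above: unstructured magnitude pruning followed by post-training dynamic quantization. The two-layer model and the 30% sparsity level are illustrative assumptions.

```python
# Sketch: magnitude pruning + dynamic quantization in PyTorch.
# The model and sparsity ratio are stand-ins for a real network and target.
import torch
import torch.nn.utils.prune as prune

model = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 10),
)

# Unstructured L1 (magnitude) pruning: zero out 30% of the smallest weights.
for module in model:
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")   # make the pruning permanent

# Dynamic quantization: weights stored as int8, activations quantized at runtime.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
print(quantized)
```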
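One common way to realize the teacher-student transfer described above is a blended distillation loss; the temperature and mixing weight below are illustrative defaults, not prescribed values.

```python
# Sketch of a knowledge-distillation loss: soften teacher and student logits
# with a temperature and mix KL divergence with ordinary cross-entropy.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```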
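A sketch of one parameter-efficient tuning approach, a LoRA-style low-rank adapter: the pretrained weight stays frozen and only a small low-rank update is trained. Layer sizes, rank, and scaling are assumptions chosen for illustration.

```python
# Sketch of LoRA-style parameter-efficient fine-tuning: freeze the base
# weight and learn a low-rank update. Sizes and rank are illustrative.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int,
                 rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)          # frozen pretrained weight
        self.base.bias.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(in_features, rank) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(rank, out_features))
        self.scaling = alpha / rank

    def forward(self, x):
        # Frozen path plus trainable low-rank correction.
        return self.base(x) + (x @ self.lora_a @ self.lora_b) * self.scaling

layer = LoRALinear(768, 768)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable params: {trainable} / {total}")   # only the adapter trains
```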
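A minimal sketch of automatic mixed-precision training with torch.cuda.amp, which exploits Tensor Cores on recent NVIDIA GPUs; the model and random data are placeholders and a CUDA device is assumed.

```python
# Sketch of mixed-precision training: autocast runs the forward pass in
# reduced precision, GradScaler guards against fp16 gradient underflow.
import torch

model = torch.nn.Linear(256, 10).cuda()              # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()
loss_fn = torch.nn.CrossEntropyLoss()

for step in range(10):
    x = torch.randn(32, 256, device="cuda")          # toy batch
    y = torch.randint(0, 10, (32,), device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():                   # reduced-precision forward
        loss = loss_fn(model(x), y)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```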
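Automated hyperparameter search could be sketched with a tool such as Optuna (one possible choice, not mandated here); the search space and toy objective below are purely illustrative.

```python
# Sketch of automated hyperparameter search with Optuna on a toy objective.
import optuna
import torch

def objective(trial: optuna.Trial) -> float:
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    hidden = trial.suggest_categorical("hidden", [64, 128, 256])

    model = torch.nn.Sequential(torch.nn.Linear(32, hidden), torch.nn.ReLU(),
                                torch.nn.Linear(hidden, 1))
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    x, y = torch.randn(256, 32), torch.randn(256, 1)   # toy data

    for _ in range(50):                      # short training loop
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
    return loss.item()                       # value Optuna minimizes

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```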
Requirements:
- Bachelor's degree or higher in Computer Science, Artificial Intelligence, Mathematics, or related fields.
- Experience with LLM / AIGC is required.