- Review ML models for compliance with overall platform governance principles such as versioning, data and model lineage, and code best practices, and provide feedback to data scientists on potential improvements.
- Develop pipelines for the continuous operation, feedback, and monitoring of ML models, applying CI/CD best practices from the MLOps domain. This can include monitoring for data drift, triggering model retraining, and setting up rollbacks.
- Optimize AI development environments (development, testing, production) for usability, reliability, and performance.
- Maintain a strong working relationship with the infrastructure and application development teams in order to understand the best method of integrating ML models into enterprise applications (e.g., transforming resulting models into APIs).
- Work with data engineers to ensure that data storage (data warehouses or data lakes), the data pipelines feeding these repositories, and the ML feature or data stores are working as intended.
Technical Skills
- Proficiency in Python for both ML and automation tasks.
- Good knowledge of Bash and the Unix/Linux command-line toolkit is a must-have.
- Hands-on experience building CI/CD pipelines orchestrated by Jenkins, GitLab CI, GitHub Actions, or similar tools is a must-have.
- Knowledge of OpenShift / Kubernetes is a must-have.
- Good understanding of ML libraries such as pandas, NumPy, H2O, or TensorFlow.
- Knowledge of the operationalization of data science projects (MLOps) using at least one of the popular frameworks or platforms (e.g., Kubeflow, AWS SageMaker, Google AI Platform, Azure Machine Learning, DataRobot, Dataiku, H2O, or DKube).