As an ML Engineer, your pivotal role involves operationalizing ML models developed by the data scientists. You will serve as the focal point for ML model refactoring, optimization, containerization, deployment, and quality monitoring.
Your main responsibilities will include:
- Review ML models for compliance with overall platform governance principles such as versioning, data/model lineage, and code best practices, and provide feedback to data scientists on potential improvements.
- Develop pipelines for the continuous operation, feedback, and monitoring of ML models, leveraging best practices from the CI/CD vertical within the MLOps domain. This can include monitoring for data drift, triggering model retraining, and setting up rollbacks.
- Optimize AI development environments (development, testing, production) for usability, reliability and performance.
- Maintain a strong relationship with the infrastructure and application development teams to understand the best method of integrating ML models into enterprise applications (e.g., transforming resulting models into APIs).
- Work with data engineers to ensure that data storage (data warehouses or data lakes), the data pipelines feeding these repositories, and the ML feature or data stores are working as intended.
- Evaluate open-source AI/ML platforms and tools for feasibility of use and integration from an infrastructure perspective. This also involves staying up to date on the newest developments, patches, and upgrades to the ML platforms in use by the data science teams.
Technical Skills
- Proficiency in Python, used for both ML and automation tasks
- Good knowledge of Bash and the Unix/Linux command-line toolkit is a must-have.
- Hands-on experience building CI/CD pipelines orchestrated by Jenkins, GitLab CI, GitHub Actions, or similar tools is a must-have.
- Knowledge of OpenShift / Kubernetes is a must-have.
- Good understanding of ML libraries such as pandas, NumPy, H2O, or TensorFlow.
- Knowledge of the operationalization of data science projects (MLOps) using at least one of the popular frameworks or platforms (e.g., Kubeflow, AWS SageMaker, Google AI Platform, Azure Machine Learning, DataRobot, Dataiku, H2O, or DKube).
- Knowledge of distributed data processing frameworks such as Spark or Dask.
- Knowledge of workflow orchestrators such as Airflow or Control-M.
- Knowledge of logging and monitoring tools such as Splunk and Geneos.
- Experience in defining the processes, standards, frameworks, prototypes and toolsets in support of AI and ML development, monitoring, testing and operationalization.
- Experience with ML operationalization and orchestration (MLOps) tools, techniques, and platforms, including scaling the delivery of models, managing and governing ML models, and managing and scaling AI platforms.
- Knowledge of cloud platforms (e.g., AWS, GCP) would be an advantage.
Soft Skills
- Good knowledge of DevOps processes and principles
- Strong software engineering fundamentals
- Excellent communication skills
- Attention to detail
- Analytical mind and problem-solving aptitude
- Strong organizational skills
- Visual thinking