Responsibilities
· Engage in incident resolution, build mitigation plans, and constantly improve observability of owned applications.
· Design, implement, and manage Kubernetes clusters on platforms like AWS EKS and Rancher.
· Provision and configure AWS infrastructure resources (VPCs, subnets, security groups, etc.)
· Automate infrastructure provisioning and configuration using tools like Terraform and Terragrunt.
· Configure and manage clusters including node groups, security groups, and IAM roles
· Deploy and manage containerized applications on platforms like EKS and Rancher using Kubernetes manifests (YAML)
· Troubleshoot and resolve issues with Rancher and EKS clusters and deployed applications
· Implement CI/CD pipelines for deploying applications to Rancher and EKS
· Implement and maintain the observability platform
· Monitor and optimize Rancher and EKS cluster performance and resource utilization
· Stay up-to-date with the latest AWS and Kubernetes technologies and best practices
· Improve the overall developer experience by ensuring that the tools and systems developers use are easy to use, available, and efficient.
· Evaluate performance trends and expected changes in demand and capacity and establish the appropriate scalability plans.
· Identify and troubleshoot any availability and performance issues at multiple layers of deployment.
· Engage with Product Engineering in design, implementation, and maintenance of the build/release infrastructure.
· Integrate systems and build configurations to drive and innovate around on-prem and public cloud-based platforms across the organization.
· Configure and operate the infrastructure of SaaS applications with focus on automation and infrastructure as code.
Tools we use
Java
Python
GitHub Actions
JFrog
Nexus
SonarQube
Splunk
Honeycomb
AppDynamics
Grafana
Prometheus
InfluxDB
Redis
Kafka
Oracle
PostgreSQL
Kubernetes
Rancher
Istio
Terraform
AWS services
Rundeck
Temporal
LaunchDarkly
PagerDuty
Qualifications expected
· Strong understanding of Kubernetes concepts, including pods, services, deployments, and cluster autoscaler.
· Hands-on experience with Rancher and AWS EKS, including cluster creation, configuration, and management
· Proficiency in scripting languages like Python or Bash for automation
· Experience with infrastructure as code tools like Terraform or CloudFormation
· Knowledge of containerization technologies like Docker and container registries
· Familiarity with CI/CD pipelines and tools like GitHub Actions, GitLab CI/CD, or CircleCI
· Strong troubleshooting and problem-solving skills
· Excellent communication and collaboration skills
Software Engineering
You have knowledge of:
· Kubernetes security best practices
· Advanced Kubernetes concepts like custom resource definitions (CRDs) and operators
· Setting up and maintaining CI/CD pipelines
· Monitoring and logging tools like Prometheus, Grafana
· Configuration management using tools like Terraform, Puppet or Ansible.
· You have experience in running production systems utilizing microservices and distributed systems architecture at scale
· You have a background in running workloads on cloud-based systems with at least one of the leading public cloud platforms (AWS preferred)
Containerization and orchestration
· You know how to build and operate Docker containers – design, construction, and optimization.
· You have experience with defining and managing applications that operate on orchestration platforms.
· You have working experience with service mesh configuration.
Observability and monitoring
· You have experience with configuration and monitoring of an observability tool of choice.
API Gateway Engineering (nice to have)
· You understand common API concepts and standards as well as aspects of data storage, service status and session handling.
· You are familiar with API management systems for high availability, resilience, and recovery.
· You know how to deploy, configure, tune, and monitor API Gateways
· You know how to apply API policies and standards for security and standardization of an enterprise.
Additional expectations
· Curiosity and ability to learn new tools.
· Willingness to help others and openness to explain your way of thinking while solving issues.
· Excellent communication skills, both verbal and written
· Experience debugging complex problems.
· English on a level that lets you work efficiently and discuss complex technical topics.
· Proven record of driving changes in DevOps area and/or suggesting company-wide improvements to the existing tools and processes (nice to have)
· Experience in working with people from diverse cultures spread across the globe in different time zones (nice to have)
· Certified Kubernetes Administrator (CKA) or AWS Certified DevOps Engineer - Professional