Responsibilities
Develop, test, debug, and troubleshoot cloud-native infrastructure
Develop, build, and maintain CI/CD pipelines and automated testing
Monitor system availability, latency, and overall health
Provide on-call support for incident and change management
Requirements
High proficiency in Linux system administration and a solid understanding of TCP/IP networking
Strong knowledge of scripting languages (e.g., Bash, Python)
Experience with container, orchestration, and infrastructure-as-code tools and practices (e.g., Docker, Kubernetes, Helm, GitOps, Operators, Terraform)
Experience with logging and monitoring tools (e.g., Prometheus, Grafana, OpenTelemetry, Loki, Fluentd)
Experience with AWS services (e.g., EC2, S3, IAM, ECR, EKS)
Strong debugging and troubleshooting skills