Required Skillset:
- Proficiency in managing a logging stack, including Elasticsearch cluster management, Logstash, Filebeat, ILM, Kibana, Watchers, and dashboards
- Expertise in Prometheus and exporters, Grafana, Alertmanager, APM
- Knowledge of writing PromQL queries and implementing Grafana dashboards
- Expertise in CI/CD pipeline implementation
- Experience troubleshooting Java, Spring, and React applications. Development skills would be handy.
- Hands-on experience with containerized platforms like Docker/Kubernetes
- Experience with source code control systems like Bitbucket
- Knowledge of high availability, disaster recovery, and backup.
- Experience with object storage systems like S3
- Experience providing L1/L2 production support for the systems.
- Experience with Linux system configuration.
- Experience with shell scripting and Python.
- Basic understanding of source control concepts, best practices, and the software development lifecycle.
- Strong verbal and written communication skills.
- Critical thinking and problem-solving skills.
Nice to have:
- Knowledge of any of the following components: Grafana Mimir, Tempo, Loki, Events, Profiling
- Understanding of Terraform or Ansible
- Understanding of reported vulnerabilities and their fixes.
- Knowledge of OpenSearch
- Openness to exploring new tool sets for observability
- Knowledge of PagerDuty and incident management