Mandatory Skills
Responsibilities:
- Monitoring application and infrastructure alerts
- Recommending and implementing solutions to mitigate potential production issues
- Taking ownership of incident tickets, responding to them, and closing them in line with service level agreements (SLAs)
Requirements:
- 3-5 years of experience managing on-premises Kafka platforms.
- Manage Kafka cluster builds, including design, infrastructure planning, high availability, and disaster recovery in an OpenShift environment.
- Perform day-to-day administration and support functions, including management of cluster capacity, performance, utilization, and health.
- Undertake lifecycle management across the on-premises Kafka environments.
- Research and recommend innovative ways to maintain the environment, and ensure that automation is put in place.
- Set up monitoring tools such as Splunk and Grafana to provide metrics from the various Kafka cluster components (e.g., Broker, ZooKeeper, Connect, REST Proxy, Schema Registry, KSQL).
- Undertake regular assessments of platform health and stability, create improvement plans, and ensure that automation and lifecycle management are carried out.
- Ansible scripting to automate Kafka installations hosted on OpenShift.
- Experience in containerization (OpenShift / Kubernetes).
- Configuring AMQ from an infrastructure perspective, including performance monitoring, integration functions, delivery status and network management.
- Configuring adapters from an application connectivity perspective, specifically Sterling Integrator B2B and AMQ.
- Handling messaging features and functions typically used by applications, including ack/nak, recovery, and error handling.