We are seeking a Senior SRE at k-ID. You will be instrumental in enhancing the reliability and scalability of our Global Compliance Engine. This role combines software engineering with systems engineering to build and run large-scale, massively distributed, fault-tolerant systems.
Key Responsibilities:
- API Reliability and Performance Optimization: Be a key contributor to the design, implementation, testing, and documentation of our public APIs.
- Develop, scale, and maintain the infrastructure necessary to deliver seamless service to tens of millions of worldwide players.
- Systems Automation and Orchestration: Use Kubernetes and AWS to automate deployment, scaling, and management of containerized applications. Enhance our CI/CD pipeline by integrating GitOps for streamlined operations across development, testing, and production environments.
- Monitoring and Telemetry: Implement comprehensive monitoring solutions using Prometheus and Alertmanager.
- Cross-Functional Collaboration: Work closely with development teams to ensure architectural and operational requirements are incorporated during design and development. Promote a culture of excellence in code health and quality.
- Security: Champion the integration of security best practices within backend architectures to protect sensitive user data against emerging threats.
Requirements
Minimum Requirements:
- 5 years of experience in software engineering with a focus on reliability, performance optimization, and infrastructure management.
- Bachelor’s or master’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- Expertise in Cloud and Systems Engineering: Extensive experience with AWS, Kubernetes, and modern observability stacks (e.g., Prometheus). Familiarity with CI/CD tools, GitOps practices, and infrastructure as code (e.g., Terraform).
- Performance Monitoring: Proficiency in setting up and managing telemetry and alerting systems, with a strong understanding of best practices in monitoring distributed systems.
Preferred Requirements:
- Willingness to adapt to changing project demands. Experience working in a startup environment is a plus.
- Ability to communicate effectively with remote team members, both in writing and verbally, providing progress updates, flagging potential roadblocks, and fostering positive and productive working relationships.
- Keen interest in automating repetitive tasks and finding innovative solutions to complex technical challenges.