Senior Site Reliability Engineering (SRE)

Snow Software Singapore Pte. Ltd.

This role will suit an experienced SRE who can support our drive to continuously improve our SaaS service. The Senior SRE will work closely with the technology, product, and development teams to help define our path forward. This means a lot of freedom and autonomy, but also a lot of responsibility. It also means you’re willing to share what you’ve learned by presenting new ideas to the team and the wider engineering organization.

This is a unique opportunity to work with leading cloud technologies and methodologies and to be a key player in the definition and implementation of our SaaS offering.


What We Do

We provide our developers with a stable and reliable platform as a product. Our aim is to abstract away the complexities of Kubernetes so that teams can easily create and deploy services into production by just specifying the configuration and resources the application needs to run. We believe that GitOps is the best way to realize this vision, using tools such as ArgoCD, Terraform, Helm, Kustomize, and Backstage. We are not afraid to evaluate new technologies if they can further improve the developer experience; we are currently assessing Cue, Pulumi, and Crossplane. We also provide our development teams with a monitoring stack so that they can effectively monitor metrics and logs from their applications in production. We believe in “You build it, you run it”.

Our Challenge for You

  • Support our initiatives aimed at improving the reliability of our services by providing guidance, engineering solutions and improving our processes.
  • Drive reliability practices across our engineering organization.
  • Provide improvements and best practices targeting observability and predictability.
  • Experiment, learn new things and help grow those around you.
  • Work in short iterations in a lightweight Kanban environment shaped by the team.
  • Participate in an on-call rotation to support our 24x7 service availability.
  • Technologies you’ll come in contact with: Microsoft Azure, Terraform, GitHub, Sumologic, Helm, Backstage, ArgoCD, Kubernetes, NATS.

Your Profile & Skills

  • 6+ years of experience managing production environments as SRE, DevOps Engineer or similar.
  • 4+ years of hands-on Kubernetes experience, with a proven track record of deploying and managing Kubernetes clusters (AKS or EKS) running microservices in Azure or AWS.
  • At least 2 years of experience with AKS in Azure.
  • 4+ years of hands-on experience from previous jobs with infrastructure as code (IaC) and tools used to automate Kubernetes infrastructure in Azure or AWS. This includes experience creating Terraform modules, Helm Charts, and Kubernetes manifests from scratch.
  • Proficiency in Golang fundamentals.
  • Experience working with SLOs, metrics, and incident management in a cloud environment.
  • Passion for reliability engineering practices and automation.
  • Curiosity to learn, explore and collaborate with those around you.
