Cube A
← All jobs
OK
Okta

Staff Site Reliability Engineer - (Infra)

Secure Every Identity, from AI to Human Identity is the key to unlocking the potential of AI. Okta secures AI by building the trusted, neutral infrastructure that enables organizations to safely embrace this new era. This work requires a relentless drive to solve complex challenges with real-world stakes. We are looking for builders and owners who operate with speed and urgency and execute with excellence. This is an opportunity to do career-defining work. We're all in on this mission. If you are too, let's talk. What You’ll Be Doing Design, build, and operate highly scalable, reliable, and secure infrastructure powering our production systems across AWS and GCP. Lead major reliability and modernization initiatives, including container platform migrations (e.g., ECS to EKS/GKE) and microservice enablement across multi-cloud environments. Serve as a technical authority in Kubernetes (EKS and GKE), cloud infrastructure (AWS and GCP), and modern CI/CD practices (GitOps, automation pipelines). Partner with development teams to architect and enable microservice-based applications, ensuring production readiness, scalability, and observability. Implement and manage infrastructure as code (Terraform, Ansible) to automate provisioning, scaling, and configuration management across multiple cloud providers. Drive improvements in observability, performance, and cost efficiency through robust monitoring, logging, and alerting systems that span AWS and GCP. Champion SRE best practices — de

More at Okta