Design and maintain Kubernetes clusters across multiple environments (development, staging, production)
Build automation for cluster deployment, configuration, and management
Monitor and troubleshoot clusters to ensure high availability and optimal performance
Implement security best practices for Kubernetes and underlying infrastructure
Participate in incident response and work to reduce Mean Time To Recovery (MTTR)
Enhance the reliability and scalability of our Kubernetes infrastructure
Manage CI/CD pipelines and DevOps tooling
Collaborate with development teams on deployment strategies and best practices
Requirements
Infrastructure as Code
- Experience with at least two IaC tools (e.g., Terraform, Pulumi)
Monitoring & Observability
- Proficiency with Prometheus, Grafana, and related tools
Cloud Platforms
- Hands-on experience with AWS, Azure, or GCP
CI/CD
- Knowledge of GitHub Actions, GitLab CI, or Azure DevOps
Networking & Security
- Understanding of network fundamentals and security best practices
Problem-solving
- Strong analytical and troubleshooting abilities
Communication
- Fluent English for remote asynchronous work
Self-motivated
- Ability to work independently with an agile approach
Nice-to-haves
Experience with GitOps tools (Flux, ArgoCD)
Go programming knowledge or willingness to learn
Active open-source contributions
Experience developing Kubernetes operators or controllers
Benefits
100% remote work with flexible hours
Work with cutting-edge cloud-native technologies
Contribute to open-source projects
Collaborative, distributed team environment
Opportunity to shape the future of Kubernetes tooling