DevOps Engineer (Full-Time)

Details of the offer

AI workloads are brutal—petabytes of data, distributed jobs, and real-time GPU orchestration.
We're building an AI-first DevOps infrastructure that makes compute reliable, scalable, and cost-effective.
If you love infrastructure automation, cloud-native engineering, and AI performance tuning, you'll love this role.

What you'll do

- Design and manage scalable, fault-tolerant AI compute infrastructure
- Automate GPU provisioning, multi-cloud scheduling, and scaling strategies
- Improve observability, logging, and monitoring for real-time AI workloads
- Optimize containerized deployments for Kubernetes, Nomad, or Slurm
- Enhance security, CI/CD, and cloud networking for high-performance distributed training
- Implement security best practices for DevOps pipelines, including secrets management, infrastructure security, and compliance automation
- Reduce infrastructure cost and maximize performance through automation and tuning

What we're looking for

- Deep knowledge of CI/CD pipelines and infrastructure as code
- Hands-on experience with monitoring and logging tools (Prometheus, Grafana, OpenTelemetry)
- Proficiency in shell scripting, Python, or Go for automation
- Experience with security best practices for cloud environments, including IAM, container security, and incident response

Nice to haves

- Experience managing large-scale clusters, with Kubernetes or other orchestrators, and cloud infrastructure
- Experience with Terraform, Ansible, Helm, or Pulumi
- Understanding of AI/ML compute environments (GPUs, CUDA, NCCL, Slurm, Horovod)

Our culture

- We move fast. We ship weekly: new features, improvements, and fixes go live fast.
- We test big. Every month, we stress test face to face with large groups of users, get real-world feedback, and iterate rapidly.
- We build together. On site only, in SF or Sydney.
- We iterate relentlessly. Direct user feedback shapes our roadmap: we release, test, refine, and keep moving.
- We travel when needed. Engineers may travel between SF and Sydney to run events and meet with clients.

Location: SF or Sydney (OG startup house vibe, great food, late nights, all the GPUs)

Equipment & Benefits

- Top-spec MacBook, plus a separate GPU cluster dev environment for each engineer
- Weekly cash bonus when you work out 3+ times a week
- Comprehensive health benefits for our SF-based team members, including a choice of Kaiser, Aetna OAMC, and HDHP (HSA-eligible) plans
- 20-year exercise window for options, the longest in the world

Don't have all the skills? Apply anyway! We're looking for people who move fast, learn fast, and ship fast. If that's you, let's talk.

Want to get to know us first? Attend one of our upcoming events.

Nominal Salary: To be agreed

Source: Jobleads

Built at: 2025-04-27T09:35:24.978Z