Join Cerebras as a Staff Software Engineer to lead the development of a cutting-edge inference platform for AI applications.
Role intensity
40% coding
AI in the day-to-day
Cerebras offers the fastest Generative AI inference solution, transforming the user experience of AI applications.
Joblaze summary
In this role, the Staff Software Engineer will lead the development and optimization of the Inference Platform, focusing on the orchestration layer that integrates cloud and machine learning components. Key skills include expertise in distributed systems, particularly with Kubernetes, and a strong background in backend languages like Go or C++. This position is ideal for seasoned engineers with over eight years of experience who are adept at making architectural decisions for high-performance systems. The team is at the forefront of AI innovation, working on cutting-edge technology that significantly enhances inference capabilities.
From the original posting
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.
Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of scale, transforming key workloads with ultra-high-speed inference.
Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order-of-magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.
Location: Sunnyvale
We're hiring a Staff Engineer to help lead, drive, and contribute to projects on our Inference Platform team. Our team primarily owns the orchestration layer that runs inference on our datacenter clusters, gluing the cloud components to the ML components. We are often the first team to face issues that haven’t been solved yet, so we get to lead the company on a wide variety of solutions, from k8s operators to security policies for services and CI/CD.
This is a hands-on tech lead (TL) role for an engineer who will split their time between design, mentoring, and coding. The engineer should be experienced in all facets of development, including testing, continuous development, observability, security, networking, debugging, and productionization.
If you're interested in building our next-generation architecture of a globally distributed inference platform, we'd like to talk.
Responsibilities
Skills & Qualifications
People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:
Read our blog: Five Reasons to Join Cerebras in 2026.
Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.